diff mbox series

configure: Improve alias attribute check

Message ID 20210320042753.69297-1-gshan@redhat.com (mailing list archive)
State New, archived
Headers show
Series configure: Improve alias attribute check | expand

Commit Message

Gavin Shan March 20, 2021, 4:27 a.m. UTC
It's still possible that the wrong value is returned from the alias
of variable even if the program can be compiled without issue. This
improves the check by executing the binary to check the result.

If alias attribute can't be working properly, the @target_page in
exec-vary.c will always return zeroes when we have the following gcc
version.

  # gcc --version
  gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0)

This abstracts the code from exec-vary.c and use it as indicator to
enable gcc alias attribute or not.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 configure | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

Comments

Thomas Huth March 20, 2021, 4:48 a.m. UTC | #1
On 20/03/2021 05.27, Gavin Shan wrote:
> It's still possible that the wrong value is returned from the alias
> of variable even if the program can be compiled without issue. This
> improves the check by executing the binary to check the result.
> 
> If alias attribute can't be working properly, the @target_page in
> exec-vary.c will always return zeroes when we have the following gcc
> version.
> 
>    # gcc --version
>    gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0)
> 
> This abstracts the code from exec-vary.c and use it as indicator to
> enable gcc alias attribute or not.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>   configure | 34 ++++++++++++++++++++++++++++++----
>   1 file changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/configure b/configure
> index f7d022a5db..8321f380d5 100755
> --- a/configure
> +++ b/configure
> @@ -75,6 +75,7 @@ fi
>   
>   TMPB="qemu-conf"
>   TMPC="${TMPDIR1}/${TMPB}.c"
> +TMPC_B="${TMPDIR1}/${TMPB}_b.c"
>   TMPO="${TMPDIR1}/${TMPB}.o"
>   TMPCXX="${TMPDIR1}/${TMPB}.cxx"
>   TMPE="${TMPDIR1}/${TMPB}.exe"
> @@ -4878,13 +4879,38 @@ fi
>   
>   attralias=no
>   cat > $TMPC << EOF
> -int x = 1;
> +static int x;
>   extern const int y __attribute__((alias("x")));
> -int main(void) { return 0; }
> +extern int read_y(void);
> +void write_x(int val);
> +
> +void write_x(int val)
> +{
> +    x = val;
> +}
> +
> +int main(void)
> +{
> +    return read_y();
> +}
>   EOF
> -if compile_prog "" "" ; then
> -    attralias=yes
> +cat > $TMPC_B << EOF
> +extern const int y;
> +extern void write_x(int val);
> +int read_y(void);
> +
> +int read_y(void)
> +{
> +     write_x(1);
> +     return y;
> +}
> +EOF
> +
> +TMPC+=" ${TMPC_B}"
> +if compile_prog "" "" && ! $TMPE; then

What about cross-compiling? Running an executable won't work if QEMU gets 
cross-compiled...

  Thomas
Paolo Bonzini March 20, 2021, 5:52 p.m. UTC | #2
On 20/03/21 05:27, Gavin Shan wrote:
> It's still possible that the wrong value is returned from the alias
> of variable even if the program can be compiled without issue. This
> improves the check by executing the binary to check the result.
> 
> If alias attribute can't be working properly, the @target_page in
> exec-vary.c will always return zeroes when we have the following gcc
> version.
> 
>   # gcc --version
>   gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0)
> 
> This abstracts the code from exec-vary.c and use it as indicator to
> enable gcc alias attribute or not.
> 
> +void write_x(int val);
> +
> +void write_x(int val)
> +{
> +    x = val;
> +}
> +
> +int main(void)
> +{
> +    return read_y();
> +}

I think this should be "read_y() == 1 ? 0 : 1".

I can reproduce it with -flto -O2 but not without -flto, do you agree?

Perhaps we can obtain the same optimization by wrapping reads of the 
page size in an inline __attribute__((const)) function.  Richard, what 
do you think?

Paolo
Richard Henderson March 20, 2021, 10:33 p.m. UTC | #3
On 3/20/21 11:52 AM, Paolo Bonzini wrote:
>> +int main(void)
>> +{
>> +    return read_y();
>> +}
> 
> I think this should be "read_y() == 1 ? 0 : 1".

As a testcase returning 0 on success, yes.

> I can reproduce it with -flto -O2 but not without -flto, do you agree?

Agreed.  Replicated with a random recent gcc 11 snapshot.
This is really annoying of lto.  It's clear something needs to change though.

> Perhaps we can obtain the same optimization by wrapping reads of the page size 
> in an inline __attribute__((const)) function.  Richard, what do you think?

I'll give it a shot and see what happens.


r~
Gavin Shan March 20, 2021, 11:32 p.m. UTC | #4
Hi Thomas,

On 3/20/21 3:48 PM, Thomas Huth wrote:
> On 20/03/2021 05.27, Gavin Shan wrote:
>> It's still possible that the wrong value is returned from the alias
>> of variable even if the program can be compiled without issue. This
>> improves the check by executing the binary to check the result.
>>
>> If alias attribute can't be working properly, the @target_page in
>> exec-vary.c will always return zeroes when we have the following gcc
>> version.
>>
>>    # gcc --version
>>    gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0)
>>
>> This abstracts the code from exec-vary.c and use it as indicator to
>> enable gcc alias attribute or not.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   configure | 34 ++++++++++++++++++++++++++++++----
>>   1 file changed, 30 insertions(+), 4 deletions(-)
>>
>> diff --git a/configure b/configure
>> index f7d022a5db..8321f380d5 100755
>> --- a/configure
>> +++ b/configure
>> @@ -75,6 +75,7 @@ fi
>>   TMPB="qemu-conf"
>>   TMPC="${TMPDIR1}/${TMPB}.c"
>> +TMPC_B="${TMPDIR1}/${TMPB}_b.c"
>>   TMPO="${TMPDIR1}/${TMPB}.o"
>>   TMPCXX="${TMPDIR1}/${TMPB}.cxx"
>>   TMPE="${TMPDIR1}/${TMPB}.exe"
>> @@ -4878,13 +4879,38 @@ fi
>>   attralias=no
>>   cat > $TMPC << EOF
>> -int x = 1;
>> +static int x;
>>   extern const int y __attribute__((alias("x")));
>> -int main(void) { return 0; }
>> +extern int read_y(void);
>> +void write_x(int val);
>> +
>> +void write_x(int val)
>> +{
>> +    x = val;
>> +}
>> +
>> +int main(void)
>> +{
>> +    return read_y();
>> +}
>>   EOF
>> -if compile_prog "" "" ; then
>> -    attralias=yes
>> +cat > $TMPC_B << EOF
>> +extern const int y;
>> +extern void write_x(int val);
>> +int read_y(void);
>> +
>> +int read_y(void)
>> +{
>> +     write_x(1);
>> +     return y;
>> +}
>> +EOF
>> +
>> +TMPC+=" ${TMPC_B}"
>> +if compile_prog "" "" && ! $TMPE; then
> 
> What about cross-compiling? Running an executable won't work if QEMU gets cross-compiled...
> 

Executing the cross-compiled binary returns 126, which means we will
disable gcc alias attribute for cross-compiling case with the following
changes included into v2:

int main(void) { return (read_y() == 1) ? 0 : 1; }

if compile_prog "" "" && $TMPE >/dev/null 2>/dev/null; then
    attralias=yes
fi

Thanks,
Gavin
Gavin Shan March 20, 2021, 11:36 p.m. UTC | #5
Hi Paolo and Richard,

On 3/21/21 9:33 AM, Richard Henderson wrote:
> On 3/20/21 11:52 AM, Paolo Bonzini wrote:
>>> +int main(void)
>>> +{
>>> +    return read_y();
>>> +}
>>
>> I think this should be "read_y() == 1 ? 0 : 1".
> 
> As a testcase returning 0 on success, yes.
> 

Ok. I will include the changes in v2. Also, I will
wrap the lines, for example:

int main(void) { return (read_y() == 1) ? 0 : 1; }

if compile_prog "" "" && $TMPE >/dev/null 2>/dev/null; then
    attralias=yes
fi

>> I can reproduce it with -flto -O2 but not without -flto, do you agree?
> 
> Agreed.  Replicated with a random recent gcc 11 snapshot.
> This is really annoying of lto.  It's clear something needs to change though.
> 

The command I used is:

gcc -O2 -flto=auto config-temp.c config-temp-b.c -o config-temp.exe.

Removing "-O2" or "-flto=auto" can make the gcc alias attribute workable again.

>> Perhaps we can obtain the same optimization by wrapping reads of the page size in an inline __attribute__((const)) function.  Richard, what do you think?
> 
> I'll give it a shot and see what happens.
> 

Thanks,
Gavin
Richard Henderson March 21, 2021, 3:49 p.m. UTC | #6
On 3/20/21 4:33 PM, Richard Henderson wrote:
> On 3/20/21 11:52 AM, Paolo Bonzini wrote:
>>> +int main(void)
>>> +{
>>> +    return read_y();
>>> +}
>>
>> I think this should be "read_y() == 1 ? 0 : 1".
> 
> As a testcase returning 0 on success, yes.
> 
>> I can reproduce it with -flto -O2 but not without -flto, do you agree?
> 
> Agreed.  Replicated with a random recent gcc 11 snapshot.
> This is really annoying of lto.  It's clear something needs to change though.
> 
>> Perhaps we can obtain the same optimization by wrapping reads of the page 
>> size in an inline __attribute__((const)) function.  Richard, what do you think?
> 
> I'll give it a shot and see what happens.

What exact version of gcc are you guys using?  Something from rawhide that I 
can just install?

So far I have failed to compile with gcc master with --enable-lto.  Lots of 
missing symbols reported at link time.  Therefore I've been unable to actually 
test what I intended to test.

That said, I'm not hopeful that __attribute__((const)) will work.  I have an 
idea that early inlining will inline tiny function calls too soon, before the 
attribute has a change to DTRT during CSE.  At which point we're left with bare 
variable references and we're exactly where we would be if we did nothing 
special at all.

Another workaround may be to avoid compiling exec-vary.c with -flto.  I'm not 
sure that my meson fu is up to that.  Paolo?

I have filed a gcc bug report:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696

Hopefully someone can address that before gcc 11 gets released.  At which point 
we need do nothing in qemu.  Aldy?


r~
Paolo Bonzini March 21, 2021, 4:50 p.m. UTC | #7
Il dom 21 mar 2021, 16:49 Richard Henderson <richard.henderson@linaro.org>
ha scritto:

> What exact version of gcc are you guys using?  Something from rawhide that
> I can just install?
>

I am using Fedora 34. I upgraded just to test this bug and it seems stable
except that GNOME Shell extensions need an upgrade. However I haven't tried
building all of QEMU, only the test case.

So far I have failed to compile with gcc master with --enable-lto.  Lots of
> missing symbols reported at link time.  Therefore I've been unable to
> actually
> test what I intended to test.
>
> That said, I'm not hopeful that __attribute__((const)) will work.  I have
> an
> idea that early inlining will inline tiny function calls too soon, before
> the
> attribute has a change to DTRT during CSE.


Yeah that's at least plausible.

Another workaround may be to avoid compiling exec-vary.c with -flto.  I'm
> not
> sure that my meson fu is up to that.  Paolo?
>

You would have to define a static library.

I have filed a gcc bug report:
>
>    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696
>
> Hopefully someone can address that before gcc 11 gets released.  At which
> point we need do nothing in qemu.  Aldy?


Good point, I can give it a shot too just to see how rusty I am... That
would be the best outcome, though we would have to check LLVM as well. If
const doesn't work it would indeed be prudent to include Gavin's configure
check.

Paolo


>
> r~
>
>
Richard Henderson March 21, 2021, 5:34 p.m. UTC | #8
On 3/21/21 10:50 AM, Paolo Bonzini wrote:
>     Another workaround may be to avoid compiling exec-vary.c with -flto.  I'm not
>     sure that my meson fu is up to that.  Paolo?
> 
> You would have to define a static library.

Ok.  With an extra -fno-lto flag, or can I somehow remove -flto from the 
library's cflags?  Or unset the meson b_lto variable?

>     I have filed a gcc bug report:
> 
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696
>     <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696>
> 
>     Hopefully someone can address that before gcc 11 gets released.  At which
>     point we need do nothing in qemu.  Aldy?
> 
> 
> Good point, I can give it a shot too just to see how rusty I am... That would 
> be the best outcome, though we would have to check LLVM as well. If const 
> doesn't work it would indeed be prudent to include Gavin's configure check.

So, I've reproduced the testcase failure with gcc 9.3 (ubuntu 20.04) as well. 
Which means that there are at least two releases for which this has not worked.

I think Gavin's runtime test is unnecessary.  We don't have to check the 
runtime results, we can just [ "$lto" = true ], and we fairly well know it'll fail.


r~
Paolo Bonzini March 21, 2021, 5:43 p.m. UTC | #9
Il dom 21 mar 2021, 18:34 Richard Henderson <richard.henderson@linaro.org>
ha scritto:

> On 3/21/21 10:50 AM, Paolo Bonzini wrote:
> >     Another workaround may be to avoid compiling exec-vary.c with
> -flto.  I'm not
> >     sure that my meson fu is up to that.  Paolo?
> >
> > You would have to define a static library.
>
> Ok.  With an extra -fno-lto flag, or can I somehow remove -flto from the
> library's cflags?  Or unset the meson b_lto variable?
>

-fno-lto should work, yes. b_lto tries to cater to other compilers, but we
don't support anything but gcc-like drivers.

>     I have filed a gcc bug report:
> >
> >     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696
> >
> >     Hopefully someone can address that before gcc 11 gets released.  At
> which
> >     point we need do nothing in qemu.  Aldy?
>
> So, I've reproduced the testcase failure with gcc 9.3 (ubuntu 20.04) as
> well.
> Which means that there are at least two releases for which this has not
> worked.
>
> I think Gavin's runtime test is unnecessary.  We don't have to check the
> runtime results, we can just [ "$lto" = true ], and we fairly well know
> it'll fail.
>

Yeah, if anything the test can be used to re-enable attribute((alias)) once
we know there are some fixed compilers. (Though it's quite ugly to have
worse compilation when cross-compiling).

Paolo


>
> r~
>
>
Paolo Bonzini March 21, 2021, 5:46 p.m. UTC | #10
HRM, what about biting the bullet and making exec-vary.c a C++ source?...
Then instead of making it conditional an attribute((alias)), we make it
conditional on having a C++ compiler.

Making cpu-all.h compile as C++ would be complex, but we can isolate all
the required declarations in a separate header file that does not need
either osdep.h or glib-compat.h, and basically just have a global
constructor in exec-vary.cc that forwards to a function in exec.c.

Paolo

Il dom 21 mar 2021, 18:43 Paolo Bonzini <pbonzini@redhat.com> ha scritto:

>
>
> Il dom 21 mar 2021, 18:34 Richard Henderson <richard.henderson@linaro.org>
> ha scritto:
>
>> On 3/21/21 10:50 AM, Paolo Bonzini wrote:
>> >     Another workaround may be to avoid compiling exec-vary.c with
>> -flto.  I'm not
>> >     sure that my meson fu is up to that.  Paolo?
>> >
>> > You would have to define a static library.
>>
>> Ok.  With an extra -fno-lto flag, or can I somehow remove -flto from the
>> library's cflags?  Or unset the meson b_lto variable?
>>
>
> -fno-lto should work, yes. b_lto tries to cater to other compilers, but we
> don't support anything but gcc-like drivers.
>
> >     I have filed a gcc bug report:
>> >
>> >     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696
>> >
>> >     Hopefully someone can address that before gcc 11 gets released.  At
>> which
>> >     point we need do nothing in qemu.  Aldy?
>>
>> So, I've reproduced the testcase failure with gcc 9.3 (ubuntu 20.04) as
>> well.
>> Which means that there are at least two releases for which this has not
>> worked.
>>
>> I think Gavin's runtime test is unnecessary.  We don't have to check the
>> runtime results, we can just [ "$lto" = true ], and we fairly well know
>> it'll fail.
>>
>
> Yeah, if anything the test can be used to re-enable attribute((alias))
> once we know there are some fixed compilers. (Though it's quite ugly to
> have worse compilation when cross-compiling).
>
> Paolo
>
>
>>
>> r~
>>
>>
Richard Henderson March 21, 2021, 6:23 p.m. UTC | #11
On 3/21/21 11:46 AM, Paolo Bonzini wrote:
> HRM, what about biting the bullet and making exec-vary.c a C++ source?... Then 
> instead of making it conditional an attribute((alias)), we make it conditional 
> on having a C++ compiler.

Doesn't help.  The gcc bug I filed talks about c++, because that's the closest 
analogy.

But set_preferred_target_page_bits is called *much* later than a constructor. 
Though still before any use of the variable in question, for which we have an 
--enable-debug-tcg assertion.


r~
Gavin Shan March 22, 2021, 10:54 a.m. UTC | #12
Hi Richard and Paolo,

On 3/22/21 5:23 AM, Richard Henderson wrote:
> On 3/21/21 11:46 AM, Paolo Bonzini wrote:
>> HRM, what about biting the bullet and making exec-vary.c a C++ source?... Then instead of making it conditional an attribute((alias)), we make it conditional on having a C++ compiler.
> 
> Doesn't help.  The gcc bug I filed talks about c++, because that's the closest analogy.
> 
> But set_preferred_target_page_bits is called *much* later than a constructor. Though still before any use of the variable in question, for which we have an --enable-debug-tcg assertion.
> 

It looks this issue can be avoided after "volatile" is applied to
@target_page. However, I'm not sure if it's the correct fix to have.
If it is, I can post a formal patch so that it can be included.

--- a/exec-vary.c
+++ b/exec-vary.c
-extern const TargetPageBits target_page
+extern const TargetPageBits volatile target_page
      __attribute__((alias("init_target_page")));
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
-extern const TargetPageBits target_page;
+extern const TargetPageBits volatile target_page;


According to the experiments I did, it would be function call
to set_preferred_target_page_bits() is dropped when the machine
is created. The following c files are used in the experiment:

--- a.c ---

static int x;
const extern int VOLATILE y __attribute__((alias("x")));
extern int read_y(void);

void write_x(int val) { x = 1; }
int main(void) { return read_y(); }

--- b.c---

extern const int VOLATILE y;
extern void write_x(int val);

int read_y(void) { write_x(1); return y; }

# gcc a.c b.c -O2 -flto=auto -DVOLATILE= -o a; ./a; echo $?
0
# gdb -nw ./a
(gdb) disassem main
Dump of assembler code for function main:
    0x0000000000400480 <+0>:	adrp	x1, 0x420000 <__libc_start_main@got.plt>
    0x0000000000400484 <+4>:	mov	w2, #0x1                   	// #1
    0x0000000000400488 <+8>:	mov	w0, #0x0                   	// #0
    0x000000000040048c <+12>:	str	w2, [x1, #32]
    0x0000000000400490 <+16>:	ret
End of assembler dump.

# gcc a.c b.c -O2 -flto=auto -DVOLATILE=volatile -o a; ./a; echo $?
1
# gdb -nw ./a
(gdb) disassem main
Dump of assembler code for function main:
    0x0000000000400480 <+0>:	adrp	x1, 0x420000 <__libc_start_main@got.plt>
    0x0000000000400484 <+4>:	mov	w2, #0x1                   	// #1
    0x0000000000400488 <+8>:	adrp	x0, 0x420000 <__libc_start_main@got.plt>
    0x000000000040048c <+12>:	str	w2, [x1, #32]
    0x0000000000400490 <+16>:	ldr	w0, [x0, #32]
    0x0000000000400494 <+20>:	ret


Thanks,
Gavin
Richard Henderson March 22, 2021, 8:59 p.m. UTC | #13
On 3/22/21 4:54 AM, Gavin Shan wrote:
> It looks this issue can be avoided after "volatile" is applied to
> @target_page. However, I'm not sure if it's the correct fix to have.

Certainly not.

That is the exact opposite of what we want.  We want to minimize the number of 
reads from the variable, not maximize them.


r~
Gavin Shan March 23, 2021, 3:13 a.m. UTC | #14
Hi Richard,

On 3/23/21 7:59 AM, Richard Henderson wrote:
> On 3/22/21 4:54 AM, Gavin Shan wrote:
>> It looks this issue can be avoided after "volatile" is applied to
>> @target_page. However, I'm not sure if it's the correct fix to have.
> 
> Certainly not.
> 
> That is the exact opposite of what we want.  We want to minimize the number of reads from the variable, not maximize them.
> 

Yes, It's something I was thinking of. "volatile" can make
@target_page visible to gcc, but maximizes the number of
reads. By the way, your patch to use "-fno-lto" worked for
me and it has been split into 3 patches by Phil. Richard,
thanks for the quick fixup :)

Thanks,
Gavin
diff mbox series

Patch

diff --git a/configure b/configure
index f7d022a5db..8321f380d5 100755
--- a/configure
+++ b/configure
@@ -75,6 +75,7 @@  fi
 
 TMPB="qemu-conf"
 TMPC="${TMPDIR1}/${TMPB}.c"
+TMPC_B="${TMPDIR1}/${TMPB}_b.c"
 TMPO="${TMPDIR1}/${TMPB}.o"
 TMPCXX="${TMPDIR1}/${TMPB}.cxx"
 TMPE="${TMPDIR1}/${TMPB}.exe"
@@ -4878,13 +4879,38 @@  fi
 
 attralias=no
 cat > $TMPC << EOF
-int x = 1;
+static int x;
 extern const int y __attribute__((alias("x")));
-int main(void) { return 0; }
+extern int read_y(void);
+void write_x(int val);
+
+void write_x(int val)
+{
+    x = val;
+}
+
+int main(void)
+{
+    return read_y();
+}
 EOF
-if compile_prog "" "" ; then
-    attralias=yes
+cat > $TMPC_B << EOF
+extern const int y;
+extern void write_x(int val);
+int read_y(void);
+
+int read_y(void)
+{
+     write_x(1);
+     return y;
+}
+EOF
+
+TMPC+=" ${TMPC_B}"
+if compile_prog "" "" && ! $TMPE; then
+   attralias=yes
 fi
+TMPC="${TMPDIR1}/${TMPB}.c"
 
 ########################################
 # check if getauxval is available.