diff mbox series

linux-user/syscall: Silence warning from the undefined behavior sanitizer

Message ID 20210211132959.574168-1-thuth@redhat.com (mailing list archive)
State New, archived
Headers show
Series linux-user/syscall: Silence warning from the undefined behavior sanitizer | expand

Commit Message

Thomas Huth Feb. 11, 2021, 1:29 p.m. UTC
When compiling QEMU with -fsanitize=undefined, there is a warning when
running "make check-tcg":

  TEST    linux-test on m68k
../linux-user/syscall.c:10499:34: runtime error: member access within
 misaligned address 0x00008006df3c for type 'struct linux_dirent64',
 which requires 8 byte alignment
0x00008006df3c: note: pointer points here
  00 00 00 00 68 03 28 00  00 00 00 00 5b 96 3e e4  61 4b 05 26 18 00 04 2e  00 00 00 00 da 3f 18 00
              ^

It's likely not an issue in reality, since I assume that on hosts where
the alignment really matters (like sparc64), the Linux kernel likely
adds the right padding. Anyway, let's use the stw_p() / stq_p() accessor
helpers here to silence the warning and thus to allow to compile the code
with -fsanitize=undefined, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 linux-user/syscall.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

Comments

Laurent Vivier Feb. 11, 2021, 9:28 p.m. UTC | #1
Le 11/02/2021 à 14:29, Thomas Huth a écrit :
> When compiling QEMU with -fsanitize=undefined, there is a warning when
> running "make check-tcg":
> 
>   TEST    linux-test on m68k
> ../linux-user/syscall.c:10499:34: runtime error: member access within
>  misaligned address 0x00008006df3c for type 'struct linux_dirent64',
>  which requires 8 byte alignment
> 0x00008006df3c: note: pointer points here
>   00 00 00 00 68 03 28 00  00 00 00 00 5b 96 3e e4  61 4b 05 26 18 00 04 2e  00 00 00 00 da 3f 18 00
>               ^
> 
> It's likely not an issue in reality, since I assume that on hosts where
> the alignment really matters (like sparc64), the Linux kernel likely
> adds the right padding. Anyway, let's use the stw_p() / stq_p() accessor
> helpers here to silence the warning and thus to allow to compile the code
> with -fsanitize=undefined, too.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  linux-user/syscall.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 34760779c8..50de535ade 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -10491,20 +10491,22 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>                  return -TARGET_EFAULT;
>              ret = get_errno(sys_getdents64(arg1, dirp, count));
>              if (!is_error(ret)) {
> -                struct linux_dirent64 *de;
> +                char *de;
>                  int len = ret;
>                  int reclen;
> -                de = dirp;
> +                de = (char *)dirp;
> +                #define de64(x) offsetof(struct linux_dirent64, x)

Do we really need the cast to the "(char *)"?

can't we use "&de->XXX" with the accessors?
We don't access the memory, only read the address, the compiler should be happy.


>                  while (len > 0) {
> -                    reclen = de->d_reclen;
> +                    reclen = lduw_he_p(de + de64(d_reclen));

to avoid human error, it would be better to let the compiler take the good accessor:

 ldn_he_p(&de->d_reclen, sizeof(de->d_reclen))

>                      if (reclen > len)
>                          break;
> -                    de->d_reclen = tswap16(reclen);
> -                    tswap64s((uint64_t *)&de->d_ino);
> -                    tswap64s((uint64_t *)&de->d_off);
> -                    de = (struct linux_dirent64 *)((char *)de + reclen);
> +                    stw_p(de + de64(d_reclen), reclen);
> +                    stq_p(de + de64(d_ino), ldq_he_p(de + de64(d_ino)));
> +                    stq_p(de + de64(d_off), ldq_he_p(de + de64(d_off)));

and stwn_he_p() here too.

> +                    de += reclen;
>                      len -= reclen;
>                  }
> +                #undef de64
>              }
>              unlock_user(dirp, arg2, ret);
>          }
> 

Thank you Thomas for your help.

Laurent
Laurent Vivier Feb. 12, 2021, 12:35 a.m. UTC | #2
Le 11/02/2021 à 22:28, Laurent Vivier a écrit :
> Le 11/02/2021 à 14:29, Thomas Huth a écrit :
>> When compiling QEMU with -fsanitize=undefined, there is a warning when
>> running "make check-tcg":
>>
>>   TEST    linux-test on m68k
>> ../linux-user/syscall.c:10499:34: runtime error: member access within
>>  misaligned address 0x00008006df3c for type 'struct linux_dirent64',
>>  which requires 8 byte alignment
>> 0x00008006df3c: note: pointer points here
>>   00 00 00 00 68 03 28 00  00 00 00 00 5b 96 3e e4  61 4b 05 26 18 00 04 2e  00 00 00 00 da 3f 18 00
>>               ^
>>
>> It's likely not an issue in reality, since I assume that on hosts where
>> the alignment really matters (like sparc64), the Linux kernel likely
>> adds the right padding. Anyway, let's use the stw_p() / stq_p() accessor
>> helpers here to silence the warning and thus to allow to compile the code
>> with -fsanitize=undefined, too.

Wait... if the alignment differs between m68k and the host, I guess the size of the structure differs?

In this case we cannot use the guest memory to call the host syscall, we must allocate a host
structure and copy the values into the guest structure.

Thanks,
Laurent


>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  linux-user/syscall.c | 16 +++++++++-------
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index 34760779c8..50de535ade 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -10491,20 +10491,22 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>>                  return -TARGET_EFAULT;
>>              ret = get_errno(sys_getdents64(arg1, dirp, count));
>>              if (!is_error(ret)) {
>> -                struct linux_dirent64 *de;
>> +                char *de;
>>                  int len = ret;
>>                  int reclen;
>> -                de = dirp;
>> +                de = (char *)dirp;
>> +                #define de64(x) offsetof(struct linux_dirent64, x)
> 
> Do we really need the cast to the "(char *)"?
> 
> can't we use "&de->XXX" with the accessors?
> We don't access the memory, only read the address, the compiler should be happy.
> 
> 
>>                  while (len > 0) {
>> -                    reclen = de->d_reclen;
>> +                    reclen = lduw_he_p(de + de64(d_reclen));
> 
> to avoid human error, it would be better to let the compiler take the good accessor:
> 
>  ldn_he_p(&de->d_reclen, sizeof(de->d_reclen))
> 
>>                      if (reclen > len)
>>                          break;
>> -                    de->d_reclen = tswap16(reclen);
>> -                    tswap64s((uint64_t *)&de->d_ino);
>> -                    tswap64s((uint64_t *)&de->d_off);
>> -                    de = (struct linux_dirent64 *)((char *)de + reclen);
>> +                    stw_p(de + de64(d_reclen), reclen);
>> +                    stq_p(de + de64(d_ino), ldq_he_p(de + de64(d_ino)));
>> +                    stq_p(de + de64(d_off), ldq_he_p(de + de64(d_off)));
> 
> and stwn_he_p() here too.
> 
>> +                    de += reclen;
>>                      len -= reclen;
>>                  }
>> +                #undef de64
>>              }
>>              unlock_user(dirp, arg2, ret);
>>          }
>>
> 
> Thank you Thomas for your help.
> 
> Laurent
>
Thomas Huth Feb. 12, 2021, 7:45 a.m. UTC | #3
On 11/02/2021 22.28, Laurent Vivier wrote:
> Le 11/02/2021 à 14:29, Thomas Huth a écrit :
>> When compiling QEMU with -fsanitize=undefined, there is a warning when
>> running "make check-tcg":
>>
>>    TEST    linux-test on m68k
>> ../linux-user/syscall.c:10499:34: runtime error: member access within
>>   misaligned address 0x00008006df3c for type 'struct linux_dirent64',
>>   which requires 8 byte alignment
>> 0x00008006df3c: note: pointer points here
>>    00 00 00 00 68 03 28 00  00 00 00 00 5b 96 3e e4  61 4b 05 26 18 00 04 2e  00 00 00 00 da 3f 18 00
>>                ^
>>
>> It's likely not an issue in reality, since I assume that on hosts where
>> the alignment really matters (like sparc64), the Linux kernel likely
>> adds the right padding. Anyway, let's use the stw_p() / stq_p() accessor
>> helpers here to silence the warning and thus to allow to compile the code
>> with -fsanitize=undefined, too.
>>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>   linux-user/syscall.c | 16 +++++++++-------
>>   1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index 34760779c8..50de535ade 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -10491,20 +10491,22 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>>                   return -TARGET_EFAULT;
>>               ret = get_errno(sys_getdents64(arg1, dirp, count));
>>               if (!is_error(ret)) {
>> -                struct linux_dirent64 *de;
>> +                char *de;
>>                   int len = ret;
>>                   int reclen;
>> -                de = dirp;
>> +                de = (char *)dirp;
>> +                #define de64(x) offsetof(struct linux_dirent64, x)
> 
> Do we really need the cast to the "(char *)"?
> 
> can't we use "&de->XXX" with the accessors?
> We don't access the memory, only read the address, the compiler should be happy.

That's what I thought and tried first, too. Unfortunately, it did not help 
to fix the issue, I had to take the detour via the char*

I guess the compiler also checks the alignment of the pointer when it gets 
assigned to the next record below ("de = ...").

  Thomas


> 
>>                   while (len > 0) {
>> -                    reclen = de->d_reclen;
>> +                    reclen = lduw_he_p(de + de64(d_reclen));
> 
> to avoid human error, it would be better to let the compiler take the good accessor:
> 
>   ldn_he_p(&de->d_reclen, sizeof(de->d_reclen))
> 
>>                       if (reclen > len)
>>                           break;
>> -                    de->d_reclen = tswap16(reclen);
>> -                    tswap64s((uint64_t *)&de->d_ino);
>> -                    tswap64s((uint64_t *)&de->d_off);
>> -                    de = (struct linux_dirent64 *)((char *)de + reclen);
>> +                    stw_p(de + de64(d_reclen), reclen);
>> +                    stq_p(de + de64(d_ino), ldq_he_p(de + de64(d_ino)));
>> +                    stq_p(de + de64(d_off), ldq_he_p(de + de64(d_off)));
> 
> and stwn_he_p() here too.
> 
>> +                    de += reclen;
>>                       len -= reclen;
>>                   }
>> +                #undef de64
>>               }
>>               unlock_user(dirp, arg2, ret);
>>           }
>>
> 
> Thank you Thomas for your help.
> 
> Laurent
>
Thomas Huth Feb. 12, 2021, 7:56 a.m. UTC | #4
On 12/02/2021 01.35, Laurent Vivier wrote:
> Le 11/02/2021 à 22:28, Laurent Vivier a écrit :
>> Le 11/02/2021 à 14:29, Thomas Huth a écrit :
>>> When compiling QEMU with -fsanitize=undefined, there is a warning when
>>> running "make check-tcg":
>>>
>>>    TEST    linux-test on m68k
>>> ../linux-user/syscall.c:10499:34: runtime error: member access within
>>>   misaligned address 0x00008006df3c for type 'struct linux_dirent64',
>>>   which requires 8 byte alignment
>>> 0x00008006df3c: note: pointer points here
>>>    00 00 00 00 68 03 28 00  00 00 00 00 5b 96 3e e4  61 4b 05 26 18 00 04 2e  00 00 00 00 da 3f 18 00
>>>                ^
>>>
>>> It's likely not an issue in reality, since I assume that on hosts where
>>> the alignment really matters (like sparc64), the Linux kernel likely
>>> adds the right padding. Anyway, let's use the stw_p() / stq_p() accessor
>>> helpers here to silence the warning and thus to allow to compile the code
>>> with -fsanitize=undefined, too.
> 
> Wait... if the alignment differs between m68k and the host, I guess the size of the structure differs?

No, as far as I understand, the size and layout of the structure are the 
same on all architecture. The problem is that the "dirp = 
lock_user(VERIFY_WRITE, arg2, count, 0)" already ends up in a pointer that 
is only aligned to a 4-byte boundary. Since the m68k code is only restricted 
to a 4-byte alignment, arg2 is only aligned to 4 bytes. But if the host 
needs 8-byte alignment for the struct, we've certainly lost here.

Having said that, I think my patch is still wrong. It silences the ubsan 
warnings, but it won't fix the problem when the code is e.g. running on a 
sparc64 host. There we likely need to make sure that the buffer on the host 
is already aligned to an 8-byte boundary when doing the sys_getdents64() 
call to the host kernel.
So I guess we need a bounce buffer here anyway?

  Thomas
diff mbox series

Patch

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 34760779c8..50de535ade 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -10491,20 +10491,22 @@  static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
                 return -TARGET_EFAULT;
             ret = get_errno(sys_getdents64(arg1, dirp, count));
             if (!is_error(ret)) {
-                struct linux_dirent64 *de;
+                char *de;
                 int len = ret;
                 int reclen;
-                de = dirp;
+                de = (char *)dirp;
+                #define de64(x) offsetof(struct linux_dirent64, x)
                 while (len > 0) {
-                    reclen = de->d_reclen;
+                    reclen = lduw_he_p(de + de64(d_reclen));
                     if (reclen > len)
                         break;
-                    de->d_reclen = tswap16(reclen);
-                    tswap64s((uint64_t *)&de->d_ino);
-                    tswap64s((uint64_t *)&de->d_off);
-                    de = (struct linux_dirent64 *)((char *)de + reclen);
+                    stw_p(de + de64(d_reclen), reclen);
+                    stq_p(de + de64(d_ino), ldq_he_p(de + de64(d_ino)));
+                    stq_p(de + de64(d_off), ldq_he_p(de + de64(d_off)));
+                    de += reclen;
                     len -= reclen;
                 }
+                #undef de64
             }
             unlock_user(dirp, arg2, ret);
         }