[2/7] QEMU does not currently support host pages that are larger than guest pages, likely due to glibc using fixed mmap requests.

Message ID 5765E31F.2060806@raptorengineeringinc.com (mailing list archive)
State New, archived

Commit Message

Timothy Pearson June 19, 2016, 12:11 a.m. UTC
Attempting to use host pages larger than the guest leads to
alignment errors during ELF load in the best case, and an
initialization failure inside NPTL in the worst case, causing
all fork() requests inside the guest to fail.

Warn when thread space cannot be set up, and suggest reducing
host page size if applicable.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 linux-user/syscall.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

         } else {
             ret = -1;
         }
@@ -5514,10 +5519,20 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
                (not implemented) or having *_tidptr to point at a shared memory
                mapping.  We can't repeat the spinlock hack used above because
                the child process gets its own copy of the lock.  */
-            if (flags & CLONE_CHILD_SETTID)
-                put_user_u32(gettid(), child_tidptr);
-            if (flags & CLONE_PARENT_SETTID)
-                put_user_u32(gettid(), parent_tidptr);
+            if (flags & CLONE_CHILD_SETTID) {
+                if (put_user_u32(gettid(), child_tidptr)) {
+                    fprintf(stderr, "do_fork: put_user_u32() failed, child process state invalid\n");
+                    if (qemu_real_host_page_size > TARGET_PAGE_SIZE)
+                        fprintf(stderr, "do_fork: host page size > target page size; reduce host page size and try again\n");
+                }
+            }
+            if (flags & CLONE_PARENT_SETTID) {
+                if (put_user_u32(gettid(), parent_tidptr)) {
+                    fprintf(stderr, "do_fork: put_user_u32() failed, child process state invalid\n");
+                    if (qemu_real_host_page_size > TARGET_PAGE_SIZE)
+                        fprintf(stderr, "do_fork: host page size > target page size; reduce host page size and try again\n");
+                }
+            }
             ts = (TaskState *)cpu->opaque;
             if (flags & CLONE_SETTLS)
                 cpu_set_tls (env, newtls);
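
The same two-line warning appears at every put_user_u32() call site in this patch. One possible way to factor it out is sketched below. This is only an illustration and is not part of the posted patch: the helper name warn_settid_failure() and its parameters are hypothetical, and the page sizes are passed in explicitly so that the sketch compiles outside the QEMU tree.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the posted patch): emit the diagnostics the
 * patch prints when storing the TID into guest memory fails.
 * Returns the number of lines written so callers and tests can check it.
 */
static inline int warn_settid_failure(FILE *out, size_t host_page_size,
                                      size_t target_page_size)
{
    int lines = 0;

    fprintf(out, "do_fork: put_user_u32() failed, "
                 "child process state invalid\n");
    lines++;

    /* Only suggest the workaround when the page sizes actually differ. */
    if (host_page_size > target_page_size) {
        fprintf(out, "do_fork: host page size > target page size; "
                     "reduce host page size and try again\n");
        lines++;
    }
    return lines;
}
```

Inside do_fork() each call site would then shrink to something like `if (put_user_u32(ret, parent_tidptr)) warn_settid_failure(stderr, qemu_real_host_page_size, TARGET_PAGE_SIZE);`, assuming those two identifiers are in scope as they are in the patch.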

Comments

Peter Maydell June 19, 2016, 9:46 a.m. UTC | #1
On 19 June 2016 at 01:11, Timothy Pearson
<tpearson@raptorengineering.com> wrote:
> Attempting to use host pages larger than the guest leads to
> alignment errors during ELF load in the best case, and an
> initialization failure inside NPTL in the worst case, causing
> all fork() requests inside the guest to fail.
>
> Warn when thread space cannot be set up, and suggest reducing
> host page size if applicable.

This is supposed to work -- for instance the linux-user/mmap.c
code has support for host pages and target pages not being the same.
In particular for ARM guests TARGET_PAGE_SIZE is 1K but the
host page size is 4K, so the config of "host page larger than
guest" isn't untested.
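
To make the page-size mismatch concrete, here is a minimal sketch of the arithmetic involved when a guest range has to be expressed in host pages. It is not taken from linux-user/mmap.c; the function names are made up for illustration, and the 1K/4K figures are just the ARM example above.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to the start of the page containing it.
 * page must be a non-zero power of two. */
static inline uint64_t page_align_down(uint64_t addr, uint64_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return addr & ~(page - 1);
}

/* Round addr up to the next page boundary (unchanged if already aligned). */
static inline uint64_t page_align_up(uint64_t addr, uint64_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (addr + page - 1) & ~(page - 1);
}

/* Number of host pages touched by the guest range [start, start + len). */
static inline uint64_t host_pages_spanned(uint64_t start, uint64_t len,
                                          uint64_t host_page)
{
    uint64_t first = page_align_down(start, host_page);
    uint64_t last = page_align_up(start + len, host_page);
    return (last - first) / host_page;
}
```

With 1K guest pages and 4K host pages, a single guest page at 0x10400 sits in the middle of the host page at 0x10000, so any protection or mapping change for it also affects the three neighbouring guest pages unless the emulator handles the partial host page specially.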

> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -5482,8 +5482,13 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
>              /* Wait for the child to initialize.  */
>              pthread_cond_wait(&info.cond, &info.mutex);
>              ret = info.tid;
> -            if (flags & CLONE_PARENT_SETTID)
> -                put_user_u32(ret, parent_tidptr);
> +            if (flags & CLONE_PARENT_SETTID) {
> +                if (put_user_u32(ret, parent_tidptr)) {
> +                    fprintf(stderr, "do_fork: put_user_u32() failed, child process state invalid\n");
> +                    if (qemu_real_host_page_size > TARGET_PAGE_SIZE)
> +                        fprintf(stderr, "do_fork: host page size > target page size; reduce host page size and try again\n");
> +                }
> +            }

I think we should figure out why these put_user_u32() calls
are failing and fix them.

thanks
-- PMM

Richard Henderson June 19, 2016, 6:24 p.m. UTC | #2
On 06/19/2016 02:46 AM, Peter Maydell wrote:
> On 19 June 2016 at 01:11, Timothy Pearson
> <tpearson@raptorengineering.com> wrote:
>> Attempting to use host pages larger than the guest leads to
>> alignment errors during ELF load in the best case, and an
>> initialization failure inside NPTL in the worst case, causing
>> all fork() requests inside the guest to fail.
>>
>> Warn when thread space cannot be set up, and suggest reducing
>> host page size if applicable.
>
> This is supposed to work -- for instance the linux-user/mmap.c
> code has support for host pages and target pages not being the same.
> In particular for ARM guests TARGET_PAGE_SIZE is 1K but the
> host page size is 4K, so the config of "host page larger than
> guest" isn't untested.

You're right, it isn't untested, it's completely broken.

The arm 1k example isn't really realistic, because the actual arm kernel doles 
out 4k units, and that's what userspace was built to expect.  Try running i386 
on aarch64 (or ppc64) with a 64k page kernel.  Maps fail very very quickly.

This is one of the reasons why I think there's no way to do such emulation 
without softmmu for linux-user.
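
The basic failure can be reproduced outside QEMU. The sketch below is a standalone illustration under stated assumptions, not QEMU code: try_fixed_map() is a made-up name, and the behaviour described is that of Linux, where mmap() with MAP_FIXED rejects an address that is not a multiple of the host page size. A guest built for 4k pages that asks for a fixed mapping at a 4k-aligned address is in exactly this position on a 64k host whenever that address is not also 64k-aligned.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Try a MAP_FIXED mapping at (reserved base + offset), where offset is a
 * byte offset into a host page. Returns 0 on success, or the errno value
 * reported by mmap() on failure.
 */
static inline int try_fixed_map(size_t offset)
{
    long host_page = sysconf(_SC_PAGESIZE);
    if (host_page <= 0) {
        return EINVAL;
    }

    /* Reserve two host pages so the target range is ours to overwrite. */
    size_t span = 2 * (size_t)host_page;
    char *base = mmap(NULL, span, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return errno;
    }

    int err = 0;
    void *p = mmap(base + offset, (size_t)host_page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        err = errno;
    }

    /* Release the whole reservation, including any fixed mapping inside. */
    munmap(base, span);
    return err;
}
```

Calling try_fixed_map(0) should succeed, while try_fixed_map(host_page / 2) is expected to fail with EINVAL on Linux, because the requested address is aligned to a smaller page size than the host's.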


r~

Peter Maydell June 20, 2016, 1:25 p.m. UTC | #3
On 19 June 2016 at 19:24, Richard Henderson <rth@twiddle.net> wrote:
> On 06/19/2016 02:46 AM, Peter Maydell wrote:
>> This is supposed to work -- for instance the linux-user/mmap.c
>> code has support for host pages and target pages not being the same.
>> In particular for ARM guests TARGET_PAGE_SIZE is 1K but the
>> host page size is 4K, so the config of "host page larger than
>> guest" isn't untested.
>
>
> You're right, it isn't untested, it's completely broken.

:-(

Is there a way we can warn about this which doesn't give
unnecessary warnings for:
 (a) the ARM case
 (b) things which don't actually care (I'm thinking the
     gcc test suite and similar stuff which doesn't actually
     mess with mmap in practice)

?

thanks
-- PMM

Richard Henderson June 22, 2016, 12:55 a.m. UTC | #4
On 06/20/2016 06:25 AM, Peter Maydell wrote:
> On 19 June 2016 at 19:24, Richard Henderson <rth@twiddle.net> wrote:
>> On 06/19/2016 02:46 AM, Peter Maydell wrote:
>>> This is supposed to work -- for instance the linux-user/mmap.c
>>> code has support for host pages and target pages not being the same.
>>> In particular for ARM guests TARGET_PAGE_SIZE is 1K but the
>>> host page size is 4K, so the config of "host page larger than
>>> guest" isn't untested.
>>
>>
>> You're right, it isn't untested, it's completely broken.
>
> :-(
>
> Is there a way we can warn about this which doesn't give
> unnecessary warnings for:
>  (a) the ARM case
>  (b) things which don't actually care (I'm thinking the
>      gcc test suite and similar stuff which doesn't actually
>      mess with mmap in practice)

Not that I'm aware of.

What's worse is that even the gcc testsuite winds up caring, because the guest
binary can't even be mapped, much less run far enough to get to a mmap syscall.
See e.g. that aarch64 64k example, and look at our standard linux-user-0.3 tests.


r~

Patch

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1c17b74..2968b57 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5482,8 +5482,13 @@  static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
             /* Wait for the child to initialize.  */
             pthread_cond_wait(&info.cond, &info.mutex);
             ret = info.tid;
-            if (flags & CLONE_PARENT_SETTID)
-                put_user_u32(ret, parent_tidptr);
+            if (flags & CLONE_PARENT_SETTID) {
+                if (put_user_u32(ret, parent_tidptr)) {
+                    fprintf(stderr, "do_fork: put_user_u32() failed, child process state invalid\n");
+                    if (qemu_real_host_page_size > TARGET_PAGE_SIZE)
+                        fprintf(stderr, "do_fork: host page size > target page size; reduce host page size and try again\n");
+                }
+            }