diff mbox

QEMU postcopy-test failing on ppc64

Message ID 81408cfa-c1a2-ce87-8a31-4ae94d49fbfe@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Thomas Huth Nov. 15, 2016, 6:48 p.m. UTC
On 15.11.2016 19:13, Greg Kurz wrote:
> On Tue, 15 Nov 2016 17:43:39 +0000
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
>> * Thomas Huth (thuth@redhat.com) wrote:
>>> On 15.11.2016 16:18, Stefan Hajnoczi wrote:  
>>>> On Tue, Nov 15, 2016 at 2:56 PM, Greg Kurz <groug@kaod.org> wrote:  
>>>>> On Tue, 15 Nov 2016 11:14:57 +0000
>>>>> Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>>>>  
>>>>>> On Tue, Nov 15, 2016 at 10:09 AM, Greg Kurz <groug@kaod.org> wrote:  
>>>>>>> On Tue, 15 Nov 2016 10:53:35 +0100
>>>>>>> Laurent Vivier <lvivier@redhat.com> wrote:
>>>>>>>  
>>>>>>>> On 14/11/2016 21:52, Stefan Hajnoczi wrote:  
>>>>>>>>> I hit a failure running "make check" on ppc64 for the first time.  Ideas?
>>>>>>>>>
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>> commit 682df581c65ed2c1b9e77093e332214ecaa1ee93
>>>>>>>>>
>>>>>>>>>   GTESTER check-qtest-ppc64
>>>>>>>>> Memory content inconsistency at 5af4000 first_byte = 1b last_byte = 1a
>>>>>>>>> current = 7c hit_edge = 1
>>>>>>>>> Memory content inconsistency at 5af5000 first_byte = 1b last_byte = 7c
>>>>>>>>> current = 1b hit_edge = 1
>>>>>>>>> Memory content inconsistency at 5e59000 first_byte = 1b last_byte = 1b
>>>>>>>>> current = 1a hit_edge = 1
>>>>>>>>> **
>>>>>>>>> ERROR:tests/postcopy-test.c:345:check_guests_ram: 'bad' should be FALSE
>>>>>>>>> GTester: last random seed: R02S9d79166a1ca7e21940a0f4b0b1255d5b
>>>>>>>>>  
>>>>>>>>
>>>>>>>> Are you using KVM PR?
>>>>>>>>
>>>>>>>> it  was working fine with TCG and KVM HV.
>>>>>>>>
>>>>>>>> Apparently, USERFAULTFD doesn't work with KVM PR.
>>>>>>>>
>>>>>>>> I've already seen this kind of error with nested KVM on Power:
>>>>>>>> guest in guest with KVM PR in host.
>>>>>>>>
>>>>>>>> This problem was reported on IRC by Greg if I remember correctly (CC:)
>>>>>>>>  
>>>>>>>
>>>>>>> Yeah I hit this when running make check in a PPC64 BE guest which
>>>>>>> has kvm_pr loaded. I did not find time to investigate though... I've
>>>>>>> switched to run make check on bare metal POWER7 instead.  
>>>>>>
>>>>>> Right, it's POWER7 PPC64 BE with kvm_pr.
>>>>>>  
>>>>>
>>>>> And you cannot install a PPC64 BE distro on this POWER7 host and use
>>>>> kvm_hv instead ?  
>>>>
>>>> Sure, I could get a non-kvm_pr environment.  I'm just concerned that
>>>> QEMU 2.8-rc0 is being tagged today and there is a POWER environment
>>>> where "make check" fails.
>>>>
>>>> Is kvm_pr supported by QEMU?  
>>>
>>> Basically yes, it's just like Greg said in another mail, it does not get
>>> much attention these days.
>>>   
>>>> If it is supported then "make check" ought to pass.  
>>>
>>> OK, since nobody got a really, really good idea how to fix this so far:
>>> What about changing the QEMU test to use only tcg for now, so we've got
>>> a clean release with QEMU v2.8? And then afterwards, we can either fix
>>> kvm-pr in the kernel, or introduce some other mechanism to select
>>> whether the test should be run with KVM or not (either a detection of
>>> the KVM type, or an environment variable or something similar).  
>>
>> I'm OK if you want to do that for Power; I'd prefer it kept preferring KVM on
>> x86.
> 
> Even for Power, I'd prefer to keep KVM since the problem only happens with
> KVM PR which isn't the preferred way to do KVM on bare metal... until this
> get fixed, I'd rather suggest people to run make check with KVM HV.

OK ... what do you think about a patch like this:


That way, accel=kvm:tcg is only used if the kvm_hv module is loaded,
otherwise it will use accel=tcg instead.

 Thomas

Comments

Eric Blake Nov. 15, 2016, 7 p.m. UTC | #1
On 11/15/2016 12:48 PM, Thomas Huth wrote:

>> Even for Power, I'd prefer to keep KVM since the problem only happens with
>> KVM PR which isn't the preferred way to do KVM on bare metal... until this
>> get fixed, I'd rather suggest people to run make check with KVM HV.
> 
> OK ... what do you think about a patch like this:
> 
> diff --git a/tests/postcopy-test.c b/tests/postcopy-test.c
> --- a/tests/postcopy-test.c
> +++ b/tests/postcopy-test.c
> @@ -380,17 +380,19 @@ static void test_migrate(void)
>                                    " -incoming %s",
>                                    tmpfs, bootpath, uri);
>      } else if (strcmp(arch, "ppc64") == 0) {
> +        const char *accel;
>          init_bootfile_ppc(bootpath);
> -        cmd_src = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
> +        accel = system("/sbin/lsmod | grep -q kvm_hv") ? "tcg" : "kvm:tcg";

Unsafe use of system() (all I have to do is stick a counterfeit 'grep'
earlier on my PATH to mess you up). Is there a safer way to grab that
information without having to call out to the shell?

> That way, accel=kvm:tcg is only used if the kvm_hv module is loaded,
> otherwise it will use accel=tcg instead.
Greg Kurz Nov. 15, 2016, 8:26 p.m. UTC | #2
On Tue, 15 Nov 2016 19:48:20 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 15.11.2016 19:13, Greg Kurz wrote:
> > On Tue, 15 Nov 2016 17:43:39 +0000
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> >> * Thomas Huth (thuth@redhat.com) wrote:  
> >>> On 15.11.2016 16:18, Stefan Hajnoczi wrote:    
> >>>> On Tue, Nov 15, 2016 at 2:56 PM, Greg Kurz <groug@kaod.org> wrote:    
> >>>>> On Tue, 15 Nov 2016 11:14:57 +0000
> >>>>> Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >>>>>    
> >>>>>> On Tue, Nov 15, 2016 at 10:09 AM, Greg Kurz <groug@kaod.org> wrote:    
> >>>>>>> On Tue, 15 Nov 2016 10:53:35 +0100
> >>>>>>> Laurent Vivier <lvivier@redhat.com> wrote:
> >>>>>>>    
> >>>>>>>> On 14/11/2016 21:52, Stefan Hajnoczi wrote:    
> >>>>>>>>> I hit a failure running "make check" on ppc64 for the first time.  Ideas?
> >>>>>>>>>
> >>>>>>>>> Stefan
> >>>>>>>>>
> >>>>>>>>> commit 682df581c65ed2c1b9e77093e332214ecaa1ee93
> >>>>>>>>>
> >>>>>>>>>   GTESTER check-qtest-ppc64
> >>>>>>>>> Memory content inconsistency at 5af4000 first_byte = 1b last_byte = 1a
> >>>>>>>>> current = 7c hit_edge = 1
> >>>>>>>>> Memory content inconsistency at 5af5000 first_byte = 1b last_byte = 7c
> >>>>>>>>> current = 1b hit_edge = 1
> >>>>>>>>> Memory content inconsistency at 5e59000 first_byte = 1b last_byte = 1b
> >>>>>>>>> current = 1a hit_edge = 1
> >>>>>>>>> **
> >>>>>>>>> ERROR:tests/postcopy-test.c:345:check_guests_ram: 'bad' should be FALSE
> >>>>>>>>> GTester: last random seed: R02S9d79166a1ca7e21940a0f4b0b1255d5b
> >>>>>>>>>    
> >>>>>>>>
> >>>>>>>> Are you using KVM PR?
> >>>>>>>>
> >>>>>>>> it  was working fine with TCG and KVM HV.
> >>>>>>>>
> >>>>>>>> Apparently, USERFAULTFD doesn't work with KVM PR.
> >>>>>>>>
> >>>>>>>> I've already seen this kind of error with nested KVM on Power:
> >>>>>>>> guest in guest with KVM PR in host.
> >>>>>>>>
> >>>>>>>> This problem was reported on IRC by Greg if I remember correctly (CC:)
> >>>>>>>>    
> >>>>>>>
> >>>>>>> Yeah I hit this when running make check in a PPC64 BE guest which
> >>>>>>> has kvm_pr loaded. I did not find time to investigate though... I've
> >>>>>>> switched to run make check on bare metal POWER7 instead.    
> >>>>>>
> >>>>>> Right, it's POWER7 PPC64 BE with kvm_pr.
> >>>>>>    
> >>>>>
> >>>>> And you cannot install a PPC64 BE distro on this POWER7 host and use
> >>>>> kvm_hv instead ?    
> >>>>
> >>>> Sure, I could get a non-kvm_pr environment.  I'm just concerned that
> >>>> QEMU 2.8-rc0 is being tagged today and there is a POWER environment
> >>>> where "make check" fails.
> >>>>
> >>>> Is kvm_pr supported by QEMU?    
> >>>
> >>> Basically yes, it's just like Greg said in another mail, it does not get
> >>> much attention these days.
> >>>     
> >>>> If it is supported then "make check" ought to pass.    
> >>>
> >>> OK, since nobody got a really, really good idea how to fix this so far:
> >>> What about changing the QEMU test to use only tcg for now, so we've got
> >>> a clean release with QEMU v2.8? And then afterwards, we can either fix
> >>> kvm-pr in the kernel, or introduce some other mechanism to select
> >>> whether the test should be run with KVM or not (either a detection of
> >>> the KVM type, or an environment variable or something similar).    
> >>
> >> I'm OK if you want to do that for Power; I'd prefer it kept preferring KVM on
> >> x86.  
> > 
> > Even for Power, I'd prefer to keep KVM since the problem only happens with
> > KVM PR which isn't the preferred way to do KVM on bare metal... until this
> > get fixed, I'd rather suggest people to run make check with KVM HV.  
> 
> OK ... what do you think about a patch like this:
> 
> diff --git a/tests/postcopy-test.c b/tests/postcopy-test.c
> --- a/tests/postcopy-test.c
> +++ b/tests/postcopy-test.c
> @@ -380,17 +380,19 @@ static void test_migrate(void)
>                                    " -incoming %s",
>                                    tmpfs, bootpath, uri);
>      } else if (strcmp(arch, "ppc64") == 0) {
> +        const char *accel;
>          init_bootfile_ppc(bootpath);
> -        cmd_src = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
> +        accel = system("/sbin/lsmod | grep -q kvm_hv") ? "tcg" : "kvm:tcg";
> +        cmd_src = g_strdup_printf("-machine accel=%s -m 256M"
>                                    " -name pcsource,debug-threads=on"
>                                    " -serial file:%s/src_serial"
>                                    " -drive file=%s,if=pflash,format=raw",
> -                                  tmpfs, bootpath);
> -        cmd_dst = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
> +                                  accel, tmpfs, bootpath);
> +        cmd_dst = g_strdup_printf("-machine accel=%s -m 256M"
>                                    " -name pcdest,debug-threads=on"
>                                    " -serial file:%s/dest_serial"
>                                    " -incoming %s",
> -                                  tmpfs, uri);
> +                                  accel, tmpfs, uri);
>      } else {
>          g_assert_not_reached();
>      }
> 
> That way, accel=kvm:tcg is only used if the kvm_hv module is loaded,
> otherwise it will use accel=tcg instead.
> 

I guess it's acceptable to assume that the test will more likely
pass when using KVM HV... but I'm not very comfortable as QEMU is
still supposed to work with KVM PR and we don't know yet what's
happening.

Cheers.

--
Greg

>  Thomas
>
Laurent Vivier Nov. 15, 2016, 8:39 p.m. UTC | #3
On 15/11/2016 20:00, Eric Blake wrote:
> On 11/15/2016 12:48 PM, Thomas Huth wrote:
> 
>>> Even for Power, I'd prefer to keep KVM since the problem only happens with
>>> KVM PR which isn't the preferred way to do KVM on bare metal... until this
>>> get fixed, I'd rather suggest people to run make check with KVM HV.
>>
>> OK ... what do you think about a patch like this:
>>
>> diff --git a/tests/postcopy-test.c b/tests/postcopy-test.c
>> --- a/tests/postcopy-test.c
>> +++ b/tests/postcopy-test.c
>> @@ -380,17 +380,19 @@ static void test_migrate(void)
>>                                    " -incoming %s",
>>                                    tmpfs, bootpath, uri);
>>      } else if (strcmp(arch, "ppc64") == 0) {
>> +        const char *accel;
>>          init_bootfile_ppc(bootpath);
>> -        cmd_src = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
>> +        accel = system("/sbin/lsmod | grep -q kvm_hv") ? "tcg" : "kvm:tcg";
> 
> Unsafe use of system() (all I have to do is stick a counterfeit 'grep'
> earlier on my PATH to mess you up). Is there a safer way to grab that
> information without having to call out to the shell?

I think trying to open "/dev/kvm" would be enough to know if kvm is
available or not.

Laurent
Laurent Vivier Nov. 15, 2016, 8:44 p.m. UTC | #4
On 15/11/2016 21:39, Laurent Vivier wrote:
> 
> 
> On 15/11/2016 20:00, Eric Blake wrote:
>> On 11/15/2016 12:48 PM, Thomas Huth wrote:
>>
>>>> Even for Power, I'd prefer to keep KVM since the problem only happens with
>>>> KVM PR which isn't the preferred way to do KVM on bare metal... until this
>>>> get fixed, I'd rather suggest people to run make check with KVM HV.
>>>
>>> OK ... what do you think about a patch like this:
>>>
>>> diff --git a/tests/postcopy-test.c b/tests/postcopy-test.c
>>> --- a/tests/postcopy-test.c
>>> +++ b/tests/postcopy-test.c
>>> @@ -380,17 +380,19 @@ static void test_migrate(void)
>>>                                    " -incoming %s",
>>>                                    tmpfs, bootpath, uri);
>>>      } else if (strcmp(arch, "ppc64") == 0) {
>>> +        const char *accel;
>>>          init_bootfile_ppc(bootpath);
>>> -        cmd_src = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
>>> +        accel = system("/sbin/lsmod | grep -q kvm_hv") ? "tcg" : "kvm:tcg";
>>
>> Unsafe use of system() (all I have to do is stick a counterfeit 'grep'
>> earlier on my PATH to mess you up). Is there a safer way to grab that
>> information without having to call out to the shell?
> 
> I think trying to open "/dev/kvm" would be enough to know if kvm is
> available or not.

OK, I've missed you want to check KVM HV only...

Laurent
Laurent Vivier Nov. 15, 2016, 8:45 p.m. UTC | #5
On 15/11/2016 21:44, Laurent Vivier wrote:
> 
> 
> On 15/11/2016 21:39, Laurent Vivier wrote:
>>
>>
>> On 15/11/2016 20:00, Eric Blake wrote:
>>> On 11/15/2016 12:48 PM, Thomas Huth wrote:
>>>
>>>>> Even for Power, I'd prefer to keep KVM since the problem only happens with
>>>>> KVM PR which isn't the preferred way to do KVM on bare metal... until this
>>>>> get fixed, I'd rather suggest people to run make check with KVM HV.
>>>>
>>>> OK ... what do you think about a patch like this:
>>>>
>>>> diff --git a/tests/postcopy-test.c b/tests/postcopy-test.c
>>>> --- a/tests/postcopy-test.c
>>>> +++ b/tests/postcopy-test.c
>>>> @@ -380,17 +380,19 @@ static void test_migrate(void)
>>>>                                    " -incoming %s",
>>>>                                    tmpfs, bootpath, uri);
>>>>      } else if (strcmp(arch, "ppc64") == 0) {
>>>> +        const char *accel;
>>>>          init_bootfile_ppc(bootpath);
>>>> -        cmd_src = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
>>>> +        accel = system("/sbin/lsmod | grep -q kvm_hv") ? "tcg" : "kvm:tcg";
>>>
>>> Unsafe use of system() (all I have to do is stick a counterfeit 'grep'
>>> earlier on my PATH to mess you up). Is there a safer way to grab that
>>> information without having to call out to the shell?
>>
>> I think trying to open "/dev/kvm" would be enough to know if kvm is
>> available or not.
> 
> OK, I've missed you want to check KVM HV only...

So, I think you can check for "/sys/module/kvm_hv"

Laurent
diff mbox

Patch

diff --git a/tests/postcopy-test.c b/tests/postcopy-test.c
--- a/tests/postcopy-test.c
+++ b/tests/postcopy-test.c
@@ -380,17 +380,19 @@  static void test_migrate(void)
                                   " -incoming %s",
                                   tmpfs, bootpath, uri);
     } else if (strcmp(arch, "ppc64") == 0) {
+        const char *accel;
         init_bootfile_ppc(bootpath);
-        cmd_src = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
+        accel = system("/sbin/lsmod | grep -q kvm_hv") ? "tcg" : "kvm:tcg";
+        cmd_src = g_strdup_printf("-machine accel=%s -m 256M"
                                   " -name pcsource,debug-threads=on"
                                   " -serial file:%s/src_serial"
                                   " -drive file=%s,if=pflash,format=raw",
-                                  tmpfs, bootpath);
-        cmd_dst = g_strdup_printf("-machine accel=kvm:tcg -m 256M"
+                                  accel, tmpfs, bootpath);
+        cmd_dst = g_strdup_printf("-machine accel=%s -m 256M"
                                   " -name pcdest,debug-threads=on"
                                   " -serial file:%s/dest_serial"
                                   " -incoming %s",
-                                  tmpfs, uri);
+                                  accel, tmpfs, uri);
     } else {
         g_assert_not_reached();
     }