[OSSTEST,2/3] ap-common: Switch to Linux 4.9 by default

Message ID 1495467922-30085-2-git-send-email-ian.jackson@eu.citrix.com (mailing list archive)
State New, archived

Commit Message

Ian Jackson May 22, 2017, 3:45 p.m. UTC
I ran a special report[1] to see what to expect and:

   Tests which did not succeed and are blocking,
   including tests which could not be run:
    test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail REGR.
    test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR.

These Windows 7 migration tests have been failing on many branches and
don't look like they are something to do with the version of Linux
used in dom0.

Accordingly I intend to push this change to switch osstest to using
Linux 4.9 by default.  ARM tests are not affected at this time.

[1] ./sg-report-flight --that-linux=b65f2f457c49b2cfd7967c34b7a0b04c25587f13 --this-linux=f5eea276d8de10a32e68721707ae8f2fdfaa0960 --branches-also=linux-3.14,linux-arm-xen 109662 |less

CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Juergen Gross <jgross@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ap-common | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Roger Pau Monne May 23, 2017, 1:03 p.m. UTC | #1
On Mon, May 22, 2017 at 04:45:21PM +0100, Ian Jackson wrote:
> I ran a special report[1] to see what to expect and:
> 
>    Tests which did not succeed and are blocking,
>    including tests which could not be run:
>     test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail REGR.
>     test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR.
> 
> These Windows 7 migration tests have been failing on many branches and
> don't look like they are something to do with the version of Linux
> used in dom0.
> 
> Accordingly I intend to push this change to switch osstest to using
> Linux 4.9 by default.  ARM tests are not affected at this time.
> 
> [1] ./sg-report-flight --that-linux=b65f2f457c49b2cfd7967c34b7a0b04c25587f13 --this-linux=f5eea276d8de10a32e68721707ae8f2fdfaa0960 --branches-also=linux-3.14,linux-arm-xen 109662 |less
> 
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Juergen Gross <jgross@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Yes please:

Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Ian Jackson May 30, 2017, 2:28 p.m. UTC | #2
Ian Jackson writes ("[OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default"):
> I ran a special report[1] to see what to expect and:
...

It seems I ran or read this wrong.  In fact, my change is stuck in the
osstest self push gate because:

osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
> flight 109837 osstest real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/109837/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail REGR. vs. 109601
>  test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601

The L1 console log is here:

 http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-osstest-serial-l1.guest.osstest.log

this seems to show it hanging during the boot.  osstest times it out
because it doesn't come onto the network.  However,

 http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-console-guest-l1.guest.osstest.log

shows a login prompt on the L1's PV console.

Could someone investigate please ?

Thanks,
Ian.
Boris Ostrovsky May 30, 2017, 2:57 p.m. UTC | #3
On 05/30/2017 10:28 AM, Ian Jackson wrote:
> Ian Jackson writes ("[OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default"):
>> I ran a special report[1] to see what to expect and:
> ...
>
> It seems I ran or read this wrong.  In fact, my change is stuck in the
> osstest self push gate because:
>
> osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
>> flight 109837 osstest real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/109837/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail REGR. vs. 109601
>>  test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601
> The L1 console log is here:
>
>  http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-osstest-serial-l1.guest.osstest.log
>
> this seems to show it hanging during the boot.  osstest times it out
> because it doesn't come onto the network.  However,
>
>  http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-console-guest-l1.guest.osstest.log
>
> shows a login prompt on the L1's PV console.
>
> Could someone investigate please ?


The test is using the 4.9.21 kernel, and it looks like the patch that
fixed the earlier regression only shows up in the 4.9 tree at 4.9.28
(commit 5d7ab8339a9a9e745c672279437657654268be81).

-boris
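
The version reasoning above (the tested 4.9.21 predates the 4.9.28 release said to carry the fix) can be sketched as a small shell check. This is only an illustration: `version_ge` is a hypothetical helper, not part of osstest, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helper (not part of osstest): succeeds if version $1 >= version $2.
# Assumes a sort(1) with -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

tested=4.9.21   # kernel version used by the failing flight, per the message above
fixed=4.9.28    # first 4.9.y release reported to contain the fix

if version_ge "$tested" "$fixed"; then
    echo "fix present"
else
    echo "fix missing"
fi
```

A version comparison is only a heuristic; checking whether the fix commit is an ancestor of the tested revision in the actual tree would be more reliable.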
Ian Jackson May 30, 2017, 3:47 p.m. UTC | #4
Boris Ostrovsky writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> On 05/30/2017 10:28 AM, Ian Jackson wrote:
> > osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
> >> Tests which did not succeed and are blocking,
> >> including tests which could not be run:
> >>  test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail REGR. vs. 109601
> >>  test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601
> > The L1 console log is here:
...
> The test is using 4.9.21 kernel and it looks like the patch that fixed
> earlier regression shows up in 4.9 tree at 4.9.28 (commit
> 5d7ab8339a9a9e745c672279437657654268be81).

Thanks for investigating.  osstest tries to track 4.9.y.

However, it is blocked because of persistent failures like this one:

osstest service owner writes ("[linux-4.9 test] 109836: regressions - trouble: broken/fail/pass"):
> flight 109836 linux-4.9 real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/109836/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-armhf-armhf-xl-credit2   6 xen-boot          fail REGR. vs. 107358

Does anyone have any idea why this test should fail consistently ?
The corresponding test with the default credit1 scheduler fails too.
But it works on the other branches (which are using linux 3.18):
  http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl/ALL
  http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-credit2/ALL

These are, admittedly, on the unreliable arndales, but:
  http://logs.test-lab.xenproject.org/osstest/results/host/arndale-bluewater.html

So I think this is a problem with 4.9.

Ian.
Julien Grall May 30, 2017, 3:53 p.m. UTC | #5
Hi Ian,

On 30/05/17 16:47, Ian Jackson wrote:
> Boris Ostrovsky writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
>> On 05/30/2017 10:28 AM, Ian Jackson wrote:
>>> osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>>  test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail REGR. vs. 109601
>>>>  test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601
>>> The L1 console log is here:
> ...
>> The test is using 4.9.21 kernel and it looks like the patch that fixed
>> earlier regression shows up in 4.9 tree at 4.9.28 (commit
>> 5d7ab8339a9a9e745c672279437657654268be81).
>
> Thanks for investigating.  osstest tries to track 4.9.y.
>
> However, it is blocked because of persistent failures like this one:
>
> osstest service owner writes ("[linux-4.9 test] 109836: regressions - trouble: broken/fail/pass"):
>> flight 109836 linux-4.9 real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/109836/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-armhf-armhf-xl-credit2   6 xen-boot          fail REGR. vs. 107358
>
> Does anyone have any idea why this test should fail consistently ?
> The corresponding test with the default credit1 scheduler fails too.
> But it works on the other branches (which are using linux 3.18):
>   http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl/ALL
>   http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-credit2/ALL
>
> These are, admittedly, on the unreliable arndales, but:
>   http://logs.test-lab.xenproject.org/osstest/results/host/arndale-bluewater.html
>
> So I think this is a problem with 4.9.

There are two missing patches in Linux 4.9 to be able to boot on the 
Arndale:
	- c827283586a4 "ARM: 8636/1: Cleanup sanity_check_meminfo"
	- 92ed32019d0d "ARM: 8637/1: Adjust memory boundaries after reservation"

They are part of the linux-arm-xen branch (also based on Xen 4.9) but I 
haven't yet requested to backport them in staging.

Cheers,
Ian Jackson May 30, 2017, 4:13 p.m. UTC | #6
Julien Grall writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> There are two missing patches in Linux 4.9 to be able to boot on the 
> Arndale:
> 	- c827283586a4 "ARM: 8636/1: Cleanup sanity_check_meminfo"
> 	- 92ed32019d0d "ARM: 8637/1: Adjust memory boundaries after reservation"
> 
> They are part of the linux-arm-xen branch (also based on Xen 4.9) but I 
> haven't yet requested to backport them in staging.

I see.  Can you do that please ?  It's blocking moving our testing to
a non-ancient kernel.

Thanks,
Ian.
Julien Grall May 30, 2017, 4:15 p.m. UTC | #7
Hi Ian,

On 30/05/17 17:13, Ian Jackson wrote:
> Julien Grall writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
>> There are two missing patches in Linux 4.9 to be able to boot on the
>> Arndale:
>> 	- c827283586a4 "ARM: 8636/1: Cleanup sanity_check_meminfo"
>> 	- 92ed32019d0d "ARM: 8637/1: Adjust memory boundaries after reservation"
>>
>> They are part of the linux-arm-xen branch (also based on Xen 4.9) but I
>> haven't yet requested to backport them in staging.
>
> I see.  Can you do that please ?  It's blocking moving our testing to
> a non-ancient kernel.

I will do that. However, we don't use Linux 4.9 branch for arm64/arm32 
testing. So why are we blocking on those boards?

Cheers,
Ian Jackson May 30, 2017, 4:22 p.m. UTC | #8
Julien Grall writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> On 30/05/17 17:13, Ian Jackson wrote:
...
> > I see.  Can you do that please ?  It's blocking moving our testing to
> > a non-ancient kernel.
> 
> I will do that. However, we don't use Linux 4.9 branch for arm64/arm32 
> testing. So why are we blocking on those boards?

It works like this:

The CI in general has a notion of the default Linux.  That is
currently Linux 3.18.  But, there is a special case, and for ARM it is
the special linux-arm-xen branch.

I am trying to update the non-ARM default Linux from 3.18 to 4.9.  For
that to be true, there must be no regressions between 3.18 and 4.9.

Specifically, because changing to 4.9 as the non-ARM default Linux
version would mean using _the version of Linux 4.9 that has itself
passed osstest's tests_, there must be no x86 regressions between
Linux 3.18 as currently used for the x86 tests, and the
osstest-approved 4.9.

Currently the osstest-approved linux-4.9 contains some x86 nested virt
regressions compared to the osstest-approved linux-3.18.  These are
not regressions _within osstest's view of linux-4.9_ because they
never worked there.

They would be fixed if osstest were to update its version of linux-4.9
to a more recent one, which has bugfixes for the x86 nested virt bugs.

But osstest does not want to update its linux-4.9 branch to include
those x86 nested fixes, because doing so would introduce an ARM
regression _within the 4.9 branch_.

This is relevant because, obviously, when osstest is testing
linux-4.9, it uses it for all architectures.

I hope we can get this fixed soon.  We cannot update to Linux 4.9
until we have a version of 4.9 which doesn't have any regressions.  It
seems people keep breaking it.  If fixing any particular regression
takes much longer than the interval between new regressions being
introduced, we will never succeed.

Ian.
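
The deadlock described in the message above can be modelled with a toy push gate. This is a sketch under stated assumptions, not osstest's actual implementation: the `regressions` function and the pass/fail tables are hypothetical simplifications, and only the job names are taken from the flight reports quoted in this thread. A candidate is blocked if any job that passed in the baseline fails in the candidate.

```shell
# Hypothetical model of a push gate (not osstest code): print every job that
# passed in the baseline but fails in the candidate.
# Both arguments are newline-separated "job status" lines.
regressions() {
    baseline=$1 candidate=$2
    printf '%s\n' "$candidate" | while read -r job status; do
        [ "$status" = fail ] || continue
        # A failure only counts as a regression if the baseline passed.
        if printf '%s\n' "$baseline" | grep -qxF -- "$job pass"; then
            echo "$job"
        fi
    done
}

# Simplified, illustrative statuses based on the thread's description.
approved_318='test-amd64-amd64-qemuu-nested-intel pass'
approved_49='test-amd64-amd64-qemuu-nested-intel fail
test-armhf-armhf-xl-credit2 pass'
newer_49='test-amd64-amd64-qemuu-nested-intel pass
test-armhf-armhf-xl-credit2 fail'

echo "Switching default 3.18 -> approved 4.9 is blocked by:"
regressions "$approved_318" "$approved_49"

echo "Updating approved 4.9 -> newer 4.9.y is blocked by:"
regressions "$approved_49" "$newer_49"
```

In this model the first gate reports the nested-virt job and the second reports the ARM job, which is the circular blockage: the x86 fix cannot reach the approved 4.9 branch until the ARM boot failure is fixed as well.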
Ian Jackson June 6, 2017, 11:16 a.m. UTC | #9
Ian Jackson writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> But osstest does not want to update its linux-4.9 branch to include
> those x86 nested fixes, because doing so would introduce an ARM
> regression _within the 4.9 branch_.
...
> I hope we can get this fixed soon.  We cannot update to Linux 4.9
> until we have a version of 4.9 which doesn't have any regressions.  It
> seems people keep breaking it.  If the time to fix any particular
> regression too much exceeds the time between new regressions being
> introduced, we will never succeed.

Julien, what is the state of the arndale fixes for 4.9 ?

Ian.
Julien Grall June 6, 2017, 3:56 p.m. UTC | #10
Hi Ian,

On 06/06/17 12:16, Ian Jackson wrote:
> Ian Jackson writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
>> But osstest does not want to update its linux-4.9 branch to include
>> those x86 nested fixes, because doing so would introduce an ARM
>> regression _within the 4.9 branch_.
> ...
>> I hope we can get this fixed soon.  We cannot update to Linux 4.9
>> until we have a version of 4.9 which doesn't have any regressions.  It
>> seems people keep breaking it.  If the time to fix any particular
>> regression too much exceeds the time between new regressions being
>> introduced, we will never succeed.
>
> Julien, what is the state of the arndale fixes for 4.9 ?

Just sent it (you are CCed). Sorry for the delay.

Cheers,

Patch

diff --git a/ap-common b/ap-common
index cbb815c..bc7c03c 100644
--- a/ap-common
+++ b/ap-common
@@ -60,7 +60,7 @@ 
 
 : ${PUSH_TREE_LINUX:=$XENBITS:/home/xen/git/linux-pvops.git}
 : ${BASE_TREE_LINUX:=git://xenbits.xen.org/linux-pvops.git}
-: ${BASE_TAG_LINUX:=tested/linux-3.14}
+: ${BASE_TAG_LINUX:=tested/linux-4.9}
 : ${BASE_TAG_LINUX_ARM:=tested/linux-arm-xen}
 
 if [ "x${TREE_LINUX}" = x ]; then
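
For readers unfamiliar with the `: ${VAR:=default}` idiom used in the hunk above, here is a minimal sketch of how it behaves. The override value `tested/linux-3.18` is purely illustrative and does not come from `ap-common`. The `:` builtin does nothing itself; the `:=` expansion assigns the default only when the variable is unset or empty, so a value set by the caller takes precedence.

```shell
# Case 1: variable unset, so the default from the expansion is assigned.
unset BASE_TAG_LINUX
: ${BASE_TAG_LINUX:=tested/linux-4.9}
echo "$BASE_TAG_LINUX"    # prints tested/linux-4.9

# Case 2: variable already set (illustrative value), so it is left alone.
BASE_TAG_LINUX=tested/linux-3.18
: ${BASE_TAG_LINUX:=tested/linux-4.9}
echo "$BASE_TAG_LINUX"    # prints tested/linux-3.18
```

This is why the patch only changes the default: an environment that sets `BASE_TAG_LINUX` explicitly would presumably keep its own tag.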