Message ID | 1495467922-30085-2-git-send-email-ian.jackson@eu.citrix.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, May 22, 2017 at 04:45:21PM +0100, Ian Jackson wrote:
> I ran a special report[1] to see what to expect and:
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail REGR.
> test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR.
>
> These Windows 7 migration tests have been failing on many branches and
> don't look like they are something to do with the version of Linux
> used in dom0.
>
> Accordingly I intend to push this change to switch osstest to using
> Linux 4.9 by default. ARM tests are not affected at this time.
>
> [1] ./sg-report-flight --that-linux=b65f2f457c49b2cfd7967c34b7a0b04c25587f13 --this-linux=f5eea276d8de10a32e68721707ae8f2fdfaa0960 --branches-also=linux-3.14,linux-arm-xen 109662 |less
>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Juergen Gross <jgross@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Yes please:

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Ian Jackson writes ("[OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default"):
> I ran a special report[1] to see what to expect and:
...

It seems I ran or read this wrong. In fact, my change is stuck in the
osstest self push gate because:

osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
> flight 109837 osstest real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/109837/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1 fail REGR. vs. 109601
> test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601

The L1 console log is here:

http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-osstest-serial-l1.guest.osstest.log

this seems to show it hanging during the boot. osstest times it out
because it doesn't come onto the network. However,

http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-console-guest-l1.guest.osstest.log

shows a login prompt on the L1's PV console.

Could someone investigate please ?

Thanks,
Ian.

On 05/30/2017 10:28 AM, Ian Jackson wrote:
> Ian Jackson writes ("[OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default"):
>> I ran a special report[1] to see what to expect and:
> ...
>
> It seems I ran or read this wrong. In fact, my change is stuck in the
> osstest self push gate because:
>
> osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
>> flight 109837 osstest real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/109837/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>> test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1 fail REGR. vs. 109601
>> test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601
> The L1 console log is here:
>
> http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-osstest-serial-l1.guest.osstest.log
>
> this seems to show it hanging during the boot. osstest times it out
> because it doesn't come onto the network. However,
>
> http://logs.test-lab.xenproject.org/osstest/logs/109837/test-amd64-amd64-qemuu-nested-intel/chardonnay1---var-log-xen-console-guest-l1.guest.osstest.log
>
> shows a login prompt on the L1's PV console.
>
> Could someone investigate please ?

The test is using the 4.9.21 kernel, and it looks like the patch that
fixed the earlier regression shows up in the 4.9 tree at 4.9.28 (commit
5d7ab8339a9a9e745c672279437657654268be81).

-boris

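Boris's observation is a version comparison: the flight ran 4.9.21, and the fix is reported to arrive in 4.9.28. A minimal sketch of that check follows, assuming a linear 4.9.y stable series where a higher point release contains everything in a lower one. The helper `version_at_least` is hypothetical and not part of osstest; it relies on `sort -V`, a GNU coreutils extension.

```shell
#!/bin/sh
# version_at_least A B: succeeds if version A >= version B.
# Hypothetical helper; "sort -V" (version sort) is a GNU extension.
version_at_least () {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

tested=4.9.21     # kernel the failing flight used, per the thread
fixed_in=4.9.28   # 4.9.y release reported to carry the fix

if version_at_least "$tested" "$fixed_in"; then
    status="has fix"
else
    status="missing fix"
fi
echo "$tested: $status (fix reported in $fixed_in)"
```

With a kernel checkout available, `git merge-base --is-ancestor <commit> <ref>` would answer the same question directly from history rather than from version numbers.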
Boris Ostrovsky writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> On 05/30/2017 10:28 AM, Ian Jackson wrote:
> > osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
> >> Tests which did not succeed and are blocking,
> >> including tests which could not be run:
> >> test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1 fail REGR. vs. 109601
> >> test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601
> > The L1 console log is here:
...
> The test is using the 4.9.21 kernel, and it looks like the patch that
> fixed the earlier regression shows up in the 4.9 tree at 4.9.28 (commit
> 5d7ab8339a9a9e745c672279437657654268be81).

Thanks for investigating. osstest tries to track 4.9.y.

However, it is blocked because of persistent failures like this one:

osstest service owner writes ("[linux-4.9 test] 109836: regressions - trouble: broken/fail/pass"):
> flight 109836 linux-4.9 real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/109836/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-armhf-armhf-xl-credit2 6 xen-boot fail REGR. vs. 107358

Does anyone have any idea why this test should fail consistently ?
The corresponding test with the default credit1 scheduler fails too.
But it works on the other branches (which are using linux 3.18):
http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl/ALL
http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-credit2/ALL

These are, admittedly, on the unreliable arndales, but:
http://logs.test-lab.xenproject.org/osstest/results/host/arndale-bluewater.html

So I think this is a problem with 4.9.

Ian.

Hi Ian,

On 30/05/17 16:47, Ian Jackson wrote:
> Boris Ostrovsky writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
>> On 05/30/2017 10:28 AM, Ian Jackson wrote:
>>> osstest service owner writes ("[osstest test] 109837: regressions - FAIL"):
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>> test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1 fail REGR. vs. 109601
>>>> test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail REGR. vs. 109601
>>> The L1 console log is here:
> ...
>> The test is using the 4.9.21 kernel, and it looks like the patch that
>> fixed the earlier regression shows up in the 4.9 tree at 4.9.28 (commit
>> 5d7ab8339a9a9e745c672279437657654268be81).
>
> Thanks for investigating. osstest tries to track 4.9.y.
>
> However, it is blocked because of persistent failures like this one:
>
> osstest service owner writes ("[linux-4.9 test] 109836: regressions - trouble: broken/fail/pass"):
>> flight 109836 linux-4.9 real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/109836/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>> test-armhf-armhf-xl-credit2 6 xen-boot fail REGR. vs. 107358
>
> Does anyone have any idea why this test should fail consistently ?
> The corresponding test with the default credit1 scheduler fails too.
> But it works on the other branches (which are using linux 3.18):
> http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl/ALL
> http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-credit2/ALL
>
> These are, admittedly, on the unreliable arndales, but:
> http://logs.test-lab.xenproject.org/osstest/results/host/arndale-bluewater.html
>
> So I think this is a problem with 4.9.

There are two missing patches in Linux 4.9 to be able to boot on the
Arndale:
    - c827283586a4 "ARM: 8636/1: Cleanup sanity_check_meminfo"
    - 92ed32019d0d "ARM: 8637/1: Adjust memory boundaries after reservation"

They are part of the linux-arm-xen branch (also based on Xen 4.9) but I
haven't yet requested to backport them in staging.

Cheers,

Julien Grall writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> There are two missing patches in Linux 4.9 to be able to boot on the
> Arndale:
>     - c827283586a4 "ARM: 8636/1: Cleanup sanity_check_meminfo"
>     - 92ed32019d0d "ARM: 8637/1: Adjust memory boundaries after reservation"
>
> They are part of the linux-arm-xen branch (also based on Xen 4.9) but I
> haven't yet requested to backport them in staging.

I see. Can you do that please ? It's blocking moving our testing to
a non-ancient kernel.

Thanks,
Ian.

Hi Ian,

On 30/05/17 17:13, Ian Jackson wrote:
> Julien Grall writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
>> There are two missing patches in Linux 4.9 to be able to boot on the
>> Arndale:
>>     - c827283586a4 "ARM: 8636/1: Cleanup sanity_check_meminfo"
>>     - 92ed32019d0d "ARM: 8637/1: Adjust memory boundaries after reservation"
>>
>> They are part of the linux-arm-xen branch (also based on Xen 4.9) but I
>> haven't yet requested to backport them in staging.
>
> I see. Can you do that please ? It's blocking moving our testing to
> a non-ancient kernel.

I will do that. However, we don't use Linux 4.9 branch for arm64/arm32
testing. So why are we blocking on those boards?

Cheers,

Julien Grall writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> On 30/05/17 17:13, Ian Jackson wrote:
...
> > I see. Can you do that please ? It's blocking moving our testing to
> > a non-ancient kernel.
>
> I will do that. However, we don't use Linux 4.9 branch for arm64/arm32
> testing. So why are we blocking on those boards?

It works like this:

The CI in general has a notion of the default Linux. That is
currently Linux 3.18. But, there is a special case, and for ARM it is
the special linux-arm-xen branch.

I am trying to update the non-ARM default Linux from 3.18 to 4.9.

For that to be true, there must be no regressions between 3.18 and
4.9. Specifically, because changing to 4.9 as the non-ARM default
Linux version would mean using _the version of Linux 4.9 that has
itself passed osstest's tests_, there must be no x86 regressions
between Linux 3.18 as currently used for the x86 tests, and the
osstest-approved 4.9.

Currently the osstest-approved linux-4.9 contains some x86 nested virt
regressions compared to the osstest-approved linux-3.18. These are
not regressions _within osstest's view of linux-4.9_ because they
never worked there. They would be fixed if osstest were to update its
version of linux-4.9 to a more recent one, which has bugfixes for the
x86 nested virt bugs.

But osstest does not want to update its linux-4.9 branch to include
those x86 nested fixes, because doing so would introduce an ARM
regression _within the 4.9 branch_.

This is relevant because, obviously, when osstest is testing
linux-4.9, it uses it for all architectures.

I hope we can get this fixed soon. We cannot update to Linux 4.9
until we have a version of 4.9 which doesn't have any regressions. It
seems people keep breaking it. If the time to fix any particular
regression too much exceeds the time between new regressions being
introduced, we will never succeed.

Ian.

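The gating rule described above can be illustrated as a comparison of two result sets: a test counts as a regression only if it passed in the baseline and fails in the candidate. The sketch below is a simplified model under that assumption and is not osstest's actual implementation. `find_regressions` is a hypothetical helper, and the pass/fail data is a reduced rendering of the situation in this thread, using two of the job names reported above.

```shell
#!/bin/sh
# find_regressions BASELINE CANDIDATE
# Each argument holds newline-separated "testname status" records.
# Prints tests that pass in BASELINE and fail in CANDIDATE.
# Hypothetical helper, not part of osstest.
find_regressions () {
    printf '%s\n---\n%s\n' "$1" "$2" | awk '
        $0 == "---" { second = 1; next }    # separator between the two sets
        !second     { base[$1] = $2; next } # remember baseline status
        $2 == "fail" && base[$1] == "pass" { print $1 }
    '
}

# Illustrative statuses only, simplified from the thread.
linux_318='test-amd64-amd64-qemuu-nested-intel pass'
linux_49_tested='test-amd64-amd64-qemuu-nested-intel fail
test-armhf-armhf-xl-credit2 pass'
linux_49_newer='test-amd64-amd64-qemuu-nested-intel pass
test-armhf-armhf-xl-credit2 fail'

# Switching the default from 3.18 to the tested 4.9: nested virt regresses.
cross_branch=$(find_regressions "$linux_318" "$linux_49_tested")
# Advancing the tested 4.9 to a newer 4.9.y: the ARM boot regresses.
within_branch=$(find_regressions "$linux_49_tested" "$linux_49_newer")

echo "3.18 -> tested 4.9:       $cross_branch"
echo "tested 4.9 -> newer 4.9:  $within_branch"
```

In this model each gate blocks on a different test, which is the deadlock the message describes: the nested virt fix cannot reach the tested 4.9 branch until the ARM boot failure is fixed as well.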
Ian Jackson writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
> But osstest does not want to update its linux-4.9 branch to include
> those x86 nested fixes, because doing so would introduce an ARM
> regression _within the 4.9 branch_.
...
> I hope we can get this fixed soon. We cannot update to Linux 4.9
> until we have a version of 4.9 which doesn't have any regressions. It
> seems people keep breaking it. If the time to fix any particular
> regression too much exceeds the time between new regressions being
> introduced, we will never succeed.

Julien, what is the state of the arndale fixes for 4.9 ?

Ian.

Hi Ian,

On 06/06/17 12:16, Ian Jackson wrote:
> Ian Jackson writes ("Re: Nested virt broken in Linux 4.9 (was Re: [OSSTEST PATCH 2/3] ap-common: Switch to Linux 4.9 by default [and 1 more messages])"):
>> But osstest does not want to update its linux-4.9 branch to include
>> those x86 nested fixes, because doing so would introduce an ARM
>> regression _within the 4.9 branch_.
> ...
>> I hope we can get this fixed soon. We cannot update to Linux 4.9
>> until we have a version of 4.9 which doesn't have any regressions. It
>> seems people keep breaking it. If the time to fix any particular
>> regression too much exceeds the time between new regressions being
>> introduced, we will never succeed.
>
> Julien, what is the state of the arndale fixes for 4.9 ?

Just sent it (you are CCed). Sorry for the delay.

Cheers,

diff --git a/ap-common b/ap-common
index cbb815c..bc7c03c 100644
--- a/ap-common
+++ b/ap-common
@@ -60,7 +60,7 @@
 : ${PUSH_TREE_LINUX:=$XENBITS:/home/xen/git/linux-pvops.git}
 
 : ${BASE_TREE_LINUX:=git://xenbits.xen.org/linux-pvops.git}
-: ${BASE_TAG_LINUX:=tested/linux-3.14}
+: ${BASE_TAG_LINUX:=tested/linux-4.9}
 : ${BASE_TAG_LINUX_ARM:=tested/linux-arm-xen}
 
 if [ "x${TREE_LINUX}" = x ]; then

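The changed line uses the shell's `: ${VAR:=default}` idiom: the no-op `:` command forces the expansion, and `:=` assigns the default only when the variable is unset or empty. A minimal sketch of that behaviour follows. The function `resolve_linux_tag` is hypothetical and exists only for illustration; the tag values are the ones in the diff.

```shell
#!/bin/sh
# Sketch of the ": ${VAR:=default}" idiom used in ap-common.
# resolve_linux_tag is a hypothetical helper, not part of osstest.
resolve_linux_tag () {
    # Assign the default only if BASE_TAG_LINUX is unset or empty.
    : "${BASE_TAG_LINUX:=tested/linux-4.9}"
    printf '%s\n' "$BASE_TAG_LINUX"
}

# No override: the default applies.
unset BASE_TAG_LINUX
default_tag=$(resolve_linux_tag)

# Override from the environment: the default is ignored.
override_tag=$(BASE_TAG_LINUX=tested/linux-3.14 resolve_linux_tag)

echo "default:  $default_tag"
echo "override: $override_tag"
```

Under this idiom a value already set in the environment takes precedence over the default written in the file.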
I ran a special report[1] to see what to expect and:

Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail REGR.
test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR.

These Windows 7 migration tests have been failing on many branches and
don't look like they are something to do with the version of Linux
used in dom0.

Accordingly I intend to push this change to switch osstest to using
Linux 4.9 by default. ARM tests are not affected at this time.

[1] ./sg-report-flight --that-linux=b65f2f457c49b2cfd7967c34b7a0b04c25587f13 --this-linux=f5eea276d8de10a32e68721707ae8f2fdfaa0960 --branches-also=linux-3.14,linux-arm-xen 109662 |less

CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Juergen Gross <jgross@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 ap-common | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)