[PULL,06/14] ci: Add a migration compatibility test job

Message ID 20240129030405.177100-7-peterx@redhat.com (mailing list archive)
State New, archived
Series [PULL,01/14] userfaultfd: use 1ULL to build ioctl masks

Commit Message

Peter Xu Jan. 29, 2024, 3:03 a.m. UTC
From: Fabiano Rosas <farosas@suse.de>

The migration tests have support for being passed two QEMU binaries to
test migration compatibility.

Add a CI job that builds the latest release of QEMU and another job
that uses that version plus an already present build of the current
version and runs the migration tests with the two, both as source and
destination. I.e.:

 old QEMU (n-1) -> current QEMU (development tree)
 current QEMU (development tree) -> old QEMU (n-1)

The purpose of this CI job is to ensure the code we're about to merge
will not cause a migration compatibility problem when migrating the
next release (which will contain that code) to/from the previous
release.

The version of migration-test used will be the one matching the older
QEMU. That way we can avoid special-casing new tests that wouldn't be
compatible with the older QEMU.

Note: for user forks, the version tags need to be pushed to GitLab,
otherwise the job won't be able to check out a different version.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240118164951.30350-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 .gitlab-ci.d/buildtest.yml | 60 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

Comments

Peter Maydell Feb. 2, 2024, 1:22 p.m. UTC | #1
On Mon, 29 Jan 2024 at 03:04, <peterx@redhat.com> wrote:
> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> index e1c7801598..f0b0edc634 100644
> --- a/.gitlab-ci.d/buildtest.yml
> +++ b/.gitlab-ci.d/buildtest.yml
> @@ -167,6 +167,66 @@ build-system-centos:
>        x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
>      MAKE_CHECK_ARGS: check-build
>
> +# Previous QEMU release. Used for cross-version migration tests.
> +build-previous-qemu:
> +  extends: .native_build_job_template
> +  artifacts:
> +    when: on_success
> +    expire_in: 2 days
> +    paths:
> +      - build-previous
> +    exclude:
> +      - build-previous/**/*.p
> +      - build-previous/**/*.a.p
> +      - build-previous/**/*.fa.p
> +      - build-previous/**/*.c.o
> +      - build-previous/**/*.c.o.d
> +      - build-previous/**/*.fa
> +  needs:
> +    job: amd64-opensuse-leap-container
> +  variables:
> +    IMAGE: opensuse-leap
> +    TARGETS: x86_64-softmmu aarch64-softmmu
> +  before_script:
> +    - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
> +    - git checkout $QEMU_PREV_VERSION
> +  after_script:
> +    - mv build build-previous

There seems to be a problem with this new CI job. A CI run in my
local repository fails:

https://gitlab.com/pm215/qemu/-/jobs/6075873685

$ export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v .0/' VERSION)"
$ git checkout $QEMU_PREV_VERSION
error: pathspec 'v8.2.0' did not match any file(s) known to git
Running after_script
Running after script...
$ mv build build-previous
mv: cannot stat 'build': No such file or directory
WARNING: after_script failed, but job will continue unaffected: exit code 1
Saving cache for failed job


I don't think you can assume that private forks doing submaintainer CI
runs necessarily have the full set of tags that the main repo does.

I suspect the sed run will also do the wrong thing when run on the
commit that updates the version, because then it will replace
"9.0.0" with "9.0.0".
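
For illustration, the substitution can be tried on a couple of sample
version strings. This is only a minimal sketch: the `prev_tag` helper name
and the sample inputs are made up here and are not part of the patch; the
sed expression is the one from the job's before_script.

```shell
# Hypothetical helper: apply the job's sed expression to a version string
# passed as an argument instead of reading the VERSION file.
prev_tag() {
    printf '%s\n' "$1" | sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/'
}

prev_tag 8.2.50   # development tree after 8.2: prints v8.2.0
prev_tag 9.0.0    # commit that sets the release version: prints v9.0.0
```

In the second case the derived tag names the release being made rather
than the previous one, so the job would end up comparing a version with
itself.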

thanks
-- PMM
Fabiano Rosas Feb. 2, 2024, 1:47 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> I don't think you can assume that private forks doing submaintainer CI
> runs necessarily have the full set of tags that the main repo does.

Yes, I thought this would be rare enough not to be an issue, but it
seems it's not. I don't know what could be done here: if there's no tag,
then I think there's no way to resolve the actual commit hash.

> I suspect the sed run will also do the wrong thing when run on the
> commit that updates the version, because then it will replace
> "9.0.0" with "9.0.0".

I just ignored this completely because my initial idea was to leave this
job disabled and only run it for migration patchsets and pull requests,
so it wouldn't make sense to run at that commit.

This job is also not entirely fail proof by design because we could
always be hitting bugs in the older QEMU version that were already fixed
in the new version.

I think the simplest fix here is to leave the test disabled, possibly
with an env variable to enable it.
Peter Xu Feb. 5, 2024, 3:25 a.m. UTC | #3
On Fri, Feb 02, 2024 at 10:47:05AM -0300, Fabiano Rosas wrote:
> I think the simplest fix here is to leave the test disabled, possibly
> with an env variable to enable it.

However, that would be unfortunate, because the goal of the "n-1" test
is to fail the exact commit that breaks compatibility, and to enforce
that, IMHO.

Failing only when someone working on migration pushes to CI is better than
nothing, but it is less ideal.  We want the developer / module maintainer
to notice the issue and fix it before merging.  Otherwise something wrong
gets merged first, and then we have to find what is broken and ask for a
fix (which leaves a window where it is broken, possibly across major
releases if we are unlucky).

Currently the coverage of the n-1 test is still mostly focused on the
migration framework, but it also covers quite a few default configs of the
system layout (even if only x86 is covered), and some default devices IIRC.
We can already attach a few more standard devices on the command line so
that more things get covered.

A pretty dumb solution (that might work?) is to keep a commit ID rather
than tags, to avoid all kinds of tag hassles:

  PREVIOUS_VERSION_COMMIT_ID=1600b9f46b1bd08b00fe86c46ef6dbb48cbe10d6

Then we bump it after a release.  I think it'll also work for the release
commit then.

Note that there can be a small window where we run an n-2 -> n test at the
start, but that's fine IMHO, as we should still allow that to work.  Fabiano's
"auto choose latest shared machine type" would be useful here, and I
assume it should just work.

With that in place, we can try to figure out something smarter.  Would that
work for us?
Daniel P. Berrangé Feb. 5, 2024, 10:22 a.m. UTC | #4
On Mon, Feb 05, 2024 at 11:25:13AM +0800, Peter Xu wrote:
> A pretty dumb (but might be working?) solution is we keep commit ID rather
> than tags to avoid all kinds of tag hassles:
> 
>   PREVIOUS_VERSION_COMMIT_ID=1600b9f46b1bd08b00fe86c46ef6dbb48cbe10d6
> 
> Then we boost it after a release.  I think it'll also work for the release
> commit then.

Please don't go for hardcoding stuff. AFAICS, the solution is very easy
and only requires adding two git commands to the test job:

  export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
  git remote add upstream https://gitlab.com/qemu-project/qemu
  git fetch upstream $QEMU_PREV_VERSION
  git checkout $QEMU_PREV_VERSION
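
The fetch-then-checkout idea can be tried without network access. The
following is only a sketch under stated assumptions: throwaway repositories
under a temporary directory stand in for upstream and for a fork without
tags, the tag name is hard-coded instead of being derived from VERSION,
and the explicit `refs/tags/...` refspec and the "add the remote only if
missing" guard are choices made for this sketch, not the commands as
posted above.

```shell
set -eu
work="$(mktemp -d)"

# Stand-in for the upstream repository: a tagged release plus a newer commit.
git init -q "$work/upstream"
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "release"
git -C "$work/upstream" tag v8.2.0
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "development"

# Stand-in for a fork that was cloned without any tags.
git clone -q --no-tags "$work/upstream" "$work/fork"
cd "$work/fork"

QEMU_PREV_VERSION=v8.2.0   # hard-coded here; the job derives it from VERSION

# Add the remote only if it is missing, so that a re-run does not fail.
git remote get-url upstream >/dev/null 2>&1 ||
    git remote add upstream "$work/upstream"

# Fetch just the one tag, naming the destination ref explicitly.
git fetch -q upstream "refs/tags/$QEMU_PREV_VERSION:refs/tags/$QEMU_PREV_VERSION"
git checkout -q "$QEMU_PREV_VERSION"

git describe --tags --exact-match   # prints v8.2.0
```

Naming the destination ref makes the tag exist locally regardless of how
the fork was cloned, which is the property the checkout step relies on.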

With regards,
Daniel
Peter Xu Feb. 5, 2024, 10:45 a.m. UTC | #5
On Mon, Feb 05, 2024 at 10:22:35AM +0000, Daniel P. Berrangé wrote:
> Please don't go for hardcoding stuff. AFAICS, the solution is very easy
> and only requires adding two git commands to the test job:
> 
>   export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
>   git remote add upstream https://gitlab.com/qemu-project/qemu
>   git fetch upstream $QEMU_PREV_VERSION
>   git checkout $QEMU_PREV_VERSION

True...  I'm as stupid as I could have been. :)  Thanks.

For the CI test at exactly the commit that releases QEMU: I assume it's
fine to simply run it with 9.0 <-> 9.0, for example, which is just one more
run of the current migration qtest. IIUC that shouldn't be a big deal.
Daniel P. Berrangé Feb. 5, 2024, 10:49 a.m. UTC | #6
On Mon, Feb 05, 2024 at 06:45:23PM +0800, Peter Xu wrote:
> On Mon, Feb 05, 2024 at 10:22:35AM +0000, Daniel P. Berrangé wrote:
> > On Mon, Feb 05, 2024 at 11:25:13AM +0800, Peter Xu wrote:
> > > On Fri, Feb 02, 2024 at 10:47:05AM -0300, Fabiano Rosas wrote:
> > > > Peter Maydell <peter.maydell@linaro.org> writes:
> > > > 
> > > > > On Mon, 29 Jan 2024 at 03:04, <peterx@redhat.com> wrote:
> > > > >>
> > > > >> From: Fabiano Rosas <farosas@suse.de>
> > > > >>
> > > > >> The migration tests have support for being passed two QEMU binaries to
> > > > >> test migration compatibility.
> > > > >>
> > > > >> Add a CI job that builds the lastest release of QEMU and another job
> > > > >> that uses that version plus an already present build of the current
> > > > >> version and run the migration tests with the two, both as source and
> > > > >> destination. I.e.:
> > > > >>
> > > > >>  old QEMU (n-1) -> current QEMU (development tree)
> > > > >>  current QEMU (development tree) -> old QEMU (n-1)
> > > > >>
> > > > >> The purpose of this CI job is to ensure the code we're about to merge
> > > > >> will not cause a migration compatibility problem when migrating the
> > > > >> next release (which will contain that code) to/from the previous
> > > > >> release.
> > > > >>
> > > > >> The version of migration-test used will be the one matching the older
> > > > >> QEMU. That way we can avoid special-casing new tests that wouldn't be
> > > > >> compatible with the older QEMU.
> > > > >>
> > > > >> Note: for user forks, the version tags need to be pushed to gitlab
> > > > >> otherwise it won't be able to checkout a different version.
> > > > >>
> > > > >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> > > > >> Link: https://lore.kernel.org/r/20240118164951.30350-3-farosas@suse.de
> > > > >> Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > >> ---
> > > > >>  .gitlab-ci.d/buildtest.yml | 60 ++++++++++++++++++++++++++++++++++++++
> > > > >>  1 file changed, 60 insertions(+)
> > > > >>
> > > > >> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> > > > >> index e1c7801598..f0b0edc634 100644
> > > > >> --- a/.gitlab-ci.d/buildtest.yml
> > > > >> +++ b/.gitlab-ci.d/buildtest.yml
> > > > >> @@ -167,6 +167,66 @@ build-system-centos:
> > > > >>        x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
> > > > >>      MAKE_CHECK_ARGS: check-build
> > > > >>
> > > > >> +# Previous QEMU release. Used for cross-version migration tests.
> > > > >> +build-previous-qemu:
> > > > >> +  extends: .native_build_job_template
> > > > >> +  artifacts:
> > > > >> +    when: on_success
> > > > >> +    expire_in: 2 days
> > > > >> +    paths:
> > > > >> +      - build-previous
> > > > >> +    exclude:
> > > > >> +      - build-previous/**/*.p
> > > > >> +      - build-previous/**/*.a.p
> > > > >> +      - build-previous/**/*.fa.p
> > > > >> +      - build-previous/**/*.c.o
> > > > >> +      - build-previous/**/*.c.o.d
> > > > >> +      - build-previous/**/*.fa
> > > > >> +  needs:
> > > > >> +    job: amd64-opensuse-leap-container
> > > > >> +  variables:
> > > > >> +    IMAGE: opensuse-leap
> > > > >> +    TARGETS: x86_64-softmmu aarch64-softmmu
> > > > >> +  before_script:
> > > > >> +    - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
> > > > >> +    - git checkout $QEMU_PREV_VERSION
> > > > >> +  after_script:
> > > > >> +    - mv build build-previous
> > > > >
> > > > > There seems to be a problem with this new CI job. Running a CI
> > > > > run in my local repository it fails:
> > > > >
> > > > > https://gitlab.com/pm215/qemu/-/jobs/6075873685
> > > > >
> > > > > $ export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v .0/' VERSION)"
> > > > > $ git checkout $QEMU_PREV_VERSION
> > > > > error: pathspec 'v8.2.0' did not match any file(s) known to git
> > > > > Running after_script
> > > > > Running after script...
> > > > > $ mv build build-previous
> > > > > mv: cannot stat 'build': No such file or directory
> > > > > WARNING: after_script failed, but job will continue unaffected: exit code 1
> > > > > Saving cache for failed job
> > > > >
> > > > >
> > > > > I don't think you can assume that private forks doing submaintainer CI
> > > > > runs necessarily have the full set of tags that the main repo does.
> > > > 
> > > > > Yes, I thought this would be rare enough not to be an issue, but it
> > > > > seems it is one. I don't know what could be done here: if there's no
> > > > > tag, then I think there's no way to resolve the actual commit hash.
> > > > 
> > > > > I suspect the sed run will also do the wrong thing when run on the
> > > > > commit that updates the version, because then it will replace
> > > > > "9.0.0" with "9.0.0".
> > > > 
> > > > > I just ignored this completely because my initial idea was to leave this
> > > > job disabled and only run it for migration patchsets and pull requests,
> > > > so it wouldn't make sense to run at that commit.
> > > > 
> > > > > This job is also not entirely fail-proof by design because we could
> > > > always be hitting bugs in the older QEMU version that were already fixed
> > > > in the new version.
> > > > 
> > > > I think the simplest fix here is to leave the test disabled, possibly
> > > > with an env variable to enable it.
> > > 
> > > > However, that would be unfortunate, because IMHO the goal of the "n-1"
> > > > test is to fail on the exact commit that breaks compatibility, and to
> > > > make that enforced.
> > > > 
> > > > Failing only when some migration person pushes to CI is indeed better
> > > > than nothing, but it is less ideal. We want the developer / module
> > > > maintainer to notice the issue and fix it before merging, rather than
> > > > merging something wrong and leaving us to find what broke and ask for a
> > > > fix afterwards (in which case there is still a window where it's broken,
> > > > and if unlucky that window spans a major release).
> > > 
> > > > The coverage of the n-1 test is currently still focused mostly on the
> > > > migration framework, but it also covers quite a few default configs of
> > > > the system layout (even if only x86 is covered), and some default devices
> > > > IIRC. We could already attach a few more standard devices on the cmdline
> > > > so that more things get covered.
> > > 
> > > > A pretty dumb solution (which might work?) is to keep a commit ID rather
> > > > than a tag, to avoid all kinds of tag hassles:
> > > 
> > >   PREVIOUS_VERSION_COMMIT_ID=1600b9f46b1bd08b00fe86c46ef6dbb48cbe10d6
> > > 
> > > > Then we bump it after each release.  I think it would also work for the
> > > > release commit.
> > 
> > Please don't go for hardcoding stuff. AFAICS, the solution is very easy
> > and only requires adding two git commands to the test job:
> > 
> >   export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
> >   git remote add upstream https://gitlab.com/qemu-project/qemu
> > >   git fetch upstream $QEMU_PREV_VERSION
> >   git checkout $QEMU_PREV_VERSION
> 
> > True...  I'm as stupid as I could have been. :)  Thanks.
> 
> > For the CI run at exactly the commit that releases QEMU: I assume it's
> > fine to simply run it with 9.0 <-> 9.0, for example, which just runs the
> > current migration qtest one more time. IIUC that shouldn't be a big deal.

Yes, that should be harmless, and by the time we hit the 9.0 tag
it is too late to fix any problem with 8.2 -> 9.0 migration
anyway.
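
For reference, a minimal sketch of how the tag is derived and fetched. It
reuses the sed expression from the patch unchanged. The helper names
prev_release_tag and ensure_tag are hypothetical, the remote name
"upstream" is an assumption, and the sample version strings are only
example inputs, not values taken from any particular tree:

```shell
#!/bin/sh
# Hypothetical helper: map the contents of the VERSION file to the tag
# of the matching x.y.0 release, using the sed expression from the patch.
prev_release_tag() {
    printf '%s\n' "$1" | sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/'
}

# Hypothetical helper: make sure the tag exists locally, fetching it
# from the upstream repository when a fork does not carry it.
ensure_tag() {
    tag=$1
    # Nothing to do when the tag is already present.
    git rev-parse -q --verify "refs/tags/$tag" >/dev/null && return 0
    # Add the remote only once; "upstream" is an assumed remote name.
    git remote get-url upstream >/dev/null 2>&1 ||
        git remote add upstream https://gitlab.com/qemu-project/qemu
    git fetch upstream tag "$tag"
}

# Example inputs (assumed values, for illustration only).
prev_release_tag 8.2.50
prev_release_tag 8.2.90
prev_release_tag 9.0.0
```

The three calls print v8.2.0, v8.2.0 and v9.0.0, so on the release
commit the "previous" tag resolves to the release itself and the job
simply tests 9.0 <-> 9.0. ensure_tag is not called here; a job would
call it with the computed tag before "git checkout".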


With regards,
Daniel

Patch

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index e1c7801598..f0b0edc634 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -167,6 +167,66 @@  build-system-centos:
       x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
     MAKE_CHECK_ARGS: check-build
 
+# Previous QEMU release. Used for cross-version migration tests.
+build-previous-qemu:
+  extends: .native_build_job_template
+  artifacts:
+    when: on_success
+    expire_in: 2 days
+    paths:
+      - build-previous
+    exclude:
+      - build-previous/**/*.p
+      - build-previous/**/*.a.p
+      - build-previous/**/*.fa.p
+      - build-previous/**/*.c.o
+      - build-previous/**/*.c.o.d
+      - build-previous/**/*.fa
+  needs:
+    job: amd64-opensuse-leap-container
+  variables:
+    IMAGE: opensuse-leap
+    TARGETS: x86_64-softmmu aarch64-softmmu
+  before_script:
+    - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
+    - git checkout $QEMU_PREV_VERSION
+  after_script:
+    - mv build build-previous
+
+.migration-compat-common:
+  extends: .common_test_job_template
+  needs:
+    - job: build-previous-qemu
+    - job: build-system-opensuse
+  # The old QEMU could have bugs unrelated to migration that are
+  # already fixed in the current development branch, so this test
+  # might fail.
+  allow_failure: true
+  variables:
+    IMAGE: opensuse-leap
+    MAKE_CHECK_ARGS: check-build
+  script:
+    # Use the migration-tests from the older QEMU tree. This avoids
+    # testing an old QEMU against new features/tests that it is not
+    # compatible with.
+    - cd build-previous
+    # old to new
+    - QTEST_QEMU_BINARY_SRC=./qemu-system-${TARGET}
+          QTEST_QEMU_BINARY=../build/qemu-system-${TARGET} ./tests/qtest/migration-test
+    # new to old
+    - QTEST_QEMU_BINARY_DST=./qemu-system-${TARGET}
+          QTEST_QEMU_BINARY=../build/qemu-system-${TARGET} ./tests/qtest/migration-test
+
+migration-compat-aarch64:
+  extends: .migration-compat-common
+  variables:
+    TARGET: aarch64
+
+migration-compat-x86_64:
+  extends: .migration-compat-common
+  variables:
+    TARGET: x86_64
+
 check-system-centos:
   extends: .native_test_job_template
   needs: