Message ID | 1743082598-428927-1-git-send-email-steven.sistare@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [V1] migration: cpr breaks SNP guest | expand |
On 3/27/25 08:36, Steve Sistare wrote: > With aux-ram-share=off, booting an SNP guest fails with: > > ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed. > > This is because a CPR blocker for the guest_memfd ramblock is added > twice, once in ram_block_add_cpr_blocker because aux-ram-share=off so > rb->fd < 0, and once in ram_block_add for a specific guest_memfd blocker. > > To fix, add the guest_memfd blocker iff a generic one would not be > added by ram_block_add_cpr_blocker. > > Fixes: 094a3dbc55df ("migration: ram block cpr blockers") > Reported-by: Tom Lendacky <thomas.lendacky@amd.com> > Reported-by: Michael Roth <michael.roth@amd.com> > Signed-off-by: Steve Sistare <steven.sistare@oracle.com> SNP guest launch works for me with this fix. Thanks, Steve! Tested-by: Tom Lendacky <thomas.lendacky@amd.com> > --- > system/physmem.c | 19 ++++++++++++------- > 1 file changed, 12 insertions(+), 7 deletions(-) > > diff --git a/system/physmem.c b/system/physmem.c > index e97de3e..cfafb06 100644 > --- a/system/physmem.c > +++ b/system/physmem.c > @@ -1908,13 +1908,18 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) > goto out_free; > } > > - error_setg(&new_block->cpr_blocker, > - "Memory region %s uses guest_memfd, " > - "which is not supported with CPR.", > - memory_region_name(new_block->mr)); > - migrate_add_blocker_modes(&new_block->cpr_blocker, errp, > - MIG_MODE_CPR_TRANSFER, > - -1); > + /* > + * Add a specific guest_memfd blocker if a generic one would not be > + * added by ram_block_add_cpr_blocker. > + */ > + if (new_block->fd >= 0 && qemu_ram_is_shared(new_block)) { > + error_setg(&new_block->cpr_blocker, > + "Memory region %s uses guest_memfd, " > + "which is not supported with CPR.", > + memory_region_name(new_block->mr)); > + migrate_add_blocker_modes(&new_block->cpr_blocker, errp, > + MIG_MODE_CPR_TRANSFER, -1); > + } > } > > ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
Steve Sistare <steven.sistare@oracle.com> writes: > With aux-ram-share=off, booting an SNP guest fails with: > > ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed. > > This is because a CPR blocker for the guest_memfd ramblock is added > twice, once in ram_block_add_cpr_blocker because aux-ram-share=off so > rb->fd < 0, and once in ram_block_add for a specific guest_memfd blocker. > > To fix, add the guest_memfd blocker iff a generic one would not be > added by ram_block_add_cpr_blocker. > > Fixes: 094a3dbc55df ("migration: ram block cpr blockers") > Reported-by: Tom Lendacky <thomas.lendacky@amd.com> > Reported-by: Michael Roth <michael.roth@amd.com> > Signed-off-by: Steve Sistare <steven.sistare@oracle.com> > --- > system/physmem.c | 19 ++++++++++++------- > 1 file changed, 12 insertions(+), 7 deletions(-) > > diff --git a/system/physmem.c b/system/physmem.c > index e97de3e..cfafb06 100644 > --- a/system/physmem.c > +++ b/system/physmem.c > @@ -1908,13 +1908,18 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) > goto out_free; > } > > - error_setg(&new_block->cpr_blocker, > - "Memory region %s uses guest_memfd, " > - "which is not supported with CPR.", > - memory_region_name(new_block->mr)); > - migrate_add_blocker_modes(&new_block->cpr_blocker, errp, > - MIG_MODE_CPR_TRANSFER, > - -1); > + /* > + * Add a specific guest_memfd blocker if a generic one would not be > + * added by ram_block_add_cpr_blocker. > + */ > + if (new_block->fd >= 0 && qemu_ram_is_shared(new_block)) { Can we avoid having two instances of the same condition that will need to be kept in sync? What about: /* * Preserving guest_memfd may be sufficient for CPR, but it has not * been tested yet. */ if (ram_is_cpr_compatible(new_block)) { ... > + error_setg(&new_block->cpr_blocker, > + "Memory region %s uses guest_memfd, " > + "which is not supported with CPR.", > + memory_region_name(new_block->mr)); > + migrate_add_blocker_modes(&new_block->cpr_blocker, errp, > + MIG_MODE_CPR_TRANSFER, -1); > + } > } > > ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
On 3/27/2025 10:21 AM, Fabiano Rosas wrote: > Steve Sistare <steven.sistare@oracle.com> writes: > >> With aux-ram-share=off, booting an SNP guest fails with: >> >> ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed. >> >> This is because a CPR blocker for the guest_memfd ramblock is added >> twice, once in ram_block_add_cpr_blocker because aux-ram-share=off so >> rb->fd < 0, and once in ram_block_add for a specific guest_memfd blocker. >> >> To fix, add the guest_memfd blocker iff a generic one would not be >> added by ram_block_add_cpr_blocker. >> >> Fixes: 094a3dbc55df ("migration: ram block cpr blockers") >> Reported-by: Tom Lendacky <thomas.lendacky@amd.com> >> Reported-by: Michael Roth <michael.roth@amd.com> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> >> --- >> system/physmem.c | 19 ++++++++++++------- >> 1 file changed, 12 insertions(+), 7 deletions(-) >> >> diff --git a/system/physmem.c b/system/physmem.c >> index e97de3e..cfafb06 100644 >> --- a/system/physmem.c >> +++ b/system/physmem.c >> @@ -1908,13 +1908,18 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) >> goto out_free; >> } >> >> - error_setg(&new_block->cpr_blocker, >> - "Memory region %s uses guest_memfd, " >> - "which is not supported with CPR.", >> - memory_region_name(new_block->mr)); >> - migrate_add_blocker_modes(&new_block->cpr_blocker, errp, >> - MIG_MODE_CPR_TRANSFER, >> - -1); >> + /* >> + * Add a specific guest_memfd blocker if a generic one would not be >> + * added by ram_block_add_cpr_blocker. >> + */ >> + if (new_block->fd >= 0 && qemu_ram_is_shared(new_block)) { > > Can we avoid having two instances of the same condition that will need > to be kept in sync? What about: > > /* > * Preserving guest_memfd may be sufficient for CPR, but it has not > * been tested yet. > */ > if (ram_is_cpr_compatible(new_block)) { > ... > Sure, that is better. I will send V2 shortly. - Steve >> + error_setg(&new_block->cpr_blocker, >> + "Memory region %s uses guest_memfd, " >> + "which is not supported with CPR.", >> + memory_region_name(new_block->mr)); >> + migrate_add_blocker_modes(&new_block->cpr_blocker, errp, >> + MIG_MODE_CPR_TRANSFER, -1); >> + } >> } >> >> ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
diff --git a/system/physmem.c b/system/physmem.c index e97de3e..cfafb06 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -1908,13 +1908,18 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) goto out_free; } - error_setg(&new_block->cpr_blocker, - "Memory region %s uses guest_memfd, " - "which is not supported with CPR.", - memory_region_name(new_block->mr)); - migrate_add_blocker_modes(&new_block->cpr_blocker, errp, - MIG_MODE_CPR_TRANSFER, - -1); + /* + * Add a specific guest_memfd blocker if a generic one would not be + * added by ram_block_add_cpr_blocker. + */ + if (new_block->fd >= 0 && qemu_ram_is_shared(new_block)) { + error_setg(&new_block->cpr_blocker, + "Memory region %s uses guest_memfd, " + "which is not supported with CPR.", + memory_region_name(new_block->mr)); + migrate_add_blocker_modes(&new_block->cpr_blocker, errp, + MIG_MODE_CPR_TRANSFER, -1); + } } ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
With aux-ram-share=off, booting an SNP guest fails with: ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed. This is because a CPR blocker for the guest_memfd ramblock is added twice, once in ram_block_add_cpr_blocker because aux-ram-share=off so rb->fd < 0, and once in ram_block_add for a specific guest_memfd blocker. To fix, add the guest_memfd blocker iff a generic one would not be added by ram_block_add_cpr_blocker. Fixes: 094a3dbc55df ("migration: ram block cpr blockers") Reported-by: Tom Lendacky <thomas.lendacky@amd.com> Reported-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> --- system/physmem.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)