[v1] KVM: s390: disable migration mode when dirty tracking is disabled

Message ID 20230120075406.101436-1-nrb@linux.ibm.com (mailing list archive)
State New, archived

Commit Message

Nico Boehr Jan. 20, 2023, 7:54 a.m. UTC
Migration mode is a VM attribute which enables tracking of changes in
storage attributes (PGSTE). It assumes dirty tracking is enabled on all
memslots to keep a dirty bitmap of pages with changed storage attributes.

When enabling migration mode, we currently check that dirty tracking is
enabled for all memslots. However, userspace can disable dirty tracking
without disabling migration mode.

Since migration mode is pointless with dirty tracking disabled, disable
migration mode whenever userspace disables dirty tracking on any slot.

Also update the documentation to clarify that dirty tracking must be
enabled when enabling migration mode, which is already enforced by the
code in kvm_s390_vm_start_migration().

To disable migration mode, slots_lock should be held, which is taken
in kvm_set_memory_region() and thus held in
kvm_arch_prepare_memory_region().

Restructure the prepare code a bit so all the sanity checking is done
before disabling migration mode. This ensures migration mode isn't
disabled when some sanity check fails.

Cc: stable@vger.kernel.org
Fixes: 190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
---
 Documentation/virt/kvm/devices/vm.rst |  4 +++
 arch/s390/kvm/kvm-s390.c              | 41 ++++++++++++++++++---------
 2 files changed, 32 insertions(+), 13 deletions(-)
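
The behaviour described in the commit message can be sketched as a small standalone model. This is not the kernel code: struct model_vm, struct model_slot, MODEL_LOG_DIRTY_PAGES and both functions are hypothetical names made up for illustration, and the model only mirrors the two rules stated above (starting migration mode requires dirty tracking on every slot, and clearing dirty tracking on any slot stops migration mode).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for KVM_MEM_LOG_DIRTY_PAGES; the value is arbitrary. */
#define MODEL_LOG_DIRTY_PAGES (1u << 0)

struct model_slot {
	unsigned int flags;
};

struct model_vm {
	bool migration_mode;
	struct model_slot *slots;
	size_t nr_slots;
};

/* Start migration mode only if every slot has dirty tracking enabled. */
static int model_start_migration(struct model_vm *vm)
{
	size_t i;

	if (vm->migration_mode)
		return 0;		/* already active: no effect */
	if (vm->nr_slots == 0)
		return -EINVAL;		/* no memory defined */

	for (i = 0; i < vm->nr_slots; i++) {
		if (!(vm->slots[i].flags & MODEL_LOG_DIRTY_PAGES))
			return -EINVAL;
	}

	vm->migration_mode = true;
	return 0;
}

/*
 * Change the flags of one slot. If dirty tracking goes from enabled to
 * disabled while migration mode is active, migration mode is stopped.
 */
static void model_set_slot_flags(struct model_vm *vm, size_t idx,
				 unsigned int new_flags)
{
	unsigned int old_flags;

	assert(idx < vm->nr_slots);
	old_flags = vm->slots[idx].flags;

	if (vm->migration_mode &&
	    (old_flags & MODEL_LOG_DIRTY_PAGES) &&
	    !(new_flags & MODEL_LOG_DIRTY_PAGES))
		vm->migration_mode = false;

	vm->slots[idx].flags = new_flags;
}
```

In this model, starting migration mode with one untracked slot fails with -EINVAL, and clearing MODEL_LOG_DIRTY_PAGES on any slot afterwards leaves migration_mode false without an explicit stop request.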

Comments

Claudio Imbrenda Jan. 23, 2023, 10:29 a.m. UTC | #1
On Fri, 20 Jan 2023 08:54:06 +0100
Nico Boehr <nrb@linux.ibm.com> wrote:

> Migration mode is a VM attribute which enables tracking of changes in
> storage attributes (PGSTE). It assumes dirty tracking is enabled on all
> memslots to keep a dirty bitmap of pages with changed storage attributes.
> 
> When enabling migration mode, we currently check that dirty tracking is
> enabled for all memslots. However, userspace can disable dirty tracking
> without disabling migration mode.
> 
> Since migration mode is pointless with dirty tracking disabled, disable
> migration mode whenever userspace disables dirty tracking on any slot.
> 
> Also update the documentation to clarify that dirty tracking must be
> enabled when enabling migration mode, which is already enforced by the
> code in kvm_s390_vm_start_migration().
> 
> To disable migration mode, slots_lock should be held, which is taken
> in kvm_set_memory_region() and thus held in
> kvm_arch_prepare_memory_region().
> 
> Restructure the prepare code a bit so all the sanity checking is done
> before disabling migration mode. This ensures migration mode isn't
> disabled when some sanity check fails.
> 
> Cc: stable@vger.kernel.org
> Fixes: 190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

> ---
>  Documentation/virt/kvm/devices/vm.rst |  4 +++
>  arch/s390/kvm/kvm-s390.c              | 41 ++++++++++++++++++---------
>  2 files changed, 32 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst
> index 60acc39e0e93..147efec626e5 100644
> --- a/Documentation/virt/kvm/devices/vm.rst
> +++ b/Documentation/virt/kvm/devices/vm.rst
> @@ -302,6 +302,10 @@ Allows userspace to start migration mode, needed for PGSTE migration.
>  Setting this attribute when migration mode is already active will have
>  no effects.
>  
> +Dirty tracking must be enabled on all memslots, else -EINVAL is returned. When
> +dirty tracking is disabled on any memslot, migration mode is automatically
> +stopped.
> +
>  :Parameters: none
>  :Returns:   -ENOMEM if there is not enough free memory to start migration mode;
>  	    -EINVAL if the state of the VM is invalid (e.g. no memory defined);
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index e4890e04b210..4785f002cd93 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -5628,28 +5628,43 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>  				   enum kvm_mr_change change)
>  {
>  	gpa_t size;
> +	int rc;
>  
>  	/* When we are protected, we should not change the memory slots */
>  	if (kvm_s390_pv_get_handle(kvm))
>  		return -EINVAL;
>  
> -	if (change == KVM_MR_DELETE || change == KVM_MR_FLAGS_ONLY)
> -		return 0;
> +	if (change != KVM_MR_DELETE && change != KVM_MR_FLAGS_ONLY) {
> +		/* A few sanity checks. We can have memory slots which have to be
> +		 * located/ended at a segment boundary (1MB). The memory in userland is
> +		 * ok to be fragmented into various different vmas. It is okay to mmap()
> +		 * and munmap() stuff in this slot after doing this call at any time
> +		 */
>  
> -	/* A few sanity checks. We can have memory slots which have to be
> -	   located/ended at a segment boundary (1MB). The memory in userland is
> -	   ok to be fragmented into various different vmas. It is okay to mmap()
> -	   and munmap() stuff in this slot after doing this call at any time */
> +		if (new->userspace_addr & 0xffffful)
> +			return -EINVAL;
>  
> -	if (new->userspace_addr & 0xffffful)
> -		return -EINVAL;
> +		size = new->npages * PAGE_SIZE;
> +		if (size & 0xffffful)
> +			return -EINVAL;
>  
> -	size = new->npages * PAGE_SIZE;
> -	if (size & 0xffffful)
> -		return -EINVAL;
> +		if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
> +			return -EINVAL;
> +	}
>  
> -	if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
> -		return -EINVAL;
> +	/* Turn off migration mode when userspace disables dirty page logging.
> +	 * Migration mode expects dirty page logging being enabled to store
> +	 * its dirty bitmap.
> +	 */
> +	if (kvm->arch.migration_mode) {
> +		if ((old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
> +		    !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
> +			rc = kvm_s390_vm_stop_migration(kvm);
> +
> +			if (rc)
> +				pr_warn("Failed to stop migration mode\n");
> +		}
> +	}
>  
>  	return 0;
>  }
Janosch Frank Jan. 25, 2023, 1:55 p.m. UTC | #2
On 1/20/23 08:54, Nico Boehr wrote:
> Migration mode is a VM attribute which enables tracking of changes in
> storage attributes (PGSTE). It assumes dirty tracking is enabled on all
> memslots to keep a dirty bitmap of pages with changed storage attributes.
> 
> When enabling migration mode, we currently check that dirty tracking is
> enabled for all memslots. However, userspace can disable dirty tracking
> without disabling migration mode.
> 
> Since migration mode is pointless with dirty tracking disabled, disable
> migration mode whenever userspace disables dirty tracking on any slot.

Will userspace be able to handle the sudden -EINVAL rcs on 
KVM_S390_GET_CMMA_BITS and KVM_S390_SET_CMMA_BITS?

I.e. what allows us to simply turn it off without the userspace knowing 
about it?

> 
> Also update the documentation to clarify that dirty tracking must be
> enabled when enabling migration mode, which is already enforced by the
> code in kvm_s390_vm_start_migration().
> 
> To disable migration mode, slots_lock should be held, which is taken
> in kvm_set_memory_region() and thus held in
> kvm_arch_prepare_memory_region().
> 
> Restructure the prepare code a bit so all the sanity checking is done
> before disabling migration mode. This ensures migration mode isn't
> disabled when some sanity check fails.
> 
> Cc: stable@vger.kernel.org
> Fixes: 190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> ---
>   Documentation/virt/kvm/devices/vm.rst |  4 +++
>   arch/s390/kvm/kvm-s390.c              | 41 ++++++++++++++++++---------
>   2 files changed, 32 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst
> index 60acc39e0e93..147efec626e5 100644
> --- a/Documentation/virt/kvm/devices/vm.rst
> +++ b/Documentation/virt/kvm/devices/vm.rst
> @@ -302,6 +302,10 @@ Allows userspace to start migration mode, needed for PGSTE migration.
>   Setting this attribute when migration mode is already active will have
>   no effects.
>   
> +Dirty tracking must be enabled on all memslots, else -EINVAL is returned. When
> +dirty tracking is disabled on any memslot, migration mode is automatically
> +stopped.

Do we also need to add a warning to the CMMA IOCTLs?

> +
>   :Parameters: none
>   :Returns:   -ENOMEM if there is not enough free memory to start migration mode;
>   	    -EINVAL if the state of the VM is invalid (e.g. no memory defined);
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index e4890e04b210..4785f002cd93 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -5628,28 +5628,43 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>   				   enum kvm_mr_change change)
>   {
>   	gpa_t size;
> +	int rc;

Not sure why you added rc even though it doesn't need to be used.

>   
>   	/* When we are protected, we should not change the memory slots */
>   	if (kvm_s390_pv_get_handle(kvm))
>   		return -EINVAL;
>   
> -	if (change == KVM_MR_DELETE || change == KVM_MR_FLAGS_ONLY)
> -		return 0;
> +	if (change != KVM_MR_DELETE && change != KVM_MR_FLAGS_ONLY) {
> +		/* A few sanity checks. We can have memory slots which have to be
> +		 * located/ended at a segment boundary (1MB). The memory in userland is
> +		 * ok to be fragmented into various different vmas. It is okay to mmap()
> +		 * and munmap() stuff in this slot after doing this call at any time
> +		 */

This isn't net code, we usually start our comments on a "*" line.

>   
> -	/* A few sanity checks. We can have memory slots which have to be
> -	   located/ended at a segment boundary (1MB). The memory in userland is
> -	   ok to be fragmented into various different vmas. It is okay to mmap()
> -	   and munmap() stuff in this slot after doing this call at any time */
> +		if (new->userspace_addr & 0xffffful)
> +			return -EINVAL;
>   
> -	if (new->userspace_addr & 0xffffful)
> -		return -EINVAL;
> +		size = new->npages * PAGE_SIZE;
> +		if (size & 0xffffful)
> +			return -EINVAL;
>   
> -	size = new->npages * PAGE_SIZE;
> -	if (size & 0xffffful)
> -		return -EINVAL;
> +		if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
> +			return -EINVAL;
> +	}
>   
> -	if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
> -		return -EINVAL;
> +	/* Turn off migration mode when userspace disables dirty page logging.
> +	 * Migration mode expects dirty page logging being enabled to store
> +	 * its dirty bitmap.
> +	 */
> +	if (kvm->arch.migration_mode) {
> +		if ((old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
> +		    !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
> +			rc = kvm_s390_vm_stop_migration(kvm);
> +
> +			if (rc)
> +				pr_warn("Failed to stop migration mode\n");

As the results were rather catastrophic it might make more sense to use 
WARN_ONCE() and condense these 3 lines into one.

> +		}
> +	}
>   
>   	return 0;
>   }
Claudio Imbrenda Jan. 25, 2023, 3:53 p.m. UTC | #3
On Wed, 25 Jan 2023 14:55:59 +0100
Janosch Frank <frankja@linux.ibm.com> wrote:

> On 1/20/23 08:54, Nico Boehr wrote:
> > Migration mode is a VM attribute which enables tracking of changes in
> > storage attributes (PGSTE). It assumes dirty tracking is enabled on all
> > memslots to keep a dirty bitmap of pages with changed storage attributes.
> > 
> > When enabling migration mode, we currently check that dirty tracking is
> > enabled for all memslots. However, userspace can disable dirty tracking
> > without disabling migration mode.
> > 
> > Since migration mode is pointless with dirty tracking disabled, disable
> > migration mode whenever userspace disables dirty tracking on any slot.  
> 
> Will userspace be able to handle the sudden -EINVAL rcs on 
> KVM_S390_GET_CMMA_BITS and KVM_S390_SET_CMMA_BITS?
> 
> I.e. what allows us to simply turn it off without the userspace knowing 
> about it?

if we are here, userspace is disabling dirty tracking without having
disabled migration mode. it should have disabled migration mode before
disabling dirty tracking. also, migration mode does not actually impact
userspace (at least at this point). it's just an indication from
userspace to the kernel that userspace is trying to migrate. disabling
dirty tracking is a rather explicit way for userspace to tell the
kernel that the migration is over (one way or the other)

> 
> > 
> > Also update the documentation to clarify that dirty tracking must be
> > enabled when enabling migration mode, which is already enforced by the
> > code in kvm_s390_vm_start_migration().
> > 
> > To disable migration mode, slots_lock should be held, which is taken
> > in kvm_set_memory_region() and thus held in
> > kvm_arch_prepare_memory_region().
> > 
> > Restructure the prepare code a bit so all the sanity checking is done
> > before disabling migration mode. This ensures migration mode isn't
> > disabled when some sanity check fails.
> > 
> > Cc: stable@vger.kernel.org
> > Fixes: 190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
> > Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> > ---
> >   Documentation/virt/kvm/devices/vm.rst |  4 +++
> >   arch/s390/kvm/kvm-s390.c              | 41 ++++++++++++++++++---------
> >   2 files changed, 32 insertions(+), 13 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst
> > index 60acc39e0e93..147efec626e5 100644
> > --- a/Documentation/virt/kvm/devices/vm.rst
> > +++ b/Documentation/virt/kvm/devices/vm.rst
> > @@ -302,6 +302,10 @@ Allows userspace to start migration mode, needed for PGSTE migration.
> >   Setting this attribute when migration mode is already active will have
> >   no effects.
> >   
> > +Dirty tracking must be enabled on all memslots, else -EINVAL is returned. When
> > +dirty tracking is disabled on any memslot, migration mode is automatically
> > +stopped.  
> 
> Do we also need to add a warning to the CMMA IOCTLs?
> 
> > +
> >   :Parameters: none
> >   :Returns:   -ENOMEM if there is not enough free memory to start migration mode;
> >   	    -EINVAL if the state of the VM is invalid (e.g. no memory defined);
> > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> > index e4890e04b210..4785f002cd93 100644
> > --- a/arch/s390/kvm/kvm-s390.c
> > +++ b/arch/s390/kvm/kvm-s390.c
> > @@ -5628,28 +5628,43 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
> >   				   enum kvm_mr_change change)
> >   {
> >   	gpa_t size;
> > +	int rc;  
> 
> Not sure why you added rc even though it doesn't need to be used.
> 
> >   
> >   	/* When we are protected, we should not change the memory slots */
> >   	if (kvm_s390_pv_get_handle(kvm))
> >   		return -EINVAL;
> >   
> > -	if (change == KVM_MR_DELETE || change == KVM_MR_FLAGS_ONLY)
> > -		return 0;
> > +	if (change != KVM_MR_DELETE && change != KVM_MR_FLAGS_ONLY) {
> > +		/* A few sanity checks. We can have memory slots which have to be
> > +		 * located/ended at a segment boundary (1MB). The memory in userland is
> > +		 * ok to be fragmented into various different vmas. It is okay to mmap()
> > +		 * and munmap() stuff in this slot after doing this call at any time
> > +		 */  
> 
> This isn't net code, we usually start our comments on a "*" line.

like this:

/*
 * blah
 */

I missed this when I reviewed the patch

> 
> >   
> > -	/* A few sanity checks. We can have memory slots which have to be
> > -	   located/ended at a segment boundary (1MB). The memory in userland is
> > -	   ok to be fragmented into various different vmas. It is okay to mmap()
> > -	   and munmap() stuff in this slot after doing this call at any time */
> > +		if (new->userspace_addr & 0xffffful)
> > +			return -EINVAL;
> >   
> > -	if (new->userspace_addr & 0xffffful)
> > -		return -EINVAL;
> > +		size = new->npages * PAGE_SIZE;
> > +		if (size & 0xffffful)
> > +			return -EINVAL;
> >   
> > -	size = new->npages * PAGE_SIZE;
> > -	if (size & 0xffffful)
> > -		return -EINVAL;
> > +		if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
> > +			return -EINVAL;
> > +	}
> >   
> > -	if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
> > -		return -EINVAL;
> > +	/* Turn off migration mode when userspace disables dirty page logging.
> > +	 * Migration mode expects dirty page logging being enabled to store
> > +	 * its dirty bitmap.
> > +	 */
> > +	if (kvm->arch.migration_mode) {
> > +		if ((old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
> > +		    !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
> > +			rc = kvm_s390_vm_stop_migration(kvm);
> > +
> > +			if (rc)
> > +				pr_warn("Failed to stop migration mode\n");  
> 
> As the results were rather catastrophic it might make more sense to use 
> WARN_ONCE() and condense these 3 lines into one.

(in which case rc is not needed)

is WARN_ONCE even enough? the results are indeed potentially
catastrophic, would it not make sense to WARN (without _ONCE) ?

> 
> > +		}
> > +	}
> >   
> >   	return 0;
> >   }  
>
Nico Boehr Jan. 26, 2023, 8:41 a.m. UTC | #4
Quoting Janosch Frank (2023-01-25 14:55:59)
> On 1/20/23 08:54, Nico Boehr wrote:
> > Migration mode is a VM attribute which enables tracking of changes in
> > storage attributes (PGSTE). It assumes dirty tracking is enabled on all
> > memslots to keep a dirty bitmap of pages with changed storage attributes.
> > 
> > When enabling migration mode, we currently check that dirty tracking is
> > enabled for all memslots. However, userspace can disable dirty tracking
> > without disabling migration mode.
> > 
> > Since migration mode is pointless with dirty tracking disabled, disable
> > migration mode whenever userspace disables dirty tracking on any slot.
> 
> Will userspace be able to handle the sudden -EINVAL rcs on 
> KVM_S390_GET_CMMA_BITS and KVM_S390_SET_CMMA_BITS?

QEMU has proper error handling on the GET_CMMA_BITS code path and will not
attempt GET_CMMA_BITS after it disabled dirty tracking. So yes, userspace can
handle this fine. In addition, as mentioned in the commit, it was never allowed
to have migration mode without dirty tracking. It was checked when migration
mode is enabled, just wasn't enforced when dirty tracking went off. The
alternative would be to refuse disabling dirty tracking when migration mode is
on; and that would _really_ break userspace.

Or we just leave migration mode on and check on every emulation/ioctl that a
dirty bitmap is still there, which would change absolutely nothing about the
return value of GET_CMMA_BITS.

Or we allocate the dirty bitmap for storage attributes independent of the dirty
bitmap for pages, which increases memory usage and makes this patch quite a bit
more complex, risking that we break more than what is already broken.

This approach really seems like the sane option to me.

For SET_CMMA_BITS, nothing changes.

> I.e. what allows us to simply turn it off without the userspace knowing 
> about it?
> 
> > 
> > Also update the documentation to clarify that dirty tracking must be
> > enabled when enabling migration mode, which is already enforced by the
> > code in kvm_s390_vm_start_migration().
> > 
> > To disable migration mode, slots_lock should be held, which is taken
> > in kvm_set_memory_region() and thus held in
> > kvm_arch_prepare_memory_region().
> > 
> > Restructure the prepare code a bit so all the sanity checking is done
> > before disabling migration mode. This ensures migration mode isn't
> > disabled when some sanity check fails.
> > 
> > Cc: stable@vger.kernel.org
> > Fixes: 190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
> > Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> > ---
> >   Documentation/virt/kvm/devices/vm.rst |  4 +++
> >   arch/s390/kvm/kvm-s390.c              | 41 ++++++++++++++++++---------
> >   2 files changed, 32 insertions(+), 13 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst
> > index 60acc39e0e93..147efec626e5 100644
> > --- a/Documentation/virt/kvm/devices/vm.rst
> > +++ b/Documentation/virt/kvm/devices/vm.rst
> > @@ -302,6 +302,10 @@ Allows userspace to start migration mode, needed for PGSTE migration.
> >   Setting this attribute when migration mode is already active will have
> >   no effects.
> >   
> > +Dirty tracking must be enabled on all memslots, else -EINVAL is returned. When
> > +dirty tracking is disabled on any memslot, migration mode is automatically
> > +stopped.
> 
> Do we also need to add a warning to the CMMA IOCTLs?

No, it is already documented there:

> This ioctl can fail with [...] -EINVAL if KVM_S390_CMMA_PEEK is not set
> but migration mode was not enabled

[...]
> > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> > index e4890e04b210..4785f002cd93 100644
> > --- a/arch/s390/kvm/kvm-s390.c
> > +++ b/arch/s390/kvm/kvm-s390.c
> > @@ -5628,28 +5628,43 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
> >                                  enum kvm_mr_change change)
> >   {
> >       gpa_t size;
> > +     int rc;
> 
> Not sure why you added rc even though it doesn't need to be used.

You prefer a line which is 100 chars wide over a new variable? OK fine for me.
Janosch Frank Jan. 26, 2023, 9:48 a.m. UTC | #5
On 1/26/23 09:41, Nico Boehr wrote:
> Quoting Janosch Frank (2023-01-25 14:55:59)
>> On 1/20/23 08:54, Nico Boehr wrote:
>>> Migration mode is a VM attribute which enables tracking of changes in
>>> storage attributes (PGSTE). It assumes dirty tracking is enabled on all
>>> memslots to keep a dirty bitmap of pages with changed storage attributes.
>>>
>>> When enabling migration mode, we currently check that dirty tracking is
>>> enabled for all memslots. However, userspace can disable dirty tracking
>>> without disabling migration mode.
>>>
>>> Since migration mode is pointless with dirty tracking disabled, disable
>>> migration mode whenever userspace disables dirty tracking on any slot.
>>
>> Will userspace be able to handle the sudden -EINVAL rcs on
>> KVM_S390_GET_CMMA_BITS and KVM_S390_SET_CMMA_BITS?
> 
> QEMU has proper error handling on the GET_CMMA_BITS code path and will not
> attempt GET_CMMA_BITS after it disabled dirty tracking. So yes, userspace can
> handle this fine. In addition, as mentioned in the commit, it was never allowed
> to have migration mode without dirty tracking. It was checked when migration
> mode is enabled, just wasn't enforced when dirty tracking went off. The
> alternative would be to refuse disabling dirty tracking when migration mode is
> on; and that would _really_ break userspace.
> 
> Or we just leave migration mode on and check on every emulation/ioctl that a
> dirty bitmap is still there, which would change absolutely nothing about the
> return value of GET_CMMA_BITS.
> 
> Or we allocate the dirty bitmap for storage attributes independent of the dirty
> bitmap for pages, which increases memory usage and makes this patch quite a bit
> more complex, risking that we break more than what is already broken.
> 
> This approach really seems like the sane option to me.

Jup, I'm just trying to consider all possibilities to find the best one 
and that includes asking all the questions.

>>>    
>>> +Dirty tracking must be enabled on all memslots, else -EINVAL is returned. When
>>> +dirty tracking is disabled on any memslot, migration mode is automatically
>>> +stopped.
>>
>> Do we also need to add a warning to the CMMA IOCTLs?
> 
> No, it is already documented there:
> 
>> This ioctl can fail with [...] -EINVAL if KVM_S390_CMMA_PEEK is not set
>> but migration mode was not enabled

That's fine but doesn't include the part where migration mode can 
suddenly be disabled via a memory region change.

If we disable migration mode implicitly, then we need to document that 
as thoroughly as possible to cover all bases.

> 
> [...]
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index e4890e04b210..4785f002cd93 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -5628,28 +5628,43 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>>>                                   enum kvm_mr_change change)
>>>    {
>>>        gpa_t size;
>>> +     int rc;
>>
>> Not sure why you added rc even though it doesn't need to be used.
> 
> You prefer a line which is 100 chars wide over a new variable? OK fine for me.

I'm not sure how you manage to get over 100 chars, did I miss something?


if (!kvm->arch.migration_mode)
	return 0;

if ((old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
    !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
	WARN(kvm_s390_vm_stop_migration(kvm),
	     "Migration mode could not be disabled");
}

return 0;

Patch

diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst
index 60acc39e0e93..147efec626e5 100644
--- a/Documentation/virt/kvm/devices/vm.rst
+++ b/Documentation/virt/kvm/devices/vm.rst
@@ -302,6 +302,10 @@  Allows userspace to start migration mode, needed for PGSTE migration.
 Setting this attribute when migration mode is already active will have
 no effects.
 
+Dirty tracking must be enabled on all memslots, else -EINVAL is returned. When
+dirty tracking is disabled on any memslot, migration mode is automatically
+stopped.
+
 :Parameters: none
 :Returns:   -ENOMEM if there is not enough free memory to start migration mode;
 	    -EINVAL if the state of the VM is invalid (e.g. no memory defined);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index e4890e04b210..4785f002cd93 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5628,28 +5628,43 @@  int kvm_arch_prepare_memory_region(struct kvm *kvm,
 				   enum kvm_mr_change change)
 {
 	gpa_t size;
+	int rc;
 
 	/* When we are protected, we should not change the memory slots */
 	if (kvm_s390_pv_get_handle(kvm))
 		return -EINVAL;
 
-	if (change == KVM_MR_DELETE || change == KVM_MR_FLAGS_ONLY)
-		return 0;
+	if (change != KVM_MR_DELETE && change != KVM_MR_FLAGS_ONLY) {
+		/* A few sanity checks. We can have memory slots which have to be
+		 * located/ended at a segment boundary (1MB). The memory in userland is
+		 * ok to be fragmented into various different vmas. It is okay to mmap()
+		 * and munmap() stuff in this slot after doing this call at any time
+		 */
 
-	/* A few sanity checks. We can have memory slots which have to be
-	   located/ended at a segment boundary (1MB). The memory in userland is
-	   ok to be fragmented into various different vmas. It is okay to mmap()
-	   and munmap() stuff in this slot after doing this call at any time */
+		if (new->userspace_addr & 0xffffful)
+			return -EINVAL;
 
-	if (new->userspace_addr & 0xffffful)
-		return -EINVAL;
+		size = new->npages * PAGE_SIZE;
+		if (size & 0xffffful)
+			return -EINVAL;
 
-	size = new->npages * PAGE_SIZE;
-	if (size & 0xffffful)
-		return -EINVAL;
+		if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
+			return -EINVAL;
+	}
 
-	if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
-		return -EINVAL;
+	/* Turn off migration mode when userspace disables dirty page logging.
+	 * Migration mode expects dirty page logging being enabled to store
+	 * its dirty bitmap.
+	 */
+	if (kvm->arch.migration_mode) {
+		if ((old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
+		    !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
+			rc = kvm_s390_vm_stop_migration(kvm);
+
+			if (rc)
+				pr_warn("Failed to stop migration mode\n");
+		}
+	}
 
 	return 0;
 }
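
The sanity checks that the patch moves into the `change != KVM_MR_DELETE && change != KVM_MR_FLAGS_ONLY` branch can be tried out in isolation. The following is a minimal userspace sketch, not the kernel function: model_check_slot(), MODEL_PAGE_SIZE and MODEL_SEGMENT_MASK are hypothetical names, and the 4 KiB page size is an assumption stated here rather than taken from the patch. It applies the same three tests as the diff: the userspace address and the slot size must be multiples of 1 MB, and the slot must end at or below the memory limit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE    4096ul	/* assumed page size for this model */
#define MODEL_SEGMENT_MASK 0xffffful	/* low 20 bits: offset within a 1 MB segment */

static_assert(MODEL_SEGMENT_MASK + 1 == 1ul << 20,
	      "mask must cover exactly one 1 MB segment");

/*
 * Return 0 if the slot passes the three checks from the patch,
 * -EINVAL otherwise.
 */
static int model_check_slot(uint64_t userspace_addr, uint64_t base_gfn,
			    uint64_t npages, uint64_t mem_limit)
{
	uint64_t size;

	/* The userspace mapping must start on a segment boundary. */
	if (userspace_addr & MODEL_SEGMENT_MASK)
		return -EINVAL;

	/* The slot size must be a whole number of segments. */
	size = npages * MODEL_PAGE_SIZE;
	if (size & MODEL_SEGMENT_MASK)
		return -EINVAL;

	/* The slot must not extend past the guest memory limit. */
	if (base_gfn * MODEL_PAGE_SIZE + size > mem_limit)
		return -EINVAL;

	return 0;
}
```

With these definitions, a slot of 256 pages at userspace address 0x100000 passes, while 255 pages (size 0xff000) or an address of 0x100800 is rejected because the low 20 bits are not zero.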