diff mbox series

[v1,2/2] KVM: s390: fix KVM_S390_GET_CMMA_BITS for GFNs in memslot holes

Message ID 20230208144827.131300-2-nrb@linux.ibm.com (mailing list archive)
State Accepted
Commit 285cff4c0454340a4dc53f46e67f2cb1c293bd74
Headers show
Series KVM: s390: CMMA migration selftest and small bugfix | expand

Commit Message

Nico Boehr Feb. 8, 2023, 2:48 p.m. UTC
The KVM_S390_GET_CMMA_BITS ioctl may return incorrect values when userspace
specifies a start_gfn outside of memslots.

This can occur when a VM has multiple memslots with a hole in between:

+-----+----------+--------+--------+
| ... | Slot N-1 | <hole> | Slot N |
+-----+----------+--------+--------+
      ^          ^        ^        ^
      |          |        |        |
GFN   A          A+B      |        |
                          A+B+C    |
			           A+B+C+D

When userspace specifies a GFN in [A+B, A+B+C), it would expect to get the
CMMA values of the first dirty page in Slot N. However, userspace may get a
start_gfn of A+B+C+D with a count of 0, hence completely skipping over any
dirty pages in slot N.

The error is in kvm_s390_next_dirty_cmma(), which assumes
gfn_to_memslot_approx() will return the memslot _below_ the specified GFN
when the specified GFN lies outside a memslot. In reality it may return
either the memslot below or above the specified GFN.

When a memslot above the specified GFN is returned this happens:

- ofs is calculated, but since the memslot's base_gfn is larger than the
  specified cur_gfn, ofs will underflow to a huge number.
- ofs is passed to find_next_bit(). Since ofs will exceed the memslot's
  number of pages, the number of pages in the memslot is returned,
  completely skipping over all bits in the memslot userspace would be
  interested in.

Fix this by resetting ofs to zero when a memslot _above_ cur_gfn is
returned (cur_gfn < ms->base_gfn).

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Claudio Imbrenda March 23, 2023, 12:36 p.m. UTC | #1
On Wed,  8 Feb 2023 15:48:26 +0100
Nico Boehr <nrb@linux.ibm.com> wrote:

> The KVM_S390_GET_CMMA_BITS ioctl may return incorrect values when userspace
> specifies a start_gfn outside of memslots.
> 
> This can occur when a VM has multiple memslots with a hole in between:
> 
> +-----+----------+--------+--------+
> | ... | Slot N-1 | <hole> | Slot N |
> +-----+----------+--------+--------+
>       ^          ^        ^        ^
>       |          |        |        |
> GFN   A          A+B      |        |
>                           A+B+C    |
> 			           A+B+C+D
> 
> When userspace specifies a GFN in [A+B, A+B+C), it would expect to get the
> CMMA values of the first dirty page in Slot N. However, userspace may get a
> start_gfn of A+B+C+D with a count of 0, hence completely skipping over any
> dirty pages in slot N.
> 
> The error is in kvm_s390_next_dirty_cmma(), which assumes
> gfn_to_memslot_approx() will return the memslot _below_ the specified GFN
> when the specified GFN lies outside a memslot. In reality it may return
> either the memslot below or above the specified GFN.
> 
> When a memslot above the specified GFN is returned this happens:
> 
> - ofs is calculated, but since the memslot's base_gfn is larger than the
>   specified cur_gfn, ofs will underflow to a huge number.
> - ofs is passed to find_next_bit(). Since ofs will exceed the memslot's
>   number of pages, the number of pages in the memslot is returned,
>   completely skipping over all bits in the memslot userspace would be
>   interested in.
> 
> Fix this by resetting ofs to zero when a memslot _above_ cur_gfn is
> returned (cur_gfn < ms->base_gfn).
> 
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

> ---
>  arch/s390/kvm/kvm-s390.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index e4890e04b210..a171a66681b4 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2158,6 +2158,10 @@ static unsigned long kvm_s390_next_dirty_cmma(struct kvm_memslots *slots,
>  		ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]);
>  		ofs = 0;
>  	}
> +
> +	if (cur_gfn < ms->base_gfn)
> +		ofs = 0;
> +
>  	ofs = find_next_bit(kvm_second_dirty_bitmap(ms), ms->npages, ofs);
>  	while (ofs >= ms->npages && (mnode = rb_next(mnode))) {
>  		ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]);
diff mbox series

Patch

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index e4890e04b210..a171a66681b4 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2158,6 +2158,10 @@  static unsigned long kvm_s390_next_dirty_cmma(struct kvm_memslots *slots,
 		ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]);
 		ofs = 0;
 	}
+
+	if (cur_gfn < ms->base_gfn)
+		ofs = 0;
+
 	ofs = find_next_bit(kvm_second_dirty_bitmap(ms), ms->npages, ofs);
 	while (ofs >= ms->npages && (mnode = rb_next(mnode))) {
 		ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]);