
[v3,05/11] KVM: page track: introduce kvm_page_track_{add,remove}_page

Message ID 1455449503-20993-6-git-send-email-guangrong.xiao@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Xiao Guangrong Feb. 14, 2016, 11:31 a.m. UTC
These two functions are the user APIs:
- kvm_page_track_add_page(): add the page to the tracking pool; after
  that, the specified access on the page will be tracked

- kvm_page_track_remove_page(): remove the page from the tracking pool;
  the specified access on the page is no longer tracked once the last
  user is gone

Both of these are called under the protection of kvm->srcu or
kvm->slots_lock.
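
For example (illustrative only; taking the srcu read lock is just one way
to satisfy that requirement):

    int idx = srcu_read_lock(&kvm->srcu);

    kvm_page_track_add_page(kvm, gfn, KVM_PAGE_TRACK_WRITE);
    /* ... writes to this gfn are now intercepted ... */
    kvm_page_track_remove_page(kvm, gfn, KVM_PAGE_TRACK_WRITE);

    srcu_read_unlock(&kvm->srcu, idx);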

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 arch/x86/include/asm/kvm_page_track.h |  13 ++++
 arch/x86/kvm/page_track.c             | 124 ++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)

Comments

Paolo Bonzini Feb. 19, 2016, 11:37 a.m. UTC | #1
On 14/02/2016 12:31, Xiao Guangrong wrote:
> +	/* does tracking count wrap? */
> +	WARN_ON((count > 0) && (val + count < val));

This doesn't work, because "val + count" is an int.

> +	/* the last tracker has already gone? */
> +	WARN_ON((count < 0) && (val < !count));

Also, here any underflow should warn.

You can actually use the fact that val + count is an int like this:

    WARN_ON(val + count < 0 || val + count > USHRT_MAX)

and also please return if the warning fires.
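
Putting the two together, update_gfn_track() could end up looking roughly
like this (just a sketch of the idea):

    val = slot->arch.gfn_track[mode][index];

    if (WARN_ON(val + count < 0 || val + count > USHRT_MAX))
        return;

    slot->arch.gfn_track[mode][index] += count;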

> +void kvm_page_track_add_page(struct kvm *kvm, gfn_t gfn,
> +			     enum kvm_page_track_mode mode)
> +{
> +	struct kvm_memslots *slots;
> +	struct kvm_memory_slot *slot;
> +	int i;
> +
> +	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
> +		slots = __kvm_memslots(kvm, i);
> +
> +		slot = __gfn_to_memslot(slots, gfn);
> +		if (!slot)
> +			continue;
> +
> +		spin_lock(&kvm->mmu_lock);
> +		kvm_slot_page_track_add_page_nolock(kvm, slot, gfn, mode);
> +		spin_unlock(&kvm->mmu_lock);
> +	}
> +}

I don't think it is right to walk all address spaces.  The good news is
that you're not using kvm_page_track_{add,remove}_page at all as far as
I can see, so you can just remove them.

Also, when you will need it, I think it's better to move the
spin_lock/spin_unlock pair outside the for loop.  With this change,
perhaps it's better to leave it to the caller completely---but I cannot
say until I see the caller.
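
If the loop stays, that would look roughly like:

    spin_lock(&kvm->mmu_lock);
    for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
        slots = __kvm_memslots(kvm, i);

        slot = __gfn_to_memslot(slots, gfn);
        if (!slot)
            continue;

        kvm_slot_page_track_add_page_nolock(kvm, slot, gfn, mode);
    }
    spin_unlock(&kvm->mmu_lock);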

In the meanwhile, please leave out _nolock from the other functions' name.

Paolo
Paolo Bonzini Feb. 19, 2016, 11:37 a.m. UTC | #2
On 14/02/2016 12:31, Xiao Guangrong wrote:
> +static bool check_mode(enum kvm_page_track_mode mode)
> +{
> +	if (mode < 0 || mode >= KVM_PAGE_TRACK_MAX)
> +		return false;
> +
> +	return true;
> +}

Oops, forgot about this; please rename to page_track_mode_is_valid and
make it "static inline".

Paolo
Xiao Guangrong Feb. 23, 2016, 4:18 a.m. UTC | #3
On 02/19/2016 07:37 PM, Paolo Bonzini wrote:
>
>
> On 14/02/2016 12:31, Xiao Guangrong wrote:
>> +	/* does tracking count wrap? */
>> +	WARN_ON((count > 0) && (val + count < val));
>
> This doesn't work, because "val + count" is an int.

val is 'unsigned short' and count is 'short', so
'val + count' is not an int...

>
>> +	/* the last tracker has already gone? */
>> +	WARN_ON((count < 0) && (val < !count));
>
> Also, here any underflow should warn.
>
> You can actually use the fact that val + count is an int like this:
>
>      WARN_ON(val + count < 0 || val + count > USHRT_MAX)
>

It looks nice. I will change the type of val to int to simplify the
code.

> and also please return if the warning fires.
>

Okay.

>> +void kvm_page_track_add_page(struct kvm *kvm, gfn_t gfn,
>> +			     enum kvm_page_track_mode mode)
>> +{
>> +	struct kvm_memslots *slots;
>> +	struct kvm_memory_slot *slot;
>> +	int i;
>> +
>> +	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>> +		slots = __kvm_memslots(kvm, i);
>> +
>> +		slot = __gfn_to_memslot(slots, gfn);
>> +		if (!slot)
>> +			continue;
>> +
>> +		spin_lock(&kvm->mmu_lock);
>> +		kvm_slot_page_track_add_page_nolock(kvm, slot, gfn, mode);
>> +		spin_unlock(&kvm->mmu_lock);
>> +	}
>> +}
>
> I don't think it is right to walk all address spaces.  The good news is

Then we cannot track the page in SMM mode, but I think that is not a big
problem, as SMM is invisible to the OS (it is not expected to hurt the OS)
and the current shadow pages in the normal address space cannot reflect
changes that happened in SMM either. So it is okay to only take the normal
address space into account.

> that you're not using kvm_page_track_{add,remove}_page at all as far as
> I can see, so you can just remove them.

kvm_page_track_{add,remove}_page, which hide the MMU specifics (e.g. slot,
mmu-lock, etc.) and are exported as APIs for other users, are just small
wrappers around kvm_slot_page_track_{add,remove}_page_nolock; the _nolock
functions are used in a later patch.

If you think it is not a good time to export these APIs, I am okay with
exporting only the _nolock functions in the next version.

>
> Also, when you will need it, I think it's better to move the
> spin_lock/spin_unlock pair outside the for loop.  With this change,
> perhaps it's better to leave it to the caller completely---but I cannot
> say until I see the caller.

I will remove page tracking for the SMM address space, so there is no loop
in the next version. ;)

>
> In the meanwhile, please leave out _nolock from the other functions' name.

I just wanted to warn the user that these functions are not safe as they
are not protected by mmu-lock. I will remove these hints if you dislike them.

Xiao Guangrong Feb. 23, 2016, 4:18 a.m. UTC | #4
On 02/19/2016 07:37 PM, Paolo Bonzini wrote:
>
>
> On 14/02/2016 12:31, Xiao Guangrong wrote:
>> +static bool check_mode(enum kvm_page_track_mode mode)
>> +{
>> +	if (mode < 0 || mode >= KVM_PAGE_TRACK_MAX)
>> +		return false;
>> +
>> +	return true;
>> +}
>
> Oops, forgot about this; please rename to page_track_mode_is_valid and
> make it "static inline".

Sure, it looks good to me. :)
Paolo Bonzini Feb. 23, 2016, 2:15 p.m. UTC | #5
On 23/02/2016 05:18, Xiao Guangrong wrote:
> 
> 
> On 02/19/2016 07:37 PM, Paolo Bonzini wrote:
>>
>>
>> On 14/02/2016 12:31, Xiao Guangrong wrote:
>>> +    /* does tracking count wrap? */
>>> +    WARN_ON((count > 0) && (val + count < val));
>>
>> This doesn't work, because "val + count" is an int.
> 
> val is 'unsigned short' and count is 'short', so
> 'val + count' is not an int...

Actually, it is.  "val" and "count" are both promoted to int, and the
result of "val + count" is an int!


>>> +void kvm_page_track_add_page(struct kvm *kvm, gfn_t gfn,
>>> +                 enum kvm_page_track_mode mode)
>>> +{
>>> +    struct kvm_memslots *slots;
>>> +    struct kvm_memory_slot *slot;
>>> +    int i;
>>> +
>>> +    for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>>> +        slots = __kvm_memslots(kvm, i);
>>> +
>>> +        slot = __gfn_to_memslot(slots, gfn);
>>> +        if (!slot)
>>> +            continue;
>>> +
>>> +        spin_lock(&kvm->mmu_lock);
>>> +        kvm_slot_page_track_add_page_nolock(kvm, slot, gfn, mode);
>>> +        spin_unlock(&kvm->mmu_lock);
>>> +    }
>>> +}
>>
>> I don't think it is right to walk all address spaces.  The good news is
> 
> Then we cannot track the page in SMM mode, but I think that is not a big
> problem, as SMM is invisible to the OS (it is not expected to hurt the OS)
> and the current shadow pages in the normal address space cannot reflect
> changes that happened in SMM either. So it is okay to only take the normal
> address space into account.

I think which address space to track depends on the scenario where
you're using page tracking.  For example, in the shadow case you only
track either SMM or non-SMM depending on the CPU's mode.

For KVM-GT you probably need to track only non-SMM.

>> that you're not using kvm_page_track_{add,remove}_page at all as far as
>> I can see, so you can just remove them.
> 
> kvm_page_track_{add,remove}_page, which hide the MMU specifics (e.g. slot,
> mmu-lock, etc.) and are exported as APIs for other users, are just small
> wrappers around kvm_slot_page_track_{add,remove}_page_nolock; the _nolock
> functions are used in a later patch.
> 
> If you think it is not a good time to export these APIs, I am okay with
> exporting only the _nolock functions in the next version.

Yes, please.

>> Also, when you will need it, I think it's better to move the
>> spin_lock/spin_unlock pair outside the for loop.  With this change,
>> perhaps it's better to leave it to the caller completely---but I cannot
>> say until I see the caller.
> 
> I will remove page tracking for the SMM address space, so there is no loop
> in the next version. ;)

Instead please just remove the functions completely.  Functions without
a caller are unnecessary.

>> In the meanwhile, please leave out _nolock from the other functions'
>> name.
> 
> I just wanted to warn the user that these functions are not safe as they
> are not protected by mmu-lock. I will remove these hints if you dislike
> them.

I think there are several other functions that are not protected by
mmu-lock.  You can instead add kerneldoc comments and mention the
locking requirements there.
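
For example, something along these lines (the wording is just a suggestion):

    /**
     * kvm_slot_page_track_add_page - add a guest page to the tracking pool
     * @kvm:  the guest instance we are interested in
     * @slot: the memory slot that contains @gfn
     * @gfn:  the guest page
     * @mode: tracking mode, currently only write track is supported
     *
     * The caller must hold kvm->mmu_lock.
     */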

The common convention in the kernel is to have function() take the lock
and call __function().  In this case, however, there is no "locked" function
yet; if it comes later, we will rename the functions to add "__" in front.
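
i.e. the usual shape is (with made-up names):

    void __foo(struct kvm *kvm);    /* caller must hold kvm->mmu_lock */

    void foo(struct kvm *kvm)
    {
        spin_lock(&kvm->mmu_lock);
        __foo(kvm);
        spin_unlock(&kvm->mmu_lock);
    }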

Paolo

Patch

diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index 55200406..c010124 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -10,4 +10,17 @@  void kvm_page_track_free_memslot(struct kvm_memory_slot *free,
 				 struct kvm_memory_slot *dont);
 int kvm_page_track_create_memslot(struct kvm_memory_slot *slot,
 				  unsigned long npages);
+
+void
+kvm_slot_page_track_add_page_nolock(struct kvm *kvm,
+				    struct kvm_memory_slot *slot, gfn_t gfn,
+				    enum kvm_page_track_mode mode);
+void kvm_page_track_add_page(struct kvm *kvm, gfn_t gfn,
+			     enum kvm_page_track_mode mode);
+void kvm_slot_page_track_remove_page_nolock(struct kvm *kvm,
+					    struct kvm_memory_slot *slot,
+					    gfn_t gfn,
+					    enum kvm_page_track_mode mode);
+void kvm_page_track_remove_page(struct kvm *kvm, gfn_t gfn,
+				enum kvm_page_track_mode mode);
 #endif
diff --git a/arch/x86/kvm/page_track.c b/arch/x86/kvm/page_track.c
index 8c396d0..e17efe9 100644
--- a/arch/x86/kvm/page_track.c
+++ b/arch/x86/kvm/page_track.c
@@ -50,3 +50,127 @@  track_free:
 	kvm_page_track_free_memslot(slot, NULL);
 	return -ENOMEM;
 }
+
+static bool check_mode(enum kvm_page_track_mode mode)
+{
+	if (mode < 0 || mode >= KVM_PAGE_TRACK_MAX)
+		return false;
+
+	return true;
+}
+
+static void update_gfn_track(struct kvm_memory_slot *slot, gfn_t gfn,
+			     enum kvm_page_track_mode mode, short count)
+{
+	int index;
+	unsigned short val;
+
+	index = gfn_to_index(gfn, slot->base_gfn, PT_PAGE_TABLE_LEVEL);
+
+	val = slot->arch.gfn_track[mode][index];
+
+	/* does tracking count wrap? */
+	WARN_ON((count > 0) && (val + count < val));
+	/* the last tracker has already gone? */
+	WARN_ON((count < 0) && (val < !count));
+
+	slot->arch.gfn_track[mode][index] += count;
+}
+
+void
+kvm_slot_page_track_add_page_nolock(struct kvm *kvm,
+				    struct kvm_memory_slot *slot, gfn_t gfn,
+				    enum kvm_page_track_mode mode)
+{
+
+	WARN_ON(!check_mode(mode));
+
+	update_gfn_track(slot, gfn, mode, 1);
+
+	/*
+	 * new track stops large page mapping for the
+	 * tracked page.
+	 */
+	kvm_mmu_gfn_disallow_lpage(slot, gfn);
+
+	if (mode == KVM_PAGE_TRACK_WRITE)
+		if (kvm_mmu_slot_gfn_write_protect(kvm, slot, gfn))
+			kvm_flush_remote_tlbs(kvm);
+}
+
+/*
+ * add guest page to the tracking pool so that corresponding access on that
+ * page will be intercepted.
+ *
+ * It should be called under the protection of kvm->srcu or kvm->slots_lock
+ *
+ * @kvm: the guest instance we are interested in.
+ * @gfn: the guest page.
+ * @mode: tracking mode, currently only write track is supported.
+ */
+void kvm_page_track_add_page(struct kvm *kvm, gfn_t gfn,
+			     enum kvm_page_track_mode mode)
+{
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *slot;
+	int i;
+
+	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
+		slots = __kvm_memslots(kvm, i);
+
+		slot = __gfn_to_memslot(slots, gfn);
+		if (!slot)
+			continue;
+
+		spin_lock(&kvm->mmu_lock);
+		kvm_slot_page_track_add_page_nolock(kvm, slot, gfn, mode);
+		spin_unlock(&kvm->mmu_lock);
+	}
+}
+
+void kvm_slot_page_track_remove_page_nolock(struct kvm *kvm,
+					    struct kvm_memory_slot *slot,
+					    gfn_t gfn,
+					    enum kvm_page_track_mode mode)
+{
+	WARN_ON(!check_mode(mode));
+
+	update_gfn_track(slot, gfn, mode, -1);
+
+	/*
+	 * allow large page mapping for the tracked page
+	 * after the tracker is gone.
+	 */
+	kvm_mmu_gfn_allow_lpage(slot, gfn);
+}
+
+/*
+ * remove the guest page from the tracking pool which stops the interception
+ * of corresponding access on that page. It is the opposed operation of
+ * kvm_page_track_add_page().
+ *
+ * It should be called under the protection of kvm->srcu or kvm->slots_lock
+ *
+ * @kvm: the guest instance we are interested in.
+ * @gfn: the guest page.
+ * @mode: tracking mode, currently only write track is supported.
+ */
+void kvm_page_track_remove_page(struct kvm *kvm, gfn_t gfn,
+				enum kvm_page_track_mode mode)
+{
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *slot;
+	int i;
+
+	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
+		slots = __kvm_memslots(kvm, i);
+
+		slot = __gfn_to_memslot(slots, gfn);
+		if (!slot)
+			continue;
+
+		spin_lock(&kvm->mmu_lock);
+		kvm_slot_page_track_remove_page_nolock(kvm, slot, gfn, mode);
+		spin_unlock(&kvm->mmu_lock);
+	}
+}