
KVM: use slowpath for cross page cached accesses

Message ID 20150408121648.GA3519@potion.brq.redhat.com (mailing list archive)
State New, archived

Commit Message

Radim Krčmář April 8, 2015, 12:16 p.m. UTC
2015-04-08 12:43+0200, Paolo Bonzini:
> On 08/04/2015 11:26, Radim Krčmář wrote:
>> 2015-04-08 10:49+0200, Paolo Bonzini:
>>> On 07/04/2015 22:34, Radim Krčmář wrote:
>>>> We dirtied only one page because writes originally couldn't span more.
>>>> Use improved syntax for '>> PAGE_SHIFT' while at it.
>>>>
>>>> Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
>>>> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
>>>
>>> Cross-page reads and writes should never get here; they have
>>> ghc->memslot set to NULL and go through the slow path in kvm_write_guest.
>> 
>> Only cross-memslot writes have NULL memslot.
> 
> The power of wrong comments...
> 
> Considering how kvm_gfn_to_hva_cache_init is used (one 1-byte field, two
> 4-byte fields, one 28-byte struct that is 32-byte aligned, and one
> 32-byte field that is in practice cacheline-aligned), I wonder if we
> should just set ghc->memslot = NULL for cross-page writes.  This would
> bypass the bug you are fixing here and avoid worries about partial writes.

Good idea, and it could make those comments right :)
(Though in general, I prefer fewer constraints on APIs ...)

Partial writes would be a pain; the copy_to_user API does not define
which bytes were not written.  I think a write can't fail mid-page, which
makes our implementation OK, but I still worry a bit about it.
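
For context, the cached-write path this lands in looked roughly like the
following at the time (paraphrased from virt/kvm/kvm_main.c; a sketch for
orientation, not the verbatim source):

int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
                           void *data, unsigned long len)
{
        struct kvm_memslots *slots = kvm_memslots(kvm);
        int r;

        /* Re-validate the cache if the memslots changed under us. */
        if (slots->generation != ghc->generation)
                kvm_gfn_to_hva_cache_init(kvm, ghc, ghc->gpa, ghc->len);

        /* A NULL memslot means init refused to cache a usable hva;
         * fall back to the slow path, which walks page by page. */
        if (unlikely(!ghc->memslot))
                return kvm_write_guest(kvm, ghc->gpa, data, len);

        if (kvm_is_error_hva(ghc->hva))
                return -EFAULT;

        r = __copy_to_user((void __user *)ghc->hva, data, len);
        if (r)
                return -EFAULT;

        /* The fast path dirties only the one cached page. */
        mark_page_dirty_in_slot(ghc->memslot, ghc->gpa >> PAGE_SHIFT);

        return 0;
}

Setting ghc->memslot = NULL at init time therefore routes every cached
write for that region through kvm_write_guest(), which already handles
page crossings and dirties each page it touches.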

Anyway, here's the patch:

---8<---
kvm_write_guest_cached() does not mark all written pages as dirty, and
code comments in kvm_gfn_to_hva_cache_init() talk about a NULL memslot
with cross-page accesses.  Fix both the easy way.

The check is '<= 1' so that a 'len = 0' cache gives the same result
anywhere in the page.  (nr_pages_needed is 0 on a page boundary.)

Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 virt/kvm/kvm_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
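
To see why '<= 1' is the right comparison, here is a small standalone
model of the gfn arithmetic (PAGE_SHIFT and the test values are assumed
for illustration; the kernel derives nr_pages_needed from start_gfn and
end_gfn in the same way):

#include <stdio.h>

#define PAGE_SHIFT 12   /* assume 4 KiB pages, as on x86 */

typedef unsigned long long u64;

/* Mirrors the computation in kvm_gfn_to_hva_cache_init(). */
static int nr_pages_needed(u64 gpa, unsigned long len)
{
        u64 start_gfn = gpa >> PAGE_SHIFT;
        u64 end_gfn = (gpa + len - 1) >> PAGE_SHIFT;

        return (int)(end_gfn - start_gfn + 1);
}

int main(void)
{
        printf("%d\n", nr_pages_needed(0x1000, 8));   /* 1: one page        */
        printf("%d\n", nr_pages_needed(0x1ffc, 8));   /* 2: crosses a page  */
        printf("%d\n", nr_pages_needed(0x1ff0, 0));   /* 1: len 0, mid-page */
        printf("%d\n", nr_pages_needed(0x2000, 0));   /* 0: len 0, boundary */
        return 0;
}

Both len = 0 cases pass 'nr_pages_needed <= 1', so a zero-length cache
behaves the same wherever it sits in the page, while anything that
actually spans two pages fails the check and takes the slow path.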


Comments

Paolo Bonzini April 8, 2015, 12:23 p.m. UTC | #1
On 08/04/2015 14:16, Radim Krčmář wrote:
> 2015-04-08 12:43+0200, Paolo Bonzini:
>> On 08/04/2015 11:26, Radim Krčmář wrote:
>>> Only cross-memslot writes have NULL memslot.
>>
>> The power of wrong comments...
>>
>> Considering how kvm_gfn_to_hva_cache_init is used (one 1-byte field, two
>> 4-byte fields, one 28-byte struct that is 32-byte aligned, and one
>> 32-byte field that is in practice cacheline-aligned), I wonder if we
>> should just set ghc->memslot = NULL for cross-page writes.  This would
>> bypass the bug you are fixing here and avoid worries about partial writes.
> 
> Good idea, and it could make those comments right :)
> (Though in general, I prefer fewer constraints on APIs ...)

It doesn't add constraints; it still handles cross-page writes right
(just slower).  copy_to_user is, in some sense, the API that constrains
us to do this.

> Partial writes would be a pain; the copy_to_user API does not define
> which bytes were not written.  I think a write can't fail mid-page, which
> makes our implementation OK

Right, writes can't fail mid-page (I guess in atomic context it's
theoretically possible, but we're equipped to handle the failure in that
case).

Patch applied, thanks!

Paolo
Wanpeng Li April 9, 2015, 12:18 a.m. UTC | #2
On Wed, Apr 08, 2015 at 02:16:48PM +0200, Radim Krčmář wrote:
>2015-04-08 12:43+0200, Paolo Bonzini:
>> On 08/04/2015 11:26, Radim Krčmář wrote:
>>> 2015-04-08 10:49+0200, Paolo Bonzini:
>>>> On 07/04/2015 22:34, Radim Krčmář wrote:
>>>>> We dirtied only one page because writes originally couldn't span more.
>>>>> Use improved syntax for '>> PAGE_SHIFT' while at it.
>>>>>
>>>>> Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
>>>>> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
>>>>
>>>> Cross-page reads and writes should never get here; they have
>>>> ghc->memslot set to NULL and go through the slow path in kvm_write_guest.
>>> 
>>> Only cross-memslot writes have NULL memslot.
>> 
>> The power of wrong comments...
>> 
>> Considering how kvm_gfn_to_hva_cache_init is used (one 1-byte field, two
>> 4-byte fields, one 28-byte struct that is 32-byte aligned, and one
>> 32-byte field that is in practice cacheline-aligned), I wonder if we
>> should just set ghc->memslot = NULL for cross-page writes.  This would
>> bypass the bug you are fixing here and avoid worries about partial writes.
>
>Good idea, and it could make those comments right :)
>(Though in general, I prefer fewer constraints on APIs ...)
>
>Partial writes would be a pain; the copy_to_user API does not define
>which bytes were not written.  I think a write can't fail mid-page, which
>makes our implementation OK, but I still worry a bit about it.
>
>Anyway, here's the patch:
>
>---8<---
>kvm_write_guest_cached() does not mark all written pages as dirty, and
>code comments in kvm_gfn_to_hva_cache_init() talk about a NULL memslot
>with cross-page accesses.  Fix both the easy way.
>
>The check is '<= 1' so that a 'len = 0' cache gives the same result
>anywhere in the page.  (nr_pages_needed is 0 on a page boundary.)
>
>Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
>Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>

>---
> virt/kvm/kvm_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>index aadef264bed1..f3dc641f9640 100644
>--- a/virt/kvm/kvm_main.c
>+++ b/virt/kvm/kvm_main.c
>@@ -1637,8 +1637,8 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
> 	ghc->generation = slots->generation;
> 	ghc->len = len;
> 	ghc->memslot = gfn_to_memslot(kvm, start_gfn);
>-	ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, &nr_pages_avail);
>-	if (!kvm_is_error_hva(ghc->hva) && nr_pages_avail >= nr_pages_needed) {
>+	ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, NULL);
>+	if (!kvm_is_error_hva(ghc->hva) && nr_pages_needed <= 1) {
> 		ghc->hva += offset;
> 	} else {
> 		/*

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aadef264bed1..f3dc641f9640 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1637,8 +1637,8 @@  int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 	ghc->generation = slots->generation;
 	ghc->len = len;
 	ghc->memslot = gfn_to_memslot(kvm, start_gfn);
-	ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, &nr_pages_avail);
-	if (!kvm_is_error_hva(ghc->hva) && nr_pages_avail >= nr_pages_needed) {
+	ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, NULL);
+	if (!kvm_is_error_hva(ghc->hva) && nr_pages_needed <= 1) {
 		ghc->hva += offset;
 	} else {
 		/*
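
For readers without the tree at hand, the hunk above lands in a function
that, with the patch applied, looks roughly like this (a reconstructed
sketch of kvm_gfn_to_hva_cache_init(), not the verbatim file):

int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
                              gpa_t gpa, unsigned long len)
{
        struct kvm_memslots *slots = kvm_memslots(kvm);
        int offset = offset_in_page(gpa);
        gfn_t start_gfn = gpa >> PAGE_SHIFT;
        gfn_t end_gfn = (gpa + len - 1) >> PAGE_SHIFT;
        gfn_t nr_pages_needed = end_gfn - start_gfn + 1;
        gfn_t nr_pages_avail;

        ghc->gpa = gpa;
        ghc->generation = slots->generation;
        ghc->len = len;
        ghc->memslot = gfn_to_memslot(kvm, start_gfn);
        ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, NULL);
        if (!kvm_is_error_hva(ghc->hva) && nr_pages_needed <= 1) {
                /* Single-page cache: the fast path may use ghc->hva. */
                ghc->hva += offset;
        } else {
                /*
                 * Cross-page (or cross-memslot) region: still verify
                 * that the entire range is valid, but leave memslot
                 * NULL so every access takes the slow path.
                 */
                while (start_gfn <= end_gfn) {
                        ghc->memslot = gfn_to_memslot(kvm, start_gfn);
                        ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn,
                                                   &nr_pages_avail);
                        if (kvm_is_error_hva(ghc->hva))
                                return -EFAULT;
                        start_gfn += nr_pages_avail;
                }
                ghc->memslot = NULL;
        }
        return 0;
}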