
KVM: MMU: improve n_max_mmu_pages calculation with TDP

Message ID 20130320201420.GA17347@amt.cnet (mailing list archive)
State New, archived

Commit Message

Marcelo Tosatti March 20, 2013, 8:14 p.m. UTC
kvm_mmu_calculate_mmu_pages currently computes

maximum number of shadow pages = 2% of mapped guest pages

This does not make sense for TDP guests, where mapping all of guest
memory with 4k pages cannot take more than "mapped guest pages / 512"
shadow pages (not counting root pages).

Allow that maximum for TDP, forcing the guest to recycle otherwise.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
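
For concreteness, here is a standalone sketch of what the two formulas
yield. It is an illustration only, not part of the patch, assuming a
hypothetical 4 GB guest (1,048,576 4k pages) and KVM_PERMILLE_MMU_PAGES
at its value of 20, i.e. the 2% rule:

#include <stdio.h>

int main(void)
{
	/* assumed example: a 4 GB guest mapped entirely with 4k pages */
	unsigned int nr_pages = 1048576;
	unsigned int nr_mmu_pages = 1;	/* one root page */
	unsigned int i;

	/* nr_pages / (512^i) page tables per level, rounded with slack,
	 * mirroring the loop in the patch */
	for (i = 1; i < 4; i++) {
		unsigned int nr_pages_round = nr_pages + (1u << (9 * i));
		nr_mmu_pages += nr_pages_round >> (9 * i);
	}

	printf("TDP cap:    %u pages\n", nr_mmu_pages);		/* 2056 */
	printf("2%% default: %u pages\n", nr_pages * 20 / 1000);	/* 20971 */
	return 0;
}

For such a guest the TDP-aware cap (2056 pages) is roughly an order of
magnitude below the 2% default (20971 pages).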


Comments

Xiao Guangrong March 21, 2013, 5:41 a.m. UTC | #1
On 03/21/2013 04:14 AM, Marcelo Tosatti wrote:
> 
> kvm_mmu_calculate_mmu_pages numbers, 
> 
> maximum number of shadow pages = 2% of mapped guest pages
> 
> Does not make sense for TDP guests where mapping all of guest
> memory with 4k pages cannot exceed "mapped guest pages / 512"
> (not counting root pages).
> 
> Allow that maximum for TDP, forcing the guest to recycle otherwise.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 956ca35..a9694a8d7 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4293,7 +4293,7 @@ nomem:
>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>  {
>  	unsigned int nr_mmu_pages;
> -	unsigned int  nr_pages = 0;
> +	unsigned int i, nr_pages = 0;
>  	struct kvm_memslots *slots;
>  	struct kvm_memory_slot *memslot;
> 
> @@ -4302,7 +4302,19 @@ unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>  	kvm_for_each_memslot(memslot, slots)
>  		nr_pages += memslot->npages;
> 
> -	nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
> +	if (tdp_enabled) {
> +		/* one root page */
> +		nr_mmu_pages = 1;
> +		/* nr_pages / (512^i) per level, due to
> +		 * guest RAM map being linear */
> +		for (i = 1; i < 4; i++) {
> +			int nr_pages_round = nr_pages + (1 << (9*i));
> +			nr_mmu_pages += nr_pages_round >> (9*i);
> +		}

Marcelo,

Can it work if a nested guest is used? Did you see any problem in practice
(a direct guest using more memory than your calculation)?

And MMIO can also build some page tables, which do not look like they are
considered in this patch.


Marcelo Tosatti March 21, 2013, 2:29 p.m. UTC | #2
On Thu, Mar 21, 2013 at 01:41:59PM +0800, Xiao Guangrong wrote:
> On 03/21/2013 04:14 AM, Marcelo Tosatti wrote:
> > 
> > kvm_mmu_calculate_mmu_pages numbers, 
> > 
> > maximum number of shadow pages = 2% of mapped guest pages
> > 
> > Does not make sense for TDP guests where mapping all of guest
> > memory with 4k pages cannot exceed "mapped guest pages / 512"
> > (not counting root pages).
> > 
> > Allow that maximum for TDP, forcing the guest to recycle otherwise.
> > 
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > 
> > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > index 956ca35..a9694a8d7 100644
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -4293,7 +4293,7 @@ nomem:
> >  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
> >  {
> >  	unsigned int nr_mmu_pages;
> > -	unsigned int  nr_pages = 0;
> > +	unsigned int i, nr_pages = 0;
> >  	struct kvm_memslots *slots;
> >  	struct kvm_memory_slot *memslot;
> > 
> > @@ -4302,7 +4302,19 @@ unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
> >  	kvm_for_each_memslot(memslot, slots)
> >  		nr_pages += memslot->npages;
> > 
> > -	nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
> > +	if (tdp_enabled) {
> > +		/* one root page */
> > +		nr_mmu_pages = 1;
> > +		/* nr_pages / (512^i) per level, due to
> > +		 * guest RAM map being linear */
> > +		for (i = 1; i < 4; i++) {
> > +			int nr_pages_round = nr_pages + (1 << (9*i));
> > +			nr_mmu_pages += nr_pages_round >> (9*i);
> > +		}
> 
> Marcelo,
> 
> Can it work if nested guest is used? Did you see any problem in practice (direct guest
> uses more memory than your calculation)?

A direct guest can use more than the calculation by switching between
different paging modes.

About nested guests: at any one point in time the working set cannot exceed
the number of physical pages visible to the guest.

Allowing an excessively high number of shadow pages is also a security
concern, as long, unpreemptible operations are necessary to tear down
the pages.

> And mmio also can build some page table that looks like not considered
> in this patch.

Right, but it's only a few pages. Same argument as above: the working set
at any given time is smaller than total RAM. Do you see any potential
problem?

Xiao Guangrong March 22, 2013, 3 a.m. UTC | #3
On 03/21/2013 10:29 PM, Marcelo Tosatti wrote:
> On Thu, Mar 21, 2013 at 01:41:59PM +0800, Xiao Guangrong wrote:
>> On 03/21/2013 04:14 AM, Marcelo Tosatti wrote:
>>>
>>> kvm_mmu_calculate_mmu_pages numbers, 
>>>
>>> maximum number of shadow pages = 2% of mapped guest pages
>>>
>>> Does not make sense for TDP guests where mapping all of guest
>>> memory with 4k pages cannot exceed "mapped guest pages / 512"
>>> (not counting root pages).
>>>
>>> Allow that maximum for TDP, forcing the guest to recycle otherwise.
>>>
>>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index 956ca35..a9694a8d7 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -4293,7 +4293,7 @@ nomem:
>>>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>>>  {
>>>  	unsigned int nr_mmu_pages;
>>> -	unsigned int  nr_pages = 0;
>>> +	unsigned int i, nr_pages = 0;
>>>  	struct kvm_memslots *slots;
>>>  	struct kvm_memory_slot *memslot;
>>>
>>> @@ -4302,7 +4302,19 @@ unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>>>  	kvm_for_each_memslot(memslot, slots)
>>>  		nr_pages += memslot->npages;
>>>
>>> -	nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
>>> +	if (tdp_enabled) {
>>> +		/* one root page */
>>> +		nr_mmu_pages = 1;
>>> +		/* nr_pages / (512^i) per level, due to
>>> +		 * guest RAM map being linear */
>>> +		for (i = 1; i < 4; i++) {
>>> +			int nr_pages_round = nr_pages + (1 << (9*i));
>>> +			nr_mmu_pages += nr_pages_round >> (9*i);
>>> +		}
>>
>> Marcelo,
>>
>> Can it work if nested guest is used? Did you see any problem in practice (direct guest
>> uses more memory than your calculation)?
> 
> Direct guest can use more than the calculation by switching between
> different paging modes.

I mean a guest running on the hard MMU (TDP is used but there is no nested
guest). It only uses one set of page tables and it seems it cannot use more
memory than your calculation (except for some MMIO page tables).

So, is your calculation only needed to limit memory usage for TDP + nested
guests?

> 
> About nested guest: at one point in time the working set cannot exceed 
> the number of physical pages visible by the guest.

But it can cause lots of #PFs, which is a nightmare for performance, no?

> 
> Allowing an excessively high number of shadow pages is a security

Does the security concern mean "optimizing memory usage"? Or something else?

> concern, also, as unpreemptable long operations are necessary to tear
> down the pages.

You mean limiting the shadow pages to let some paths run faster, like
remove-write-access and zap-all-sp, etc.? If yes, we can directly optimize
these paths; I think this is more effective.

> 
>> And mmio also can build some page table that looks like not considered
>> in this patch.
> 
> Right, but its only a few pages. Same argument as above: working set at
> one given time is smaller than total RAM. Do you see any potential
> problem?

Marcelo, I am just confused about whether the limitation is reasonable. As I
said, the limitation is not effective enough for a hard-MMU-only guest (no
nesting), and it seems too low for nested guests.


Marcelo Tosatti March 22, 2013, 10:31 a.m. UTC | #4
On Fri, Mar 22, 2013 at 11:00:28AM +0800, Xiao Guangrong wrote:
> On 03/21/2013 10:29 PM, Marcelo Tosatti wrote:
> > On Thu, Mar 21, 2013 at 01:41:59PM +0800, Xiao Guangrong wrote:
> >> On 03/21/2013 04:14 AM, Marcelo Tosatti wrote:
> >>>
> >>> kvm_mmu_calculate_mmu_pages numbers, 
> >>>
> >>> maximum number of shadow pages = 2% of mapped guest pages
> >>>
> >>> Does not make sense for TDP guests where mapping all of guest
> >>> memory with 4k pages cannot exceed "mapped guest pages / 512"
> >>> (not counting root pages).
> >>>
> >>> Allow that maximum for TDP, forcing the guest to recycle otherwise.
> >>>
> >>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >>>
> >>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >>> index 956ca35..a9694a8d7 100644
> >>> --- a/arch/x86/kvm/mmu.c
> >>> +++ b/arch/x86/kvm/mmu.c
> >>> @@ -4293,7 +4293,7 @@ nomem:
> >>>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
> >>>  {
> >>>  	unsigned int nr_mmu_pages;
> >>> -	unsigned int  nr_pages = 0;
> >>> +	unsigned int i, nr_pages = 0;
> >>>  	struct kvm_memslots *slots;
> >>>  	struct kvm_memory_slot *memslot;
> >>>
> >>> @@ -4302,7 +4302,19 @@ unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
> >>>  	kvm_for_each_memslot(memslot, slots)
> >>>  		nr_pages += memslot->npages;
> >>>
> >>> -	nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
> >>> +	if (tdp_enabled) {
> >>> +		/* one root page */
> >>> +		nr_mmu_pages = 1;
> >>> +		/* nr_pages / (512^i) per level, due to
> >>> +		 * guest RAM map being linear */
> >>> +		for (i = 1; i < 4; i++) {
> >>> +			int nr_pages_round = nr_pages + (1 << (9*i));
> >>> +			nr_mmu_pages += nr_pages_round >> (9*i);
> >>> +		}
> >>
> >> Marcelo,
> >>
> >> Can it work if nested guest is used? Did you see any problem in practice (direct guest
> >> uses more memory than your calculation)?
> > 
> > Direct guest can use more than the calculation by switching between
> > different paging modes.
> 
> I mean guest runs on hardmmu (tdp is used but no nested guest). Its only
> use one page table and seems can not use more memory than your calculation
> (except some mmio page tables).
> 
> So, you calculation is only used to limit memory used if tdp + nested guest?

Yes, you're right: there is no duplication of shadow pages even with
mode switches, so the patch is not needed.

> > About nested guest: at one point in time the working set cannot exceed 
> > the number of physical pages visible by the guest.
> 
> But it can cause lots of #PF, it is the nightmare for performance, no?
> 
> > 
> > Allowing an excessively high number of shadow pages is a security
> 
> The security concern means "optimization memory usage"? Or something else?
> 
> > concern, also, as unpreemptable long operations are necessary to tear
> > down the pages.
> 
> You mean limiting the shadow pages to let some patch run faster like
> remove-write-access and zap-all-sp etc.? If yes, we can directly optimize
> for these paths, this is more effective i think.
> 
> > 
> >> And mmio also can build some page table that looks like not considered
> >> in this patch.
> > 
> > Right, but its only a few pages. Same argument as above: working set at
> > one given time is smaller than total RAM. Do you see any potential
> > problem?
> 
> Marcelo, I just confused whether the limitation is reasonable, as i said,
> the limitation is not effective enough on hardmmu-only guest (no nested).
> and it seems too low for nested guests.
> 

Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 956ca35..a9694a8d7 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4293,7 +4293,7 @@  nomem:
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
 {
 	unsigned int nr_mmu_pages;
-	unsigned int  nr_pages = 0;
+	unsigned int i, nr_pages = 0;
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *memslot;
 
@@ -4302,7 +4302,19 @@  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
 	kvm_for_each_memslot(memslot, slots)
 		nr_pages += memslot->npages;
 
-	nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
+	if (tdp_enabled) {
+		/* one root page */
+		nr_mmu_pages = 1;
+		/* nr_pages / (512^i) per level, due to
+		 * guest RAM map being linear */
+		for (i = 1; i < 4; i++) {
+			int nr_pages_round = nr_pages + (1 << (9*i));
+			nr_mmu_pages += nr_pages_round >> (9*i);
+		}
+	} else {
+		nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
+	}
+
 	nr_mmu_pages = max(nr_mmu_pages,
 			(unsigned int) KVM_MIN_ALLOC_MMU_PAGES);
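
One note on the rounding term in the loop: adding 1 << (9*i) before
shifting makes each per-level term equal to floor(nr_pages / 512^i) + 1,
that is, at least the ceil(nr_pages / 512^i) tables actually needed, with
at most one spare per level. A minimal standalone check of that identity
(an illustration, not kernel code):

#include <assert.h>
#include <stdio.h>

int main(void)
{
	unsigned long nr_pages;
	unsigned int i;

	for (nr_pages = 0; nr_pages < (1ul << 30); nr_pages += 12345)
		for (i = 1; i < 4; i++) {
			/* (n + 2^b) >> b == (n >> b) + 1 for any n */
			unsigned long term =
				(nr_pages + (1ul << (9 * i))) >> (9 * i);
			assert(term == (nr_pages >> (9 * i)) + 1);
		}

	puts("per-level term == floor(nr_pages / 512^i) + 1 for all tested sizes");
	return 0;
}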