
kvm, mm: account kvm related kmem slabs to kmemcg

Message ID 20171006010724.186563-1-shakeelb@google.com (mailing list archive)
State New, archived

Commit Message

Shakeel Butt Oct. 6, 2017, 1:07 a.m. UTC
The kvm slabs can consume a significant amount of system memory;
indeed, in our production environment we have observed many machines
spending a significant amount of memory on them, memory that cannot
simply be written off as system memory overhead. Also, allocations
from these slabs can be triggered directly by user space applications
that have access to kvm, so a buggy application can leak such
memory. These caches should therefore be accounted to kmemcg.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
 arch/x86/kvm/mmu.c  | 4 ++--
 virt/kvm/kvm_main.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)
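
For illustration only, and not part of the patch: a minimal userspace sketch
of how a process with access to /dev/kvm ends up allocating from the caches
converted here (e.g. kvm_vcpu_cache when a vCPU is created). The file name
and error handling are placeholders.

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm, vm, vcpu;

	kvm = open("/dev/kvm", O_RDWR);
	if (kvm < 0) {
		perror("open /dev/kvm");
		return 1;
	}

	/* Creating a VM and a vCPU allocates kernel objects from the
	 * kvm-related caches, e.g. kvm_vcpu_cache for the vCPU. */
	vm = ioctl(kvm, KVM_CREATE_VM, 0);
	if (vm < 0) {
		perror("KVM_CREATE_VM");
		return 1;
	}

	vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
	if (vcpu < 0) {
		perror("KVM_CREATE_VCPU");
		return 1;
	}

	/* With SLAB_ACCOUNT set on these caches, the objects above are
	 * charged to this process's memory cgroup rather than piling up
	 * as unaccounted system memory. */
	close(vcpu);
	close(vm);
	close(kvm);
	return 0;
}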

Comments

Anshuman Khandual Oct. 6, 2017, 4:28 a.m. UTC | #1
On 10/06/2017 06:37 AM, Shakeel Butt wrote:
> The kvm slabs can consume a significant amount of system memory;
> indeed, in our production environment we have observed many machines
> spending a significant amount of memory on them, memory that cannot
> simply be written off as system memory overhead. Also, allocations
> from these slabs can be triggered directly by user space applications
> that have access to kvm, so a buggy application can leak such
> memory. These caches should therefore be accounted to kmemcg.

But there may be other situations like this where user space can
trigger allocations from various SLAB objects inside the kernel
which are accounted as system memory. So how do we draw the line
on which ones should be accounted to memcg? Just being curious.
Shakeel Butt Oct. 6, 2017, 6:40 a.m. UTC | #2
On Thu, Oct 5, 2017 at 9:28 PM, Anshuman Khandual
<khandual@linux.vnet.ibm.com> wrote:
> On 10/06/2017 06:37 AM, Shakeel Butt wrote:
>> The kvm slabs can consume a significant amount of system memory;
>> indeed, in our production environment we have observed many machines
>> spending a significant amount of memory on them, memory that cannot
>> simply be written off as system memory overhead. Also, allocations
>> from these slabs can be triggered directly by user space applications
>> that have access to kvm, so a buggy application can leak such
>> memory. These caches should therefore be accounted to kmemcg.
>
> But there may be other situations like this where user space can
> trigger allocations from various SLAB objects inside the kernel
> which are accounted as system memory. So how do we draw the line
> on which ones should be accounted to memcg? Just being curious.
>
Yes, there are indeed other slabs where user space can trigger
allocations. IMO selecting which kmem caches to account is kind of
a workload- and user-specific decision. The ones I am converting were
selected based on data gathered from our production environment.
However, I think they would be useful in general.
Michal Hocko Oct. 6, 2017, 7:52 a.m. UTC | #3
On Fri 06-10-17 09:58:30, Anshuman Khandual wrote:
> On 10/06/2017 06:37 AM, Shakeel Butt wrote:
> > The kvm slabs can consume a significant amount of system memory;
> > indeed, in our production environment we have observed many machines
> > spending a significant amount of memory on them, memory that cannot
> > simply be written off as system memory overhead. Also, allocations
> > from these slabs can be triggered directly by user space applications
> > that have access to kvm, so a buggy application can leak such
> > memory. These caches should therefore be accounted to kmemcg.
> 
> But there may be other situations like this where user space can
> trigger allocations from various SLAB objects inside the kernel
> which are accounted as system memory. So how do we draw the line
> on which ones should be accounted to memcg? Just being curious.

The thing is that we used to have an opt-out approach for kmem
accounting, but we decided to go opt-in with a9bb7e620efd ("memcg: only
account kmem allocations marked as __GFP_ACCOUNT").

Since then we have been adding the flag to caches/allocations which can
go wild and consume a lot of, or even an unbounded amount of, memory.
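
For reference, a rough sketch (not taken from the patch; the cache name and
struct are made up) of the two opt-in paths to kmemcg accounting that exist
since that commit: marking a whole cache with SLAB_ACCOUNT, or marking an
individual allocation with __GFP_ACCOUNT via GFP_KERNEL_ACCOUNT.

#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/slab.h>

struct example { int data; };

static struct kmem_cache *example_cache;

static int example_init(void)
{
	struct example *obj;
	void *buf;

	/* Per-cache opt-in: every object allocated from this cache is
	 * charged to the allocating task's memcg. This is what the patch
	 * does for the kvm caches. */
	example_cache = kmem_cache_create("example_cache",
					  sizeof(struct example),
					  0, SLAB_ACCOUNT, NULL);
	if (!example_cache)
		return -ENOMEM;

	obj = kmem_cache_alloc(example_cache, GFP_KERNEL);

	/* Per-allocation opt-in: only this particular allocation is
	 * accounted; GFP_KERNEL_ACCOUNT is GFP_KERNEL | __GFP_ACCOUNT. */
	buf = kmalloc(4096, GFP_KERNEL_ACCOUNT);

	kfree(buf);
	if (obj)
		kmem_cache_free(example_cache, obj);
	kmem_cache_destroy(example_cache);
	return 0;
}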
Paolo Bonzini Oct. 6, 2017, 8:55 a.m. UTC | #4
On 06/10/2017 03:07, Shakeel Butt wrote:
> The kvm slabs can consume a significant amount of system memory;
> indeed, in our production environment we have observed many machines
> spending a significant amount of memory on them, memory that cannot
> simply be written off as system memory overhead. Also, allocations
> from these slabs can be triggered directly by user space applications
> that have access to kvm, so a buggy application can leak such
> memory. These caches should therefore be accounted to kmemcg.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
>  arch/x86/kvm/mmu.c  | 4 ++--
>  virt/kvm/kvm_main.c | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index eca30c1eb1d9..87c5db9e644d 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5475,13 +5475,13 @@ int kvm_mmu_module_init(void)
>  
>  	pte_list_desc_cache = kmem_cache_create("pte_list_desc",
>  					    sizeof(struct pte_list_desc),
> -					    0, 0, NULL);
> +					    0, SLAB_ACCOUNT, NULL);
>  	if (!pte_list_desc_cache)
>  		goto nomem;
>  
>  	mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header",
>  						  sizeof(struct kvm_mmu_page),
> -						  0, 0, NULL);
> +						  0, SLAB_ACCOUNT, NULL);
>  	if (!mmu_page_header_cache)
>  		goto nomem;
>  
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 9deb5a245b83..3d73299e05f2 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4010,7 +4010,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  	if (!vcpu_align)
>  		vcpu_align = __alignof__(struct kvm_vcpu);
>  	kvm_vcpu_cache = kmem_cache_create("kvm_vcpu", vcpu_size, vcpu_align,
> -					   0, NULL);
> +					   SLAB_ACCOUNT, NULL);
>  	if (!kvm_vcpu_cache) {
>  		r = -ENOMEM;
>  		goto out_free_3;
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Adding maintainers for other architectures, because they probably want
to do something similar.

Paolo
Paolo Bonzini Oct. 10, 2017, 8:32 a.m. UTC | #5
On 06/10/2017 03:07, Shakeel Butt wrote:
> The kvm slabs can consume a significant amount of system memory;
> indeed, in our production environment we have observed many machines
> spending a significant amount of memory on them, memory that cannot
> simply be written off as system memory overhead. Also, allocations
> from these slabs can be triggered directly by user space applications
> that have access to kvm, so a buggy application can leak such
> memory. These caches should therefore be accounted to kmemcg.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
>  arch/x86/kvm/mmu.c  | 4 ++--
>  virt/kvm/kvm_main.c | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index eca30c1eb1d9..87c5db9e644d 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5475,13 +5475,13 @@ int kvm_mmu_module_init(void)
>  
>  	pte_list_desc_cache = kmem_cache_create("pte_list_desc",
>  					    sizeof(struct pte_list_desc),
> -					    0, 0, NULL);
> +					    0, SLAB_ACCOUNT, NULL);
>  	if (!pte_list_desc_cache)
>  		goto nomem;
>  
>  	mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header",
>  						  sizeof(struct kvm_mmu_page),
> -						  0, 0, NULL);
> +						  0, SLAB_ACCOUNT, NULL);
>  	if (!mmu_page_header_cache)
>  		goto nomem;
>  
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 9deb5a245b83..3d73299e05f2 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4010,7 +4010,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  	if (!vcpu_align)
>  		vcpu_align = __alignof__(struct kvm_vcpu);
>  	kvm_vcpu_cache = kmem_cache_create("kvm_vcpu", vcpu_size, vcpu_align,
> -					   0, NULL);
> +					   SLAB_ACCOUNT, NULL);
>  	if (!kvm_vcpu_cache) {
>  		r = -ENOMEM;
>  		goto out_free_3;
> 

Queued, thanks.

Paolo

Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index eca30c1eb1d9..87c5db9e644d 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5475,13 +5475,13 @@  int kvm_mmu_module_init(void)
 
 	pte_list_desc_cache = kmem_cache_create("pte_list_desc",
 					    sizeof(struct pte_list_desc),
-					    0, 0, NULL);
+					    0, SLAB_ACCOUNT, NULL);
 	if (!pte_list_desc_cache)
 		goto nomem;
 
 	mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header",
 						  sizeof(struct kvm_mmu_page),
-						  0, 0, NULL);
+						  0, SLAB_ACCOUNT, NULL);
 	if (!mmu_page_header_cache)
 		goto nomem;
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9deb5a245b83..3d73299e05f2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4010,7 +4010,7 @@  int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	if (!vcpu_align)
 		vcpu_align = __alignof__(struct kvm_vcpu);
 	kvm_vcpu_cache = kmem_cache_create("kvm_vcpu", vcpu_size, vcpu_align,
-					   0, NULL);
+					   SLAB_ACCOUNT, NULL);
 	if (!kvm_vcpu_cache) {
 		r = -ENOMEM;
 		goto out_free_3;