diff mbox

kvm userspace: ksm support

Message ID 20090728193959.49cc28b6@woof.woof (mailing list archive)
State New, archived
Headers show

Commit Message

Izik Eidus July 28, 2009, 4:39 p.m. UTC
This patch is not for inclusion just rfc.

Thanks.


From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
From: Izik Eidus <ieidus@redhat.com>
Date: Tue, 28 Jul 2009 19:14:26 +0300
Subject: [PATCH] kvm userspace: ksm support

rfc for ksm support to kvm userpsace.

thanks

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 exec.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

Comments

Anthony Liguori July 28, 2009, 4:41 p.m. UTC | #1
Izik Eidus wrote:
> This patch is not for inclusion just rfc.
>   

The madvise() interface looks really nice :-)
> Thanks.
>
>
> From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
> From: Izik Eidus <ieidus@redhat.com>
> Date: Tue, 28 Jul 2009 19:14:26 +0300
> Subject: [PATCH] kvm userspace: ksm support
>
> rfc for ksm support to kvm userpsace.
>
> thanks
>
> Signed-off-by: Izik Eidus <ieidus@redhat.com>
> ---
>  exec.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index f6d9ec9..375cc18 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>      new_block->host = file_ram_alloc(size, mem_path);
>      if (!new_block->host) {
>          new_block->host = qemu_vmalloc(size);
> +#ifdef MADV_MERGEABLE
> +        madvise(new_block->host, size, MADV_MERGEABLE);
> +#endif
>   

Are madvise calls additive?

Do we need to change the madvise balloon calls to include MADV_MERGEABLE 
or will this carry the property forever?

I'd suggest doing the following in osdep.h too:

#if !defined(MADV_MERGABLE)
#define MADV_MERGABLE MADV_NORMAL
#endif

To avoid #ifdefs in .c files.

Regards,

Anthony Liguori

>      }
>      new_block->offset = last_ram_offset;
>      new_block->length = size;
>   

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Izik Eidus July 28, 2009, 4:50 p.m. UTC | #2
Anthony Liguori wrote:
> Izik Eidus wrote:
>> This patch is not for inclusion just rfc.
>>   
>
> The madvise() interface looks really nice :-)
>> Thanks.
>>
>>
>> From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
>> From: Izik Eidus <ieidus@redhat.com>
>> Date: Tue, 28 Jul 2009 19:14:26 +0300
>> Subject: [PATCH] kvm userspace: ksm support
>>
>> rfc for ksm support to kvm userpsace.
>>
>> thanks
>>
>> Signed-off-by: Izik Eidus <ieidus@redhat.com>
>> ---
>>  exec.c |    3 +++
>>  1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index f6d9ec9..375cc18 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>>      new_block->host = file_ram_alloc(size, mem_path);
>>      if (!new_block->host) {
>>          new_block->host = qemu_vmalloc(size);
>> +#ifdef MADV_MERGEABLE
>> +        madvise(new_block->host, size, MADV_MERGEABLE);
>> +#endif
>>   
>
> Are madvise calls additive?
>
> Do we need to change the madvise balloon calls to include 
> MADV_MERGEABLE or will this carry the property forever?

You mean: when we later call for other madvise calls, if it will remove 
the MADV_MERGEABLE from that memory?
if yes, the answer is no, it should be still l left in the vma->vm_flags...

>
> I'd suggest doing the following in osdep.h too:
>
> #if !defined(MADV_MERGABLE)
> #define MADV_MERGABLE MADV_NORMAL
> #endif
>
> To avoid #ifdefs in .c files.

I tried to follow the way DONTFORK madvise is working...

So you say, just to throw this thing into osdep.h instead of that c file?

>
> Regards,
>
> Anthony Liguori
>
>>      }
>>      new_block->offset = last_ram_offset;
>>      new_block->length = size;
>>   
>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anthony Liguori July 29, 2009, 1:27 a.m. UTC | #3
Izik Eidus wrote:
>
> You mean: when we later call for other madvise calls, if it will 
> remove the MADV_MERGEABLE from that memory?
> if yes, the answer is no, it should be still l left in the 
> vma->vm_flags...

Excellent.

>>
>> I'd suggest doing the following in osdep.h too:
>>
>> #if !defined(MADV_MERGABLE)
>> #define MADV_MERGABLE MADV_NORMAL
>> #endif
>>
>> To avoid #ifdefs in .c files.
>
> I tried to follow the way DONTFORK madvise is working...
>
> So you say, just to throw this thing into osdep.h instead of that c file?

Yes.

I think the DONTFORK thing is a bit odd.  Of course we have 
MADV_DONTFORK if we're running KVM.  I'm not sure why that is there.

I also think that we could get away with getting rid of any checks for 
!sync_mmu() since that was introduced in 2.6.27.

Otherwise, you should technically avoid doing madvise() unless we have 
sync_mmu().

Regards,

Anthony Liguori
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka July 29, 2009, 8:51 a.m. UTC | #4
Anthony Liguori wrote:
> Izik Eidus wrote:
>>
>> You mean: when we later call for other madvise calls, if it will
>> remove the MADV_MERGEABLE from that memory?
>> if yes, the answer is no, it should be still l left in the
>> vma->vm_flags...
> 
> Excellent.
> 
>>>
>>> I'd suggest doing the following in osdep.h too:
>>>
>>> #if !defined(MADV_MERGABLE)
>>> #define MADV_MERGABLE MADV_NORMAL
>>> #endif
>>>
>>> To avoid #ifdefs in .c files.
>>
>> I tried to follow the way DONTFORK madvise is working...
>>
>> So you say, just to throw this thing into osdep.h instead of that c file?
> 
> Yes.
> 
> I think the DONTFORK thing is a bit odd.  Of course we have
> MADV_DONTFORK if we're running KVM.  I'm not sure why that is there.
> 
> I also think that we could get away with getting rid of any checks for
> !sync_mmu() since that was introduced in 2.6.27.

The problem is that your host kernel also must have CONFIG_MMU_NOTIFIER
enabled - and that's not always the case.

> 
> Otherwise, you should technically avoid doing madvise() unless we have
> sync_mmu().
> 
> Regards,
> 
> Anthony Liguori

Jan
Avi Kivity July 29, 2009, 9:30 a.m. UTC | #5
On 07/29/2009 04:27 AM, Anthony Liguori wrote:
> I also think that we could get away with getting rid of any checks for 
> !sync_mmu() since that was introduced in 2.6.27.

People are running kvm on 2.6.18.
Iggy Jackson Aug. 3, 2009, 6:08 p.m. UTC | #6
If someone wanted to play around with ksm in qemu-kvm-0.x.x would it be as 
simple as adding the below additions to kvm_setup_guest_memory in kvm-all.c 
(and adding the necessary kernel changes of course)?


On Tuesday 28 July 2009 11:39:59 am Izik Eidus wrote:
> This patch is not for inclusion just rfc.
>
> Thanks.
>
>
> From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
> From: Izik Eidus <ieidus@redhat.com>
> Date: Tue, 28 Jul 2009 19:14:26 +0300
> Subject: [PATCH] kvm userspace: ksm support
>
> rfc for ksm support to kvm userpsace.
>
> thanks
>
> Signed-off-by: Izik Eidus <ieidus@redhat.com>
> ---
>  exec.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index f6d9ec9..375cc18 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>      new_block->host = file_ram_alloc(size, mem_path);
>      if (!new_block->host) {
>          new_block->host = qemu_vmalloc(size);
> +#ifdef MADV_MERGEABLE
> +        madvise(new_block->host, size, MADV_MERGEABLE);
> +#endif
>      }
>      new_block->offset = last_ram_offset;
>      new_block->length = size;
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Izik Eidus Aug. 3, 2009, 6:09 p.m. UTC | #7
Brian Jackson wrote:
> If someone wanted to play around with ksm in qemu-kvm-0.x.x would it be as 
> simple as adding the below additions to kvm_setup_guest_memory in kvm-all.c

qemu-kvm-0.x.x doesnt tell me much, but if it is the function that 
register the memory than yes...

(I just remember that qemu used to have something called phys_ram_base, 
in that case it would be just making madvise on phys_ram_base with the 
same of phys_ram_size....)

>  
> (and adding the necessary kernel changes of course)?
>
>
> On Tuesday 28 July 2009 11:39:59 am Izik Eidus wrote:
>   
>> This patch is not for inclusion just rfc.
>>
>> Thanks.
>>
>>
>> From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
>> From: Izik Eidus <ieidus@redhat.com>
>> Date: Tue, 28 Jul 2009 19:14:26 +0300
>> Subject: [PATCH] kvm userspace: ksm support
>>
>> rfc for ksm support to kvm userpsace.
>>
>> thanks
>>
>> Signed-off-by: Izik Eidus <ieidus@redhat.com>
>> ---
>>  exec.c |    3 +++
>>  1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index f6d9ec9..375cc18 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>>      new_block->host = file_ram_alloc(size, mem_path);
>>      if (!new_block->host) {
>>          new_block->host = qemu_vmalloc(size);
>> +#ifdef MADV_MERGEABLE
>> +        madvise(new_block->host, size, MADV_MERGEABLE);
>> +#endif
>>      }
>>      new_block->offset = last_ram_offset;
>>      new_block->length = size;
>>     

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Iggy Jackson Aug. 3, 2009, 6:37 p.m. UTC | #8
On Monday 03 August 2009 01:09:38 pm Izik Eidus wrote:
> Brian Jackson wrote:
> > If someone wanted to play around with ksm in qemu-kvm-0.x.x would it be
> > as simple as adding the below additions to kvm_setup_guest_memory in
> > kvm-all.c
>
> qemu-kvm-0.x.x doesnt tell me much, but if it is the function that
> register the memory than yes...
>
> (I just remember that qemu used to have something called phys_ram_base,
> in that case it would be just making madvise on phys_ram_base with the
> same of phys_ram_size....)

Sorry, I'm using qemu-kvm-0.10.6


This is what qemu_ram_alloc looks like:



/* XXX: better than nothing */
ram_addr_t qemu_ram_alloc(ram_addr_t size)
{
    ram_addr_t addr;
    if ((phys_ram_alloc_offset + size) > phys_ram_size) {
        fprintf(stderr, "Not enough memory (requested_size = %" PRIu64 ", max memory = %" PRIu64 ")\n",
                (uint64_t)size, (uint64_t)phys_ram_size);
        abort();
    }
    addr = phys_ram_alloc_offset;
    phys_ram_alloc_offset = TARGET_PAGE_ALIGN(phys_ram_alloc_offset + size);

    if (kvm_enabled())
        kvm_setup_guest_memory(phys_ram_base + addr, size);

    return addr;
}


And this is what my new kvm_setup_guest_memory looks like:


void kvm_setup_guest_memory(void *start, size_t size)
{
    if (!kvm_has_sync_mmu()) {
#ifdef MADV_DONTFORK
        int ret = madvise(start, size, MADV_DONTFORK);

        if (ret) {
            perror("madvice");
            exit(1);
        }
#else
        fprintf(stderr,
                "Need MADV_DONTFORK in absence of synchronous KVM MMU\n");
        exit(1);
#endif
    }
#ifdef MADV_MERGEABLE
        madvise(start, size, MADV_MERGEABLE);
#endif
}



Look okay?


>
> > (and adding the necessary kernel changes of course)?
> >
> > On Tuesday 28 July 2009 11:39:59 am Izik Eidus wrote:
> >> This patch is not for inclusion just rfc.
> >>
> >> Thanks.
> >>
> >>
> >> From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
> >> From: Izik Eidus <ieidus@redhat.com>
> >> Date: Tue, 28 Jul 2009 19:14:26 +0300
> >> Subject: [PATCH] kvm userspace: ksm support
> >>
> >> rfc for ksm support to kvm userpsace.
> >>
> >> thanks
> >>
> >> Signed-off-by: Izik Eidus <ieidus@redhat.com>
> >> ---
> >>  exec.c |    3 +++
> >>  1 files changed, 3 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/exec.c b/exec.c
> >> index f6d9ec9..375cc18 100644
> >> --- a/exec.c
> >> +++ b/exec.c
> >> @@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
> >>      new_block->host = file_ram_alloc(size, mem_path);
> >>      if (!new_block->host) {
> >>          new_block->host = qemu_vmalloc(size);
> >> +#ifdef MADV_MERGEABLE
> >> +        madvise(new_block->host, size, MADV_MERGEABLE);
> >> +#endif
> >>      }
> >>      new_block->offset = last_ram_offset;
> >>      new_block->length = size;
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Izik Eidus Aug. 3, 2009, 7:04 p.m. UTC | #9
Brian Jackson wrote:
> On Monday 03 August 2009 01:09:38 pm Izik Eidus wrote:
>   
>> Brian Jackson wrote:
>>     
>>> If someone wanted to play around with ksm in qemu-kvm-0.x.x would it be
>>> as simple as adding the below additions to kvm_setup_guest_memory in
>>> kvm-all.c
>>>       
>> qemu-kvm-0.x.x doesnt tell me much, but if it is the function that
>> register the memory than yes...
>>
>> (I just remember that qemu used to have something called phys_ram_base,
>> in that case it would be just making madvise on phys_ram_base with the
>> same of phys_ram_size....)
>>     
>
> Sorry, I'm using qemu-kvm-0.10.6
>
>
> This is what qemu_ram_alloc looks like:
>
>
>
> /* XXX: better than nothing */
> ram_addr_t qemu_ram_alloc(ram_addr_t size)
> {
>     ram_addr_t addr;
>     if ((phys_ram_alloc_offset + size) > phys_ram_size) {
>         fprintf(stderr, "Not enough memory (requested_size = %" PRIu64 ", max memory = %" PRIu64 ")\n",
>                 (uint64_t)size, (uint64_t)phys_ram_size);
>         abort();
>     }
>     addr = phys_ram_alloc_offset;
>     phys_ram_alloc_offset = TARGET_PAGE_ALIGN(phys_ram_alloc_offset + size);
>
>     if (kvm_enabled())
>         kvm_setup_guest_memory(phys_ram_base + addr, size);
>
>     return addr;
> }
>
>
> And this is what my new kvm_setup_guest_memory looks like:
>
>
> void kvm_setup_guest_memory(void *start, size_t size)
> {
>     if (!kvm_has_sync_mmu()) {
> #ifdef MADV_DONTFORK
>         int ret = madvise(start, size, MADV_DONTFORK);
>
>         if (ret) {
>             perror("madvice");
>             exit(1);
>         }
> #else
>         fprintf(stderr,
>                 "Need MADV_DONTFORK in absence of synchronous KVM MMU\n");
>         exit(1);
> #endif
>     }
> #ifdef MADV_MERGEABLE
>         madvise(start, size, MADV_MERGEABLE);
> #endif
> }
>
>
>
> Look okay?
>
>
>   

Yes.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Iggy Jackson Aug. 5, 2009, 6:38 p.m. UTC | #10
On Monday 03 August 2009 02:04:15 pm Izik Eidus wrote:
> Brian Jackson wrote:
> > Look okay?
>
> Yes.


Okay I got it working after I figured out there were 2 
kvm_setup_guest_memory()'s in qemu-kvm

I have debian-5 packages of linux-2.6.31-rc4 with ksm patches and qemu-
kvm-0.10.6 with ksm patches if anyone is interested.

--Iggy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/exec.c b/exec.c
index f6d9ec9..375cc18 100644
--- a/exec.c
+++ b/exec.c
@@ -2595,6 +2595,9 @@  ram_addr_t qemu_ram_alloc(ram_addr_t size)
     new_block->host = file_ram_alloc(size, mem_path);
     if (!new_block->host) {
         new_block->host = qemu_vmalloc(size);
+#ifdef MADV_MERGEABLE
+        madvise(new_block->host, size, MADV_MERGEABLE);
+#endif
     }
     new_block->offset = last_ram_offset;
     new_block->length = size;