[06/17] Use statfs to determine size of huge pages

Message ID 1242574999-20887-7-git-send-email-aliguori@us.ibm.com (mailing list archive)
State New, archived
Commit Message

Anthony Liguori May 17, 2009, 3:43 p.m. UTC
From: Joerg Roedel <joerg.roedel@amd.com>

The current method of finding out the size of huge pages no longer works
reliably. Current Linux supports more than one huge page size, but
/proc/meminfo only shows one of the supported sizes.
The page size actually in use can be found by calling statfs on the
hugetlbfs path. This patch changes qemu to use statfs instead of parsing
/proc/meminfo.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
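
For readers who want to try the statfs approach outside of qemu, here is a minimal standalone sketch. It mirrors the logic of the patch but is not the patch itself: the helper names do_statfs, fs_block_size and is_hugetlbfs are made up for illustration, and unlike the patch it returns -1 on error rather than 0. It assumes Linux, since <sys/vfs.h> and the f_type field are Linux-specific.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/vfs.h>   /* Linux-specific: struct statfs, statfs() */

/* Same magic value the patch defines for hugetlbfs. */
#define HUGETLBFS_MAGIC 0x958458f6

/* Hypothetical helper: call statfs(), retrying if interrupted by a signal.
 * Returns 0 on success, -1 on failure (errno is left set by statfs). */
static int do_statfs(const char *path, struct statfs *fs)
{
    int ret;

    do {
        ret = statfs(path, fs);
    } while (ret != 0 && errno == EINTR);

    return ret;
}

/* Hypothetical helper: block size of the filesystem containing path,
 * or -1 on error. On a hugetlbfs mount this is the huge page size. */
long fs_block_size(const char *path)
{
    struct statfs fs;

    if (do_statfs(path, &fs) != 0)
        return -1;

    return (long)fs.f_bsize;
}

/* Hypothetical helper: 1 if path is on hugetlbfs, 0 if not, -1 on error. */
int is_hugetlbfs(const char *path)
{
    struct statfs fs;

    if (do_statfs(path, &fs) != 0)
        return -1;

    /* Compare as unsigned int so the check does not depend on the
     * width or signedness of f_type on a given architecture. */
    return (unsigned int)fs.f_type == HUGETLBFS_MAGIC ? 1 : 0;
}
```

As a usage example, calling fs_block_size() on a hugetlbfs mount point (for instance /dev/hugepages, an assumed location that varies between systems) would be expected to return the page size that mount was created with, in bytes, rather than the single default size reported by /proc/meminfo.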

Comments

Avi Kivity May 17, 2009, 5 p.m. UTC | #1
Anthony Liguori wrote:
> From: Joerg Roedel <joerg.roedel@amd.com>
>
> The current method of finding out the size of huge pages does not work
> reliably anymore. Current Linux supports more than one huge page size
> but /proc/meminfo only show one of the supported sizes.
> To find out the real page size used can be found by calling statfs. This
> patch changes qemu to use statfs instead of parsing /proc/meminfo.
>   

Since we don't support 1GB pages in stable-0.10, this is unneeded.
Joerg Roedel May 18, 2009, 9:02 a.m. UTC | #2
On Sun, May 17, 2009 at 08:00:40PM +0300, Avi Kivity wrote:
> Anthony Liguori wrote:
>> From: Joerg Roedel <joerg.roedel@amd.com>
>>
>> The current method of finding out the size of huge pages does not work
>> reliably anymore. Current Linux supports more than one huge page size
>> but /proc/meminfo only show one of the supported sizes.
>> To find out the real page size used can be found by calling statfs. This
>> patch changes qemu to use statfs instead of parsing /proc/meminfo.
>>   
>
> Since we don't support 1GBpages in stable-0.10, this is unneeded.

This patch is needed to run current KVM on a hugetlbfs mount backed by
1GB pages, so I think it should go in. It is an improvement over the
/proc/meminfo parsing anyway and is not strictly tied to KVM kernel
support for 1GB pages.

Joerg
Anthony Liguori May 18, 2009, 1:10 p.m. UTC | #3
Joerg Roedel wrote:
> On Sun, May 17, 2009 at 08:00:40PM +0300, Avi Kivity wrote:
>   
>> Anthony Liguori wrote:
>>     
>>> From: Joerg Roedel <joerg.roedel@amd.com>
>>>
>>> The current method of finding out the size of huge pages does not work
>>> reliably anymore. Current Linux supports more than one huge page size
>>> but /proc/meminfo only show one of the supported sizes.
>>> To find out the real page size used can be found by calling statfs. This
>>> patch changes qemu to use statfs instead of parsing /proc/meminfo.
>>>   
>>>       
>> Since we don't support 1GBpages in stable-0.10, this is unneeded.
>>     
>
> This patch is needed to run current KVM on a hugetlbfs backed with
> 1GB pages. Therefore I think this patch is needed. It is an improvement
> over the /proc/meminfo parsing anyway and is not strictly related to
> kvm kernel support for 1GB pages.
>   

Are there any userspace support requirements for 1GB pages?

That is, if you had a 2.6.31 kernel and stable-0.10, would 1GB pages 
work (assuming this patch is backported)?

This patch could still be considered a feature vs. bug fix but I'm 
mostly curious.

Regards,

Anthony Liguori

> Joerg
>
>
Joerg Roedel May 18, 2009, 1:22 p.m. UTC | #4
On Mon, May 18, 2009 at 08:10:28AM -0500, Anthony Liguori wrote:
> Joerg Roedel wrote:
>> On Sun, May 17, 2009 at 08:00:40PM +0300, Avi Kivity wrote:
>>   
>>> Anthony Liguori wrote:
>>>     
>>>> From: Joerg Roedel <joerg.roedel@amd.com>
>>>>
>>>> The current method of finding out the size of huge pages does not work
>>>> reliably anymore. Current Linux supports more than one huge page size
>>>> but /proc/meminfo only show one of the supported sizes.
>>>> To find out the real page size used can be found by calling statfs. This
>>>> patch changes qemu to use statfs instead of parsing /proc/meminfo.
>>>>         
>>> Since we don't support 1GBpages in stable-0.10, this is unneeded.
>>>     
>>
>> This patch is needed to run current KVM on a hugetlbfs backed with
>> 1GB pages. Therefore I think this patch is needed. It is an improvement
>> over the /proc/meminfo parsing anyway and is not strictly related to
>> kvm kernel support for 1GB pages.
>>   
>
> Is there any userspace support requirements for 1GB pages?

The /proc/meminfo parsing code breaks when current KVM is run with
a -mempath on a hugetlbfs mount backed by 1GB pages.

> That is, if you had a 2.6.31 kernel and stable-0.10, would 1GB pages  
> work (assuming this patch is backported)?

With this patch and KVM kernel support, 1GB pages will work. Another
patch is needed to make it easier to enable the pdpe1gb cpuid bit in
the guest.

Joerg
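
The /proc/meminfo limitation discussed in this thread can be illustrated with a small sketch of the old approach. This is a simplified stand-in for the removed code, not a copy of it: the function name meminfo_hugepagesize_kb is hypothetical, and it takes an already-read, NUL-terminated buffer instead of opening /proc/meminfo itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract the "Hugepagesize:" value (in kB) from a
 * NUL-terminated /proc/meminfo-style buffer. Returns 0 if the field is
 * missing. The result does not depend on which hugetlbfs mount the
 * caller is about to use, because the text has only one such field. */
long meminfo_hugepagesize_kb(const char *meminfo)
{
    const char *needle = "Hugepagesize:";
    const char *p = strstr(meminfo, needle);

    if (p == NULL)
        return 0;

    /* strtol skips the padding spaces and stops at the " kB" suffix. */
    return strtol(p + strlen(needle), NULL, 10);
}
```

Because the input carries a single Hugepagesize value, a caller multiplying it by 1024 gets the same answer for every mount, which is why this approach cannot be right for a -mempath on a mount that uses a different page size.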
Patch

diff --git a/sysemu.h b/sysemu.h
index 19464cf..4333495 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -99,7 +99,7 @@  extern int graphic_rotate;
 extern int no_quit;
 extern int semihosting_enabled;
 extern int old_param;
-extern int hpagesize;
+extern long hpagesize;
 extern const char *bootp_filename;
 
 #ifdef USE_KQEMU
diff --git a/vl.c b/vl.c
index 0437159..5d02e10 100644
--- a/vl.c
+++ b/vl.c
@@ -62,6 +62,7 @@ 
 #include <sys/ioctl.h>
 #include <sys/resource.h>
 #include <sys/socket.h>
+#include <sys/vfs.h>
 #include <netinet/in.h>
 #include <net/if.h>
 #if defined(__NetBSD__)
@@ -256,7 +257,7 @@  const char *mem_path = NULL;
 #ifdef MAP_POPULATE
 int mem_prealloc = 1;	/* force preallocation of physical target memory */
 #endif
-int hpagesize = 0;
+long hpagesize = 0;
 const char *cpu_vendor_string;
 #ifdef TARGET_ARM
 int old_param = 0;
@@ -4707,32 +4708,27 @@  void qemu_get_launch_info(int *argc, char ***argv, int *opt_daemonize, const cha
 }
 
 #ifdef USE_KVM
-static int gethugepagesize(void)
+
+#define HUGETLBFS_MAGIC       0x958458f6
+
+static long gethugepagesize(const char *path)
 {
-    int ret, fd;
-    char buf[4096];
-    const char *needle = "Hugepagesize:";
-    char *size;
-    unsigned long hugepagesize;
+    struct statfs fs;
+    int ret;
 
-    fd = open("/proc/meminfo", O_RDONLY);
-    if (fd < 0) {
-	perror("open");
-	exit(0);
-    }
+    do {
+	    ret = statfs(path, &fs);
+    } while (ret != 0 && errno == EINTR);
 
-    ret = read(fd, buf, sizeof(buf));
-    if (ret < 0) {
-	perror("read");
-	exit(0);
+    if (ret != 0) {
+	    perror("statfs");
+	    return 0;
     }
 
-    size = strstr(buf, needle);
-    if (!size)
-	return 0;
-    size += strlen(needle);
-    hugepagesize = strtol(size, NULL, 0);
-    return hugepagesize;
+    if (fs.f_type != HUGETLBFS_MAGIC)
+	    fprintf(stderr, "Warning: path not on HugeTLBFS: %s\n", path);
+
+    return fs.f_bsize;
 }
 
 static void *alloc_mem_area(size_t memory, unsigned long *len, const char *path)
@@ -4752,7 +4748,7 @@  static void *alloc_mem_area(size_t memory, unsigned long *len, const char *path)
     if (asprintf(&filename, "%s/kvm.XXXXXX", path) == -1)
 	return NULL;
 
-    hpagesize = gethugepagesize() * 1024;
+    hpagesize = gethugepagesize(path);
     if (!hpagesize)
 	return NULL;