
qemu-kvm: Flush icache after dma operations for ia64

Message ID 4A1A9951.5030408@sgi.com (mailing list archive)
State New, archived

Commit Message

Jes Sorensen May 25, 2009, 1:12 p.m. UTC
Ok,

Trying once more. After spending a couple of hours trying to follow
the QEMU DMA code flow, I have convinced myself Avi is right and those
two functions don't need to do the flushing, as they all end up calling
dma_bdrv_cb(), which calls dma_bdrv_unmap(). I have added a couple of
comments to the code, which will hopefully save the next person the
'pleasure' of trying to figure this out too.

Cheers,
Jes
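
The call chain described here can be modeled with a small toy sketch; the toy_* names mirror the QEMU functions, but the bodies are stand-ins (a counter instead of real icache maintenance), not the real implementations:

```c
#include <assert.h>
#include <stddef.h>

/* Counter standing in for the actual icache maintenance work. */
static int flushes;

/* Stand-in for dma_flush_range(); the real ia64 version runs "fc" over
 * every cache line in [start, stop). */
static void dma_flush_range(unsigned long start, unsigned long stop)
{
    (void)start;
    (void)stop;
    flushes++;
}

/* Models cpu_physical_memory_unmap(): flushing happens here, once per
 * mapping, and only for DMA writes. */
static void toy_unmap(void *buffer, size_t len, int is_write)
{
    if (is_write)
        dma_flush_range((unsigned long)buffer, (unsigned long)buffer + len);
}

/* Models dma_bdrv_unmap(): tears down each scatter-gather element. */
static void toy_dma_bdrv_unmap(void **sg, size_t *lens, int nsg, int is_write)
{
    for (int i = 0; i < nsg; i++)
        toy_unmap(sg[i], lens[i], is_write);
}

/* Models dma_bdrv_cb(): every completion path funnels through here, so
 * none of the callers needs a flush of its own. */
static void toy_dma_bdrv_cb(void **sg, size_t *lens, int nsg, int is_write)
{
    toy_dma_bdrv_unmap(sg, lens, nsg, is_write);
}
```

Since every completion funnels through toy_dma_bdrv_cb(), each written mapping is flushed exactly once at unmap time, which is why the callers themselves need no flush.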

Comments

Avi Kivity May 26, 2009, 12:30 p.m. UTC | #1
Jes Sorensen wrote:
> Ok,
>
> Trying once more. After spending a couple of hours trying to follow
> the QEMU DMA code flow, I have convinced myself Avi is right and those
> two functions don't need to do the flushing, as they all end up calling
> dma_bdrv_cb(), which calls dma_bdrv_unmap(). I have added a couple of
> comments to the code, which will hopefully save the next person the
> 'pleasure' of trying to figure this out too.
>

It looks right to me.  Xiantao?

>                  access_len -= l;
> -            }
> +	    }
> +	    dma_flush_range((unsigned long)buffer,
> +			    (unsigned long)buffer + flush_len);
>          }
>       

Detab your code, please.
Zhang, Xiantao June 1, 2009, 5:40 a.m. UTC | #2
Avi Kivity wrote:
> Jes Sorensen wrote:
>> Ok,
>> 
>> Trying once more. After spending a couple of hours trying to follow
>> the QEMU DMA code flow, I have convinced myself Avi is right and those
>> two functions don't need to do the flushing, as they all end up
>> calling dma_bdrv_cb(), which calls dma_bdrv_unmap(). I have added a
>> couple of comments to the code, which will hopefully save the next
>> person the 'pleasure' of trying to figure this out too.
>> 
> 
> It looks right to me.  Xiantao?

Fine with me. But it seems the change in qemu_iovec_from_buffer is missing from this patch; or is that change also unnecessary?

Xiantao


>>                  access_len -= l;
>> -            }
>> +	    }
>> +	    dma_flush_range((unsigned long)buffer,
>> +			    (unsigned long)buffer + flush_len);
>>          }
>> 
> 
> Detab your code, please.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity June 1, 2009, 7:45 a.m. UTC | #3
Zhang, Xiantao wrote:
> Avi Kivity wrote:
>   
>> Jes Sorensen wrote:
>>     
>>> Ok,
>>>
>>> Trying once more. After spending a couple of hours trying to follow
>>> the QEMU DMA code flow, I have convinced myself Avi is right and those
>>> two functions don't need to do the flushing, as they all end up
>>> calling dma_bdrv_cb(), which calls dma_bdrv_unmap(). I have added a
>>> couple of comments to the code, which will hopefully save the next
>>> person the 'pleasure' of trying to figure this out too.
>>>
>>>       
>> It looks right to me.  Xiantao?
>>     
>
> Fine with me. But it seems the change in qemu_iovec_from_buffer is missing from this patch; or is that change also unnecessary?
>   


I think the fixed unmap handles that case.  Can you test to make sure?
Jes Sorensen June 2, 2009, 10:56 a.m. UTC | #4
Avi Kivity wrote:
> Zhang, Xiantao wrote:
>> Fine with me. But it seems the change in qemu_iovec_from_buffer is
>> missing from this patch; or is that change also unnecessary?
> 
> I think the fixed unmap handles that case.  Can you test to make sure?
> 

Avi and I went through the code and verified that it was all covered by
the unmap case. It was pretty messy to come to the conclusion, which is
why I tried to document it in the code and the patch description.

Cheers,
Jes
Zhang, Xiantao June 2, 2009, 3:20 p.m. UTC | #5
Hi Jes,
    Have you verified that it works for you? You could run a kernel build in the guest with 4 vCPUs; if it completes without any errors, it should be okay, I think. Otherwise we may need to investigate further. :)
Xiantao
Jes Sorensen June 4, 2009, 1:09 p.m. UTC | #6
Zhang, Xiantao wrote:
> Hi Jes,
>     Have you verified that it works for you? You could run a kernel build
> in the guest with 4 vCPUs; if it completes without any errors, it should
> be okay, I think. Otherwise we may need to investigate further. :)
> Xiantao

Hi Xiantao,

I was able to run a 16 vCPU guest and build the kernel using make -j 16.
How quickly would the problem show up for you, on every run, or should I
run more tests?

Cheers,
Jes
Zhang, Xiantao June 5, 2009, 1:38 a.m. UTC | #7
Jes Sorensen wrote:
> Zhang, Xiantao wrote:
>> Hi Jes,
>>     Have you verified that it works for you? You could run a kernel
>> build in the guest with 4 vCPUs; if it completes without any errors,
>> it should be okay, I think. Otherwise we may need to investigate
>> further. :) Xiantao
> 
> Hi Xiantao,
> 
> I was able to run a 16 vCPU guest and build the kernel using make -j
> 16. How quickly would the problem show up for you, on every run, or
> should I run more tests?

Hi Jes,
 Good news! On my machine, without the patch, an SMP guest can't complete
a whole kernel build at all. So if you can build it without errors and use
it to boot up the guest, I think it should work well.
Xiantao



Jes Sorensen June 5, 2009, 11:13 a.m. UTC | #8
Zhang, Xiantao wrote:
> Hi Jes,
>  Good news! On my machine, without the patch, an SMP guest can't complete
> a whole kernel build at all. So if you can build it without errors and
> use it to boot up the guest, I think it should work well.
> Xiantao

Yep, compiles, boots, the works ...

Avi, I think we can conclude this patch is fine to go in. If it turns out
I did something wrong, you can all attack me later :-)


Cheers,
Jes
Avi Kivity June 7, 2009, 6:28 a.m. UTC | #9
Jes Sorensen wrote:
> Zhang, Xiantao wrote:
>> Hi Jes, Good news! On my machine, without the patch, an SMP guest
>> can't complete a whole kernel build at all. So if you can build it
>> without errors and use it to boot up the guest, I think it should
>> work well. Xiantao
>
> Yep, compiles, boots, the works ...
>
> Avi, I think we can conclude this patch is fine to go in. If it turns out
> I did something wrong, you can all attack me later :-)

It's in, thanks.

Patch

On ia64, the system depends on the platform issuing snoop cycles to
flush the icache for memory touched by DMA write operations. Virtual
DMA operations are emulated by memcpy, however, so use explicit
instructions to flush the related icache lines; otherwise the guest
may execute stale instructions from the icache.

Slightly modified version of Xiantao's patch, which avoids the #ifdefs
for ia64 by introducing a dma_flush_range() function defined as a no-op
on architectures which do not need it.
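
As a minimal self-contained sketch of the problem being fixed: the flush loop below is the one from the patch, while emulated_dma_write() is an invented wrapper (not a QEMU function) showing where such a flush would sit after a memcpy-emulated DMA write. On non-ia64 hosts the flush compiles to nothing, modeling the hardware-snooping case.

```c
#include <assert.h>
#include <string.h>

/* ia64 needs an explicit icache flush after DMA emulated by memcpy;
 * platforms with hardware snooping need nothing. */
static inline void dma_flush_range(unsigned long start, unsigned long stop)
{
#if defined(__ia64__)
    while (start < stop) {
        asm volatile ("fc %0" :: "r"(start));  /* flush one cache line */
        start += 32;                           /* cache line stride used by the patch */
    }
    asm volatile (";;sync.i;;srlz.i;;");       /* serialize the flushes */
#else
    (void)start;
    (void)stop;
#endif
}

/* Invented helper: emulate a device's DMA write with memcpy, then make
 * sure the guest's icache cannot hold stale bytes for that range. */
static void emulated_dma_write(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    dma_flush_range((unsigned long)dst, (unsigned long)dst + len);
}
```

Without the flush, a guest jumping into a freshly DMA-written page on ia64 could execute whatever the icache held before the memcpy.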

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>

---
 cache-utils.h           |   21 +++++++++++++++++++++
 cutils.c                |    5 +++++
 dma-helpers.c           |    4 ++++
 exec.c                  |    7 ++++++-
 target-ia64/cpu.h       |    1 -
 target-ia64/fake-exec.c |    9 ---------
 6 files changed, 36 insertions(+), 11 deletions(-)

Index: qemu-kvm/cache-utils.h
===================================================================
--- qemu-kvm.orig/cache-utils.h
+++ qemu-kvm/cache-utils.h
@@ -34,7 +34,28 @@ 
     asm volatile ("isync" : : : "memory");
 }
 
+/*
+ * Is this correct for PPC?
+ */
+static inline void dma_flush_range(unsigned long start, unsigned long stop)
+{
+}
+
+#elif defined(__ia64__)
+static inline void flush_icache_range(unsigned long start, unsigned long stop)
+{
+    while (start < stop) {
+	asm volatile ("fc %0" :: "r"(start));
+	start += 32;
+    }
+    asm volatile (";;sync.i;;srlz.i;;");
+}
+#define dma_flush_range(start, end) flush_icache_range(start, end)
+#define qemu_cache_utils_init(envp) do { (void) (envp); } while (0)
 #else
+static inline void dma_flush_range(unsigned long start, unsigned long stop)
+{
+}
 #define qemu_cache_utils_init(envp) do { (void) (envp); } while (0)
 #endif
 
Index: qemu-kvm/cutils.c
===================================================================
--- qemu-kvm.orig/cutils.c
+++ qemu-kvm/cutils.c
@@ -165,6 +165,11 @@ 
     }
 }
 
+/*
+ * No dma flushing needed here, as the aio code will call dma_bdrv_cb()
+ * on completion as well, which will result in a call to
+ * dma_bdrv_unmap() which will do the flushing ....
+ */
 void qemu_iovec_from_buffer(QEMUIOVector *qiov, const void *buf, size_t count)
 {
     const uint8_t *p = (const uint8_t *)buf;
Index: qemu-kvm/dma-helpers.c
===================================================================
--- qemu-kvm.orig/dma-helpers.c
+++ qemu-kvm/dma-helpers.c
@@ -148,6 +148,10 @@ 
     dbs->is_write = is_write;
     dbs->bh = NULL;
     qemu_iovec_init(&dbs->iov, sg->nsg);
+    /*
+     * DMA flushing is handled in dma_bdrv_cb() calling dma_bdrv_unmap()
+     * so we don't need to do that here.
+     */
     dma_bdrv_cb(dbs, 0);
     if (!dbs->acb) {
         qemu_aio_release(dbs);
Index: qemu-kvm/exec.c
===================================================================
--- qemu-kvm.orig/exec.c
+++ qemu-kvm/exec.c
@@ -35,6 +35,7 @@ 
 #include "cpu.h"
 #include "exec-all.h"
 #include "qemu-common.h"
+#include "cache-utils.h"
 
 #if !defined(TARGET_IA64)
 #include "tcg.h"
@@ -3385,6 +3386,8 @@ 
 void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
                                int is_write, target_phys_addr_t access_len)
 {
+    unsigned long flush_len = (unsigned long)access_len;
+
     if (buffer != bounce.buffer) {
         if (is_write) {
             ram_addr_t addr1 = qemu_ram_addr_from_host(buffer);
@@ -3402,7 +3405,9 @@ 
                 }
                 addr1 += l;
                 access_len -= l;
-            }
+	    }
+	    dma_flush_range((unsigned long)buffer,
+			    (unsigned long)buffer + flush_len);
         }
         return;
     }
Index: qemu-kvm/target-ia64/cpu.h
===================================================================
--- qemu-kvm.orig/target-ia64/cpu.h
+++ qemu-kvm/target-ia64/cpu.h
@@ -73,7 +73,6 @@ 
  * These ones really should go to the appropriate tcg header file, if/when
  * tcg support is added for ia64.
  */
-void flush_icache_range(unsigned long start, unsigned long stop);
 void tcg_dump_info(FILE *f,
                    int (*cpu_fprintf)(FILE *f, const char *fmt, ...));
 
Index: qemu-kvm/target-ia64/fake-exec.c
===================================================================
--- qemu-kvm.orig/target-ia64/fake-exec.c
+++ qemu-kvm/target-ia64/fake-exec.c
@@ -41,15 +41,6 @@ 
     return;
 }
 
-void flush_icache_range(unsigned long start, unsigned long stop)
-{
-    while (start < stop) {
-	asm volatile ("fc %0" :: "r"(start));
-	start += 32;
-    }
-    asm volatile (";;sync.i;;srlz.i;;");
-}
-
 int cpu_restore_state(TranslationBlock *tb,
                       CPUState *env, unsigned long searched_pc,
                       void *puc)