[v5,11/13] xen: introduce xen_alloc/free_coherent_pages

Message ID 1377801154-29215-11-git-send-email-stefano.stabellini@eu.citrix.com (mailing list archive)
State New, archived

Commit Message

Stefano Stabellini Aug. 29, 2013, 6:32 p.m. UTC
xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
and devices. On native x86 and ARMv8 it is sufficient to call
__get_free_pages in order to get a coherent buffer, while on ARM we need
to call arm_dma_ops.alloc.

Introduce xen_alloc_coherent_pages to abstract the arch specific buffer
allocation.

Similarly introduce xen_free_coherent_pages to free a coherent buffer:
on x86 and ARM64 it is simply a call to free_pages, while on ARM it is
arm_dma_ops.free.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/arm/include/asm/xen/page-coherent.h   |   22 ++++++++++++++++++++++
 arch/arm64/include/asm/xen/page-coherent.h |   24 ++++++++++++++++++++++++
 arch/ia64/include/asm/xen/page-coherent.h  |   24 ++++++++++++++++++++++++
 arch/x86/include/asm/xen/page-coherent.h   |   24 ++++++++++++++++++++++++
 4 files changed, 94 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/include/asm/xen/page-coherent.h
 create mode 100644 arch/arm64/include/asm/xen/page-coherent.h
 create mode 100644 arch/ia64/include/asm/xen/page-coherent.h
 create mode 100644 arch/x86/include/asm/xen/page-coherent.h
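
The abstraction described in the commit message can be modeled in plain
userspace C. This is only a sketch under stated assumptions, not kernel
code: every model_* name is hypothetical, calloc/free stand in for
__get_free_pages/free_pages, a pointer cast stands in for virt_to_phys,
and the MODEL_ARCH_USES_DMA_OPS macro stands in for the per-arch choice
of header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for dma_addr_t; not the kernel type. */
typedef uintptr_t model_dma_addr_t;

#ifdef MODEL_ARCH_USES_DMA_OPS
/* Models 32-bit ARM: the arch DMA ops decide how to get coherent memory. */
struct model_dma_ops {
	void *(*alloc)(size_t size, model_dma_addr_t *handle);
	void (*free)(size_t size, void *cpu_addr, model_dma_addr_t handle);
};

static void *model_arch_alloc(size_t size, model_dma_addr_t *handle)
{
	void *p = calloc(1, size);

	if (p)
		*handle = (model_dma_addr_t)p;
	return p;
}

static void model_arch_free(size_t size, void *cpu_addr,
			    model_dma_addr_t handle)
{
	(void)size;
	(void)handle;
	free(cpu_addr);
}

static const struct model_dma_ops model_arm_dma_ops = {
	.alloc = model_arch_alloc,
	.free = model_arch_free,
};

static inline void *model_xen_alloc_coherent_pages(size_t size,
						   model_dma_addr_t *handle)
{
	return model_arm_dma_ops.alloc(size, handle);
}

static inline void model_xen_free_coherent_pages(size_t size, void *cpu_addr,
						 model_dma_addr_t handle)
{
	model_arm_dma_ops.free(size, cpu_addr, handle);
}
#else
/* Models x86: ordinary page allocation is already coherent. */
static inline void *model_xen_alloc_coherent_pages(size_t size,
						   model_dma_addr_t *handle)
{
	void *p = calloc(1, size);		/* stands in for __get_free_pages */

	if (p)
		*handle = (model_dma_addr_t)p;	/* stands in for virt_to_phys */
	return p;
}

static inline void model_xen_free_coherent_pages(size_t size, void *cpu_addr,
						 model_dma_addr_t handle)
{
	(void)size;
	(void)handle;
	free(cpu_addr);				/* stands in for free_pages */
}
#endif

/* Arch-independent caller, playing the role of xen_swiotlb_alloc_coherent. */
static void *model_swiotlb_alloc(size_t size, model_dma_addr_t *handle)
{
	return model_xen_alloc_coherent_pages(size, handle);
}
```

The caller is identical in both builds; only the helper bodies change,
which is the point of putting them in a per-arch header.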

Comments

Catalin Marinas Sept. 5, 2013, 4:09 p.m. UTC | #1
On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> and devices. On native x86 and ARMv8 is sufficient to call
> __get_free_pages in order to get a coherent buffer, while on ARM we need
> to call arm_dma_ops.alloc.

Don't bet on this for ARMv8. It's not mandated for the architecture, so
at some point some SoC will require non-cacheable buffers for coherency.
Stefano Stabellini Sept. 5, 2013, 4:43 p.m. UTC | #2
On Thu, 5 Sep 2013, Catalin Marinas wrote:
> On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > and devices. On native x86 and ARMv8 is sufficient to call
> > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > to call arm_dma_ops.alloc.
> 
> Don't bet on this for ARMv8. It's not mandated for the architecture, so
> at some point some SoC will require non-cacheable buffers for coherency.

I see.
Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
calling arm64_swiotlb_dma_ops.alloc?
Catalin Marinas Sept. 6, 2013, 2:14 p.m. UTC | #3
On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > and devices. On native x86 and ARMv8 is sufficient to call
> > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > to call arm_dma_ops.alloc.
> > 
> > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > at some point some SoC will require non-cacheable buffers for coherency.
> 
> I see.
> Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> calling arm64_swiotlb_dma_ops.alloc?

What does this buffer do exactly? Is it allocated by guests?

Currently arm64_swiotlb_dma_ops assume cache-coherent DMA. I have a
patch which introduces new ops for non-coherent DMA but this should
really be orthogonal to swiotlb. You can basically have 4 combinations
of coherent/non-coherent and swiotlb/iommu. Mark Rutland is currently
looking into how best to describe this via DT as it may not even be per
SoC but per bus or device.
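
The orthogonality described above can be sketched as two independent
per-device properties. This is a hypothetical model, not an existing
kernel interface: all model_* names are invented for illustration, and
the two predicates encode the assumption that cache handling follows
only from coherency while bouncing follows only from the translation
path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; none of these exist in the kernel. */
enum model_coherency { MODEL_DMA_COHERENT, MODEL_DMA_NONCOHERENT };
enum model_translation { MODEL_VIA_SWIOTLB, MODEL_VIA_IOMMU };

/* Per-device description, as it might be derived from the device tree. */
struct model_dev_dma_config {
	enum model_coherency coherency;
	enum model_translation translation;
};

/* Assumption: cache maintenance depends only on the coherency axis. */
static bool model_needs_cache_maintenance(const struct model_dev_dma_config *cfg)
{
	return cfg->coherency == MODEL_DMA_NONCOHERENT;
}

/* Assumption: bounce buffering depends only on the translation axis. */
static bool model_may_bounce(const struct model_dev_dma_config *cfg)
{
	return cfg->translation == MODEL_VIA_SWIOTLB;
}
```

With two values on each axis, the struct covers the four combinations
mentioned above, and neither predicate needs to look at the other axis.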
Stefano Stabellini Sept. 6, 2013, 2:59 p.m. UTC | #4
On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > to call arm_dma_ops.alloc.
> > > 
> > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > at some point some SoC will require non-cacheable buffers for coherency.
> > 
> > I see.
> > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > calling arm64_swiotlb_dma_ops.alloc?
> 
> What does this buffer do exactly? Is it allocated by guests?

It is allocated by Dom0 to do DMA to/from a device.
It is the buffer that is going to be returned by dma_map_ops.alloc to
the caller:

On x86:
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages

On ARM:
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc

On ARM64:
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????



> Currently arm64_swiotlb_dma_ops assume cache-coherent DMA. I have a
> patch which introduces new ops for non-coherent DMA but this should
> really be orthogonal to swiotlb. You can basically have 4 combinations
> of coherent/non-coherent and swiotlb/iommu. Mark Rutland is currently
> looking into how best to describe this via DT as it may not even be per
> SoC but per bus or device.

It seems to me that calling arm64_swiotlb_dma_ops.alloc would ensure
that we allocate the right buffer for the right device?
Catalin Marinas Sept. 6, 2013, 3:59 p.m. UTC | #5
On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > to call arm_dma_ops.alloc.
> > > > 
> > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > 
> > > I see.
> > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > calling arm64_swiotlb_dma_ops.alloc?
> > 
> > What does this buffer do exactly? Is it allocated by guests?
> 
> It is allocated by Dom0 to do DMA to/from a device.
> It is the buffer that is going to be returned by dma_map_ops.alloc to
> the caller:
> 
> On x86:
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> 
> On ARM:
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> 
> On ARM64
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????

OK, I'm getting more confused. Do all the above calls happen in the
guest, Dom0, or a mix?
Stefano Stabellini Sept. 6, 2013, 4:09 p.m. UTC | #6
On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > to call arm_dma_ops.alloc.
> > > > > 
> > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > 
> > > > I see.
> > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > calling arm64_swiotlb_dma_ops.alloc?
> > > 
> > > What does this buffer do exactly? Is it allocated by guests?
> > 
> > It is allocated by Dom0 to do DMA to/from a device.
> > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > the caller:
> > 
> > On x86:
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > 
> > On ARM:
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > 
> > On ARM64
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> 
> OK, I'm getting more confused. Do all the above calls happen in the
> guest, Dom0, or a mix?

I guess the confusion comes from a difference in terminology: dom0 is a
guest like the others, just a bit more privileged. We usually call domU
a normal unprivileged guest.

The above calls would happen in Dom0 (when an SMMU is not available).
They could also happen in a DomU if we assign a physical device to it
(and an SMMU is not available).

So in general they would happen in any guest that needs to program a
real device.
Catalin Marinas Sept. 6, 2013, 4:20 p.m. UTC | #7
On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > > to call arm_dma_ops.alloc.
> > > > > > 
> > > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > > 
> > > > > I see.
> > > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > > calling arm64_swiotlb_dma_ops.alloc?
> > > > 
> > > > What does this buffer do exactly? Is it allocated by guests?
> > > 
> > > It is allocated by Dom0 to do DMA to/from a device.
> > > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > > the caller:
> > > 
> > > On x86:
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > > 
> > > On ARM:
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > > 
> > > On ARM64
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> > 
> > OK, I'm getting more confused. Do all the above calls happen in the
> > guest, Dom0, or a mix?
> 
> I guess the confusion comes from a difference in terminology: dom0 is a
> guest like the others, just a bit more privileged. We usually call domU
> a normal unprivileged guest.

Thanks for the explanation.

> The above calls would happen in Dom0 (when an SMMU is not available).

So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
not always use the default dma_alloc_coherent()?

> They could also happen in a DomU if we assign a physical device to it
> (and an SMMU is not available).

The problem is that you don't necessarily know what kind of coherency you
need for a physical device. As I said, we plan to do this DT-driven.
Stefano Stabellini Sept. 6, 2013, 4:52 p.m. UTC | #8
On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
> > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > > > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > > > to call arm_dma_ops.alloc.
> > > > > > > 
> > > > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > > > 
> > > > > > I see.
> > > > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > > > calling arm64_swiotlb_dma_ops.alloc?
> > > > > 
> > > > > What does this buffer do exactly? Is it allocated by guests?
> > > > 
> > > > It is allocated by Dom0 to do DMA to/from a device.
> > > > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > > > the caller:
> > > > 
> > > > On x86:
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > > > 
> > > > On ARM:
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > > > 
> > > > On ARM64
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> > > 
> > > OK, I'm getting more confused. Do all the above calls happen in the
> > > guest, Dom0, or a mix?
> > 
> > I guess the confusion comes from a difference in terminology: dom0 is a
> > guest like the others, just a bit more privileged. We usually call domU
> > a normal unprivileged guest.
> 
> Thanks for the explanation.
> 
> > The above calls would happen in Dom0 (when an SMMU is not available).
> 
> So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
> and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
> not always use the default dma_alloc_coherent()?

Keep in mind that dom0 runs with second stage translation enabled.  This
means that what dom0 thinks is a physical address (machine address in
Xen terminology) is actually just an intermediate physical address.
For the same reason, what dom0 thinks is a contiguous buffer is
actually only contiguous in the intermediate physical address space.

So every time dom0 wants to allocate a dma-capable buffer it needs to go
via swiotlb-xen, which makes the buffer contiguous in the physical address
space (machine address space in Xen terminology) by issuing a hypercall.
swiotlb-xen also returns the physical address (machine address in Xen
terminology) to the caller.

To answer your question: in the absence of an SMMU, all the
dma_alloc_coherent calls in dom0 need to go via xen_swiotlb_alloc_coherent.

xen_swiotlb_alloc_coherent cannot allocate a contiguous buffer in physical
address space (see above), but it has to allocate a buffer coherent from
the caching attributes point of view. The hypervisor is going to take
care of making the allocated buffer really contiguous in physical address
space.
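
The contiguity problem can be illustrated with a toy second stage table.
This is a sketch under assumptions, not Xen or kernel code: the
model_stage2 structure, the 4K page size, and all function names are
hypothetical, and a flat array indexed by guest frame number stands in
for the real page tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumes 4K pages */

/* Hypothetical stage-2 table: index is the guest frame number
 * (IPA >> MODEL_PAGE_SHIFT), value is the machine frame number. */
struct model_stage2 {
	const uint64_t *mfn_of_gfn;
	size_t nr_frames;
};

/* Translate an intermediate physical address to a machine address. */
static uint64_t model_ipa_to_machine(const struct model_stage2 *s2,
				     uint64_t ipa)
{
	uint64_t gfn = ipa >> MODEL_PAGE_SHIFT;
	uint64_t offset = ipa & ((UINT64_C(1) << MODEL_PAGE_SHIFT) - 1);

	assert(gfn < s2->nr_frames);
	return (s2->mfn_of_gfn[gfn] << MODEL_PAGE_SHIFT) | offset;
}

/* True if nr_pages guest frames starting at first_gfn are backed by
 * consecutive machine frames, i.e. usable by a device without an SMMU. */
static bool model_machine_contiguous(const struct model_stage2 *s2,
				     uint64_t first_gfn, size_t nr_pages)
{
	if (nr_pages == 0 || first_gfn >= s2->nr_frames ||
	    nr_pages > s2->nr_frames - first_gfn)
		return false;

	for (size_t i = 1; i < nr_pages; i++)
		if (s2->mfn_of_gfn[first_gfn + i] !=
		    s2->mfn_of_gfn[first_gfn] + i)
			return false;
	return true;
}
```

With a table such as { 0x100, 0x2a0, 0x2a1, 0x050 }, guest frames 0 and 1
are adjacent from dom0's point of view but land on machine frames 0x100
and 0x2a0, so a two-page buffer starting at guest frame 0 is not usable
for DMA until the hypervisor exchanges the backing frames.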

So now the problem is: how is xen_swiotlb_alloc_coherent going to
allocate a coherent buffer?
On x86 I can just call __get_free_pages.
On ARM I have to call arm_dma_ops.alloc.
On ARM64 ???

BTW if the Matrix is your kind of fun, I wrote a blog post explaining the
swiotlb, Morpheus style:
http://blog.xen.org/index.php/2013/08/14/swiotlb-by-morpheus/


> > They could also happen in a DomU if we assign a physical device to it
> > (and an SMMU is not available).
> 
> The problem is that you don't necessarily know one kind of coherency you
> know for a physical device. As I said, we plan to do this DT-driven.
 
OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
arguments to it, I should be able to get the right coherency for the
right device, correct?
Catalin Marinas Sept. 9, 2013, 3:51 p.m. UTC | #9
On 6 Sep 2013, at 17:52, Stefano Stabellini <Stefano.Stabellini@eu.citrix.com> wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>> On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
>>> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>>>> On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
>>>>> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>>>>>> On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
>>>>>>> On Thu, 5 Sep 2013, Catalin Marinas wrote:
>>>>>>>> On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
>>>>>>>>> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
>>>>>>>>> and devices. On native x86 and ARMv8 is sufficient to call
>>>>>>>>> __get_free_pages in order to get a coherent buffer, while on ARM we need
>>>>>>>>> to call arm_dma_ops.alloc.
>>>>>>>> 
>>>>>>>> Don't bet on this for ARMv8. It's not mandated for the architecture, so
>>>>>>>> at some point some SoC will require non-cacheable buffers for coherency.
>>>>>>> 
>>>>>>> I see.
>>>>>>> Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
>>>>>>> calling arm64_swiotlb_dma_ops.alloc?
>>>>>> 
>>>>>> What does this buffer do exactly? Is it allocated by guests?
>>>>> 
>>>>> It is allocated by Dom0 to do DMA to/from a device.
>>>>> It is the buffer that is going to be returned by dma_map_ops.alloc to
>>>>> the caller:
>>>>> 
>>>>> On x86:
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
>>>>> 
>>>>> On ARM:
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
>>>>> 
>>>>> On ARM64
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
>>>> 
>>>> OK, I'm getting more confused. Do all the above calls happen in the
>>>> guest, Dom0, or a mix?
>>> 
>>> I guess the confusion comes from a difference in terminology: dom0 is a
>>> guest like the others, just a bit more privileged. We usually call domU
>>> a normal unprivileged guest.
>> 
>> Thanks for the explanation.
>> 
>>> The above calls would happen in Dom0 (when an SMMU is not available).
>> 
>> So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
>> and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
>> not always use the default dma_alloc_coherent()?
> 
> Keep in mind that dom0 runs with second stage translation enabled.  This
> means that what dom0 thinks is a physical address (machine address in
> Xen terminology), it's actually just an intermediate physical address.
> Also for the same reason what dom0 thinks is a contiguous buffer, it's
> actually only contiguous in the intermediate physical address space.

OK, it makes sense now.  I thought dom0 is like the KVM host where stage
2 is disabled (or just flat).

> BTW if the Matrix is your kind of fun, I wrote an blog post explaining the
> swiotlb Morpheus style:
> http://blog.xen.org/index.php/2013/08/14/swiotlb-by-morpheus/

That was easier to understand ;)

>>> They could also happen in a DomU if we assign a physical device to it
>>> (and an SMMU is not available).
>> 
>> The problem is that you don't necessarily know one kind of coherency you
>> know for a physical device. As I said, we plan to do this DT-driven.
> 
> OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> arguments to it, I should be able to get the right coherency for the
> right device, correct?

I think it needs a bit more work on the Xen part.  Basically
dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
for a device.  arm64_swiotlb_dma_ops is just the default implementation
and I'll add a _noncoherent variant as well.  Default dma_ops will be
set to one of these during boot.  But a device is also allowed to have
its own dev->archdata.dma_ops, set via set_dma_ops().

So even if you set the default dma_ops to Xen ops, you may not get them
via dma_alloc_coherent().  I don't see any easier solution other than
patching the dma_alloc_attrs() function to issue a Hyp call after the
memory has been allocated with the get_dma_ops()->alloc().  But I don't
like this either.

Catalin
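
The lookup order described above, where a per-device override wins over
the boot-time default, can be sketched as follows. This is a userspace
model under assumptions: the model_* types and the archdata_dma_ops field
only mirror the behaviour described in this thread and are not the
kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; only a name is needed to tell instances apart. */
struct model_dma_ops {
	const char *name;
};

struct model_device {
	/* NULL unless the device was given its own ops. */
	const struct model_dma_ops *archdata_dma_ops;
};

static const struct model_dma_ops model_default_ops = { "default" };
static const struct model_dma_ops model_xen_ops = { "xen" };

/* Default ops, chosen once during boot in this model. */
static const struct model_dma_ops *model_global_dma_ops = &model_default_ops;

/* Per-device ops take precedence over the global default. */
static const struct model_dma_ops *
model_get_dma_ops(const struct model_device *dev)
{
	if (dev && dev->archdata_dma_ops)
		return dev->archdata_dma_ops;
	return model_global_dma_ops;
}

static void model_set_dma_ops(struct model_device *dev,
			      const struct model_dma_ops *ops)
{
	dev->archdata_dma_ops = ops;
}
```

Pointing model_global_dma_ops at model_xen_ops changes the result only for
devices without an override, which is why replacing the default ops alone
does not guarantee that every allocation goes through the Xen path.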
Patch

diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..af2cf8d
--- /dev/null
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -0,0 +1,22 @@ 
+#ifndef _ASM_ARM_XEN_PAGE_COHERENT_H
+#define _ASM_ARM_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	arm_dma_ops.free(hwdev, size, cpu_addr, dma_handle, attrs);
+}
+
+#endif /* _ASM_ARM_XEN_PAGE_COHERENT_H */
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..0d6ad25
--- /dev/null
+++ b/arch/arm64/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@ 
+#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H
+#define _ASM_ARM64_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	void *vstart = (void*)__get_free_pages(flags, get_order(size));
+	*dma_handle = virt_to_phys(vstart);
+	return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */
diff --git a/arch/ia64/include/asm/xen/page-coherent.h b/arch/ia64/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..37b929c
--- /dev/null
+++ b/arch/ia64/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@ 
+#ifndef _ASM_IA64_XEN_PAGE_COHERENT_H
+#define _ASM_IA64_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	void *vstart = (void*)__get_free_pages(flags, get_order(size));
+	*dma_handle = virt_to_phys(vstart);
+	return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_IA64_XEN_PAGE_COHERENT_H */
diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..31de2e0
--- /dev/null
+++ b/arch/x86/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@ 
+#ifndef _ASM_X86_XEN_PAGE_COHERENT_H
+#define _ASM_X86_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	void *vstart = (void*)__get_free_pages(flags, get_order(size));
+	*dma_handle = virt_to_phys(vstart);
+	return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */