
[RFC,1/2] xen/page_alloc: Add size_align parameter to provide MFNs which are size aligned.

Message ID 1480480779-12078-2-git-send-email-konrad.wilk@oracle.com (mailing list archive)
State New, archived

Commit Message

Konrad Rzeszutek Wilk Nov. 30, 2016, 4:39 a.m. UTC
This is to support the requirement that exists in PV dom0
when doing DMA requests:

"dma_alloc_coherent()
[...]
The CPU virtual address and the DMA address are both guaranteed to be
aligned to the smallest PAGE_SIZE order which is greater than or equal
to the requested size.  This invariant exists (for example) to guarantee
that if you allocate a chunk which is smaller than or equal to 64
kilobytes, the extent of the buffer you receive will not cross a 64K
boundary."

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/common/memory.c         |  3 +++
 xen/common/page_alloc.c     | 22 +++++++++++++++++++++-
 xen/include/public/memory.h |  2 ++
 xen/include/xen/mm.h        |  2 ++
 4 files changed, 28 insertions(+), 1 deletion(-)

Comments

Jan Beulich Nov. 30, 2016, 9:30 a.m. UTC | #1
>>> On 30.11.16 at 05:39, <konrad@kernel.org> wrote:
> This is to support the requirement that exists in PV dom0
> when doing DMA requests:
> 
> "dma_alloc_coherent()
> [...]
> The CPU virtual address and the DMA address are both guaranteed to be
> aligned to the smallest PAGE_SIZE order which is greater than or equal
> to the requested size.  This invariant exists (for example) to guarantee
> that if you allocate a chunk which is smaller than or equal to 64
> kilobytes, the extent of the buffer you receive will not cross a 64K
> boundary."

So I'm having trouble understanding what it is that actually needs
fixing / changing here: Any order-N allocation will be order-N-aligned
already. Is your caller perhaps simply not passing in a large enough
order? And changing alloc_heap_pages(), which guarantees the
requested alignment already anyway (after all it takes an order
input, not a size one), looks completely pointless regardless of what
extra requirements you may want to put on the exchange hypercall.

Jan
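
For reference, the point above can be sketched from the caller side,
assuming Xen's existing get_order_from_bytes() and alloc_domheap_pages()
helpers; the wrapper name below is hypothetical:

/* Hypothetical caller: ask for an order that covers the whole size. */
static struct page_info *alloc_size_aligned(struct domain *d,
                                            unsigned long bytes)
{
    unsigned int order = get_order_from_bytes(bytes);

    /*
     * An order-N chunk handed out by the buddy allocator starts on a
     * 2^N page boundary, so it cannot cross a (PAGE_SIZE << N) boundary.
     */
    return alloc_domheap_pages(d, order, 0);
}
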
Konrad Rzeszutek Wilk Nov. 30, 2016, 4:42 p.m. UTC | #2
On Wed, Nov 30, 2016 at 02:30:41AM -0700, Jan Beulich wrote:
> >>> On 30.11.16 at 05:39, <konrad@kernel.org> wrote:
> > This is to support the requirement that exists in PV dom0
> > when doing DMA requests:
> > 
> > "dma_alloc_coherent()
> > [...]
> > The CPU virtual address and the DMA address are both guaranteed to be
> > aligned to the smallest PAGE_SIZE order which is greater than or equal
> > to the requested size.  This invariant exists (for example) to guarantee
> > that if you allocate a chunk which is smaller than or equal to 64
> > kilobytes, the extent of the buffer you receive will not cross a 64K
> > boundary."
> 
> So I'm having trouble understanding what it is that actually needs
> fixing / changing here: Any order-N allocation will be order-N-aligned
> already. Is your caller perhaps simply not passing in a large enough
> order? And changing alloc_heap_pages(), which guarantees the
> requested alignment already anyway (after all it takes an order
> input, not a size one), looks completely pointless regardless of what
> extra requirements you may want to put on the exchange hypercall.

The page_alloc.c code walks through the free lists of increasing order,
which means that if it can't find a chunk of the requested order it
goes one order up (and so on). Eventually you do get a chunk of the
requested order, but it is not guaranteed to be aligned to that order
(it may be aligned to a higher order instead).

> 
> Jan
>
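
A simplified, self-contained sketch of the order walk and chunk
splitting being described (illustration only, not the actual
page_alloc.c code):

/*
 * When only a chunk of higher order 'found_order' is free, it is halved
 * until it reaches 'wanted_order'.  Every halving happens on a 2^j page
 * boundary, so the piece finally handed out starts on a boundary of at
 * least 2^wanted_order pages.
 */
static unsigned long split_down(unsigned long mfn, unsigned int found_order,
                                unsigned int wanted_order)
{
    unsigned int j = found_order;

    while ( j != wanted_order )
    {
        j--;
        /* One 2^j half goes back to the free list (omitted); keep the other. */
        mfn += 1UL << j;
    }

    return mfn;
}
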
Jan Beulich Nov. 30, 2016, 4:45 p.m. UTC | #3
>>> On 30.11.16 at 17:42, <konrad.wilk@oracle.com> wrote:
> On Wed, Nov 30, 2016 at 02:30:41AM -0700, Jan Beulich wrote:
>> >>> On 30.11.16 at 05:39, <konrad@kernel.org> wrote:
>> > This is to support the requirement that exists in PV dom0
>> > when doing DMA requests:
>> > 
>> > "dma_alloc_coherent()
>> > [...]
>> > The CPU virtual address and the DMA address are both guaranteed to be
>> > aligned to the smallest PAGE_SIZE order which is greater than or equal
>> > to the requested size.  This invariant exists (for example) to guarantee
>> > that if you allocate a chunk which is smaller than or equal to 64
>> > kilobytes, the extent of the buffer you receive will not cross a 64K
>> > boundary."
>> 
>> So I'm having trouble understanding what it is that actually needs
>> fixing / changing here: Any order-N allocation will be order-N-aligned
>> already. Is your caller perhaps simply not passing in a large enough
>> order? And changing alloc_heap_pages(), which guarantees the
>> requested alignment already anyway (after all it takes an order
>> input, not a size one), looks completely pointless regardless of what
>> extra requirements you may want to put on the exchange hypercall.
> 
> The page_alloc.c code walks through the free lists of increasing order,
> which means that if it can't find a chunk of the requested order it
> goes one order up (and so on). Eventually you do get a chunk of the
> requested order, but it is not guaranteed to be aligned to that order
> (it may be aligned to a higher order instead).

But that's _better_ alignment than you asked for then.

Jan

Patch

diff --git a/xen/common/memory.c b/xen/common/memory.c
index 21797ca..a4c0c54 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -475,6 +475,9 @@  static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
         (BITS_PER_LONG+PAGE_SHIFT)));
     memflags |= MEMF_node(XENMEMF_get_node(exch.out.mem_flags));
 
+    if ( XENMEMF_align_size & exch.out.mem_flags && is_hardware_domain(d) )
+        memflags |= MEMF_size_align;
+
     for ( i = (exch.nr_exchanged >> in_chunk_order);
           i < (exch.in.nr_extents >> in_chunk_order);
           i++ )
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index ae2476d..e43f52f 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -738,7 +738,7 @@  static struct page_info *alloc_heap_pages(
      * Others try tmem pools then fail.  This is a workaround until all
      * post-dom0-creation-multi-page allocations can be eliminated.
      */
-    if ( ((order == 0) || (order >= 9)) &&
+    if ( ((order == 0) || (order >= 9)) && !(memflags & MEMF_size_align) &&
          (total_avail_pages <= midsize_alloc_zone_pages) &&
          tmem_freeable_pages() )
         goto try_tmem;
@@ -752,14 +752,34 @@  static struct page_info *alloc_heap_pages(
     {
         zone = zone_hi;
         do {
+            struct page_info *old = NULL;
+
             /* Check if target node can support the allocation. */
             if ( !avail[node] || (avail[node][zone] < request) )
                 continue;
 
             /* Find smallest order which can satisfy the request. */
             for ( j = order; j <= MAX_ORDER; j++ )
+            {
+ next_page:
                 if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
+                {
+                    if ( memflags & MEMF_size_align )
+                    {
+                        if (pg == old)
+                            continue;
+
+                        if ( (page_to_mfn(pg) % request ) == 0 )
+                            goto found;
+
+                        page_list_add_tail(pg, &heap(node, zone, j));
+                        old = pg;
+                        pg = NULL;
+                        goto next_page;
+                    }
                     goto found;
+                }
+            }
         } while ( zone-- > zone_lo ); /* careful: unsigned zone may wrap */
 
         if ( (memflags & MEMF_exact_node) && req_node != NUMA_NO_NODE )
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 5bf840f..311e7d8 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -58,6 +58,8 @@ 
 #define XENMEMF_exact_node(n) (XENMEMF_node(n) | XENMEMF_exact_node_request)
 /* Flag to indicate the node specified is virtual node */
 #define XENMEMF_vnode  (1<<18)
+/* Flag to indicate the allocation to be size aligned. */
+#define XENMEMF_align_size (1U<<19)
 #endif
 
 struct xen_memory_reservation {
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 76fbb82..c505170 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -224,6 +224,8 @@  struct npfec {
 #define  MEMF_no_owner    (1U<<_MEMF_no_owner)
 #define _MEMF_no_tlbflush 6
 #define  MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush)
+#define _MEMF_size_align  7
+#define  MEMF_size_align  (1U<<_MEMF_size_align)
 #define _MEMF_node        8
 #define  MEMF_node_mask   ((1U << (8 * sizeof(nodeid_t))) - 1)
 #define  MEMF_node(n)     ((((n) + 1) & MEMF_node_mask) << _MEMF_node)
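
For completeness, a hypothetical caller-side sketch (not part of this
series) of how a PV dom0 kernel might set the new flag on
XENMEM_exchange, assuming Linux's set_xen_guest_handle() and
HYPERVISOR_memory_op() wrappers; the function name, include paths and
omitted error handling are illustrative only:

#include <xen/interface/memory.h>   /* Linux copy of the public memory.h */
#include <asm/xen/hypercall.h>      /* HYPERVISOR_memory_op() */

static int exchange_size_aligned(xen_pfn_t *in_frames, xen_pfn_t *out_frame,
                                 unsigned int order)
{
    struct xen_memory_exchange exch = {
        .in = {
            .nr_extents   = 1UL << order,
            .extent_order = 0,
            .domid        = DOMID_SELF,
        },
        .out = {
            .nr_extents   = 1,
            .extent_order = order,
            /* New flag added by this patch: request a size-aligned MFN. */
            .mem_flags    = XENMEMF_align_size,
            .domid        = DOMID_SELF,
        },
    };

    set_xen_guest_handle(exch.in.extent_start, in_frames);
    set_xen_guest_handle(exch.out.extent_start, out_frame);

    return HYPERVISOR_memory_op(XENMEM_exchange, &exch);
}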