[v5,04/28] vmap: Add vmalloc_cb and vfree_cb

Message ID 1458849640-22588-5-git-send-email-konrad.wilk@oracle.com (mailing list archive)
State New, archived

Commit Message

Konrad Rzeszutek Wilk March 24, 2016, 8 p.m. UTC
For those users who want to supply their own vmap callback,
to be called _after_ the pages have been allocated and
the vmap API is ready to hand out virtual addresses.

Instead of using the vmap machinery itself, vmalloc_cb can
call the callback, which will then be responsible for
generating the virtual address.

This allows users (such as xSplice) to provide their own
mechanism to set the page flags.
The users (such as the patch titled "xsplice: Implement payload
loading") can wrap their calls to __vmap to accomplish this.

We also provide a mechanism for the caller to squirrel away
the MFN array in case they want to modify the virtual
addresses easily later on.

We also provide the freeing code path - vfree_cb - which takes
a callback to take care of tearing down the virtual addresses.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>

v4: New patch.
v5: Update per Jan's comments.
---
 xen/common/vmap.c      | 33 ++++++++++++++++++++++++++-------
 xen/include/xen/vmap.h | 14 ++++++++++++++
 2 files changed, 40 insertions(+), 7 deletions(-)
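
As an illustration of the allocation side: a caller such as xSplice could
supply a vmap_cb_t that wraps __vmap() with its own page flags. A minimal
sketch, not part of the patch - the names, the PAGE_HYPERVISOR_RWX flag,
and the assumption that __vmap()'s trailing parameters are
(nr, align, flags) are all illustrative:

/*
 * Map the freshly allocated pages with caller-chosen flags instead of
 * vmap()'s defaults.  PAGE_HYPERVISOR_RWX is purely illustrative.
 */
static void *payload_vmap_cb(const mfn_t *mfn, unsigned int pages)
{
    return __vmap(mfn, 1 /* granularity */, pages, 1 /* align */,
                  PAGE_HYPERVISOR_RWX);
}

static void *payload_alloc(size_t size, mfn_t **mfn_array)
{
    /* Squirrel away the MFN array so mappings can be adjusted later. */
    return vmalloc_cb(size, payload_vmap_cb, mfn_array);
}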

Comments

Jan Beulich March 30, 2016, 4:24 p.m. UTC | #1
>>> On 24.03.16 at 21:00, <konrad.wilk@oracle.com> wrote:
> @@ -266,16 +275,15 @@ void *vzalloc(size_t size)
>      return p;
>  }
>  
> -void vfree(void *va)
> +void vfree_cb(void *va, unsigned int pages, vfree_cb_t *vfree_cb_fnc)

Just to repeat: This "caller provides size" worries me, the more that
this doesn't mirror anything the allocation side does. Would you mind
providing a case where using vm_size() instead is not correct?

> --- a/xen/include/xen/vmap.h
> +++ b/xen/include/xen/vmap.h
> @@ -12,9 +12,23 @@ void *__vmap(const mfn_t *mfn, unsigned int granularity,
>  void *vmap(const mfn_t *mfn, unsigned int nr);
>  void vunmap(const void *);
>  void *vmalloc(size_t size);
> +
> +/*
> + * Callback for vmalloc_cb to use when vmap-ing.
> + */

Comment style.

> +typedef void *(vmap_cb_t)(const mfn_t *mfn, unsigned int pages);

Stray parentheses (again).

Jan
Konrad Rzeszutek Wilk March 30, 2016, 4:44 p.m. UTC | #2
On Wed, Mar 30, 2016 at 10:24:41AM -0600, Jan Beulich wrote:
> >>> On 24.03.16 at 21:00, <konrad.wilk@oracle.com> wrote:
> > @@ -266,16 +275,15 @@ void *vzalloc(size_t size)
> >      return p;
> >  }
> >  
> > -void vfree(void *va)
> > +void vfree_cb(void *va, unsigned int pages, vfree_cb_t *vfree_cb_fnc)
> 
> Just to repeat: This "caller provides size" worries me, the more that
> this doesn't mirror anything the allocation side does. Would you mind
> providing a case where using vm_size() instead is not correct?

When the virtual addresses (to which we will stitch the pages allocated
by vmalloc) are not allocated (provided?) by vmap.

vm_size() will be very unhappy if the virtual address it is provided with
is not from the 'vmap' pool, and will return 0.

> 
> > --- a/xen/include/xen/vmap.h
> > +++ b/xen/include/xen/vmap.h
> > @@ -12,9 +12,23 @@ void *__vmap(const mfn_t *mfn, unsigned int granularity,
> >  void *vmap(const mfn_t *mfn, unsigned int nr);
> >  void vunmap(const void *);
> >  void *vmalloc(size_t size);
> > +
> > +/*
> > + * Callback for vmalloc_cb to use when vmap-ing.
> > + */
> 
> Comment style.
> 
> > +typedef void *(vmap_cb_t)(const mfn_t *mfn, unsigned int pages);
> 
> Stray parentheses (again).

<sigh> I swore I fixed it and then did a pass through the rest to fix
others!

But a grep tells me:

+typedef void *(vmap_cb_t)(const mfn_t *mfn, unsigned int pages);
+typedef void (vfree_cb_t)(void *va, unsigned int pages);
+typedef int (find_space_t)(size_t, unsigned long *, unsigned long *);
+typedef void (*xsplice_loadcall_t)(void);
+typedef void (*xsplice_unloadcall_t)(void);

I need to fix those. Sorry about that - I really thought I had them fixed.

> 
> Jan
>
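
For reference, the "stray parentheses" fix amounts to dropping the
parentheses around the typedef'd name in the function-type typedefs;
the function-pointer typedefs need theirs:

/* Function-type typedefs - the parentheses around the name are stray: */
typedef void *vmap_cb_t(const mfn_t *mfn, unsigned int pages);
typedef void vfree_cb_t(void *va, unsigned int pages);
typedef int find_space_t(size_t, unsigned long *, unsigned long *);
/* Function-pointer typedefs such as (*xsplice_loadcall_t) keep theirs. */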
Jan Beulich March 31, 2016, 6:46 a.m. UTC | #3
>>> On 30.03.16 at 18:44, <konrad.wilk@oracle.com> wrote:
> On Wed, Mar 30, 2016 at 10:24:41AM -0600, Jan Beulich wrote:
>> >>> On 24.03.16 at 21:00, <konrad.wilk@oracle.com> wrote:
>> > @@ -266,16 +275,15 @@ void *vzalloc(size_t size)
>> >      return p;
>> >  }
>> >  
>> > -void vfree(void *va)
>> > +void vfree_cb(void *va, unsigned int pages, vfree_cb_t *vfree_cb_fnc)
>> 
>> Just to repeat: This "caller provides size" worries me, the more that
>> this doesn't mirror anything the allocation side does. Would you mind
>> providing a case where using vm_size() instead is not correct?
> 
> When the virtual addresses (to which we will stitch the pages allocated
> by vmalloc) are not allocated (provided?) by vmap.
> 
> vm_size() will be very unhappy if the virtual address it is provided with
> is not from the 'vmap' pool, and will return 0.

Hmm, okay, I now see that I got misled by "This allows users (such
as xSplice) to provide their own mechanism to set the page flags",
which suggested to me that all you want is control over those flags.
Now that I look again I realize that I could have easily spotted the
actual intention right away. If all you really want to re-use from the
existing vmalloc() is the allocation mechanism of the set of pages,
then no, I don't think this should be done by playing with vmalloc().
Just have your caller do that allocation itself.

If, otoh, you left that VA management to (an extended version of)
vmap(), by e.g. allowing the caller to request allocation from a
different VA range (much like iirc x86-64 Linux handles its modules
address range allocation), things would be different. After all the
VA management is the important part here, while the backing
memory allocation is just a trivial auxiliary operation.

Jan
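
A hypothetical shape of that alternative - the identifiers below are
illustrative only, not an API this series (or the tree) defines:

/* Let the caller pick a VA region instead of passing a map callback. */
enum vmap_region {
    VMAP_DEFAULT,   /* the existing vmap range */
    VMAP_XEN,       /* e.g. a range near the Xen image, for payloads */
};

void *vmap_type(enum vmap_region type, const mfn_t *mfn, unsigned int nr);
void *vmalloc_type(enum vmap_region type, size_t size);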
Konrad Rzeszutek Wilk March 31, 2016, 11:49 a.m. UTC | #4
On Thu, Mar 31, 2016 at 12:46:24AM -0600, Jan Beulich wrote:
> >>> On 30.03.16 at 18:44, <konrad.wilk@oracle.com> wrote:
> > On Wed, Mar 30, 2016 at 10:24:41AM -0600, Jan Beulich wrote:
> >> >>> On 24.03.16 at 21:00, <konrad.wilk@oracle.com> wrote:
> >> > @@ -266,16 +275,15 @@ void *vzalloc(size_t size)
> >> >      return p;
> >> >  }
> >> >  
> >> > -void vfree(void *va)
> >> > +void vfree_cb(void *va, unsigned int pages, vfree_cb_t *vfree_cb_fnc)
> >> 
> >> Just to repeat: This "caller provides size" worries me, the more that
> >> this doesn't mirror anything the allocation side does. Would you mind
> >> providing a case where using vm_size() instead is not correct?
> > 
> > When the virtual addresses (to which we will stitch the pages allocated
> > by vmalloc) are not allocated (provided?) by vmap.
> > 
> > vm_size() will be very unhappy if the virtual address it is provided with
> > is not from the 'vmap' pool, and will return 0.
> 
> Hmm, okay, I now see that I got misled by "This allows users (such
> as xSplice) to provide their own mechanism to set the page flags",
> which suggested to me that all you want is control over those flags.

Sorry about that! That was the initial approach, which didn't work out,
as the displacement from the kernel virtual address ranges to the vmap
ones was 34 bits. And that broke the ELF relocations, which were 32-bit!


> Now that I look again I realize that I could have easily spotted the
> actual intention right away. If all you really want to re-use from the
> existing vmalloc() is the allocation mechanism of the set of pages,
> then no, I don't think this should be done by playing with vmalloc().
> Just have your caller do that allocation itself.

Like it was in v4 :-)

I will drop this and rework the patch that does the backing memory operation
to use its own version.
> 
> If, otoh, you left that VA management to (an extended version of)
> vmap(), by e.g. allowing the caller to request allocation from a
> different VA range (much like iirc x86-64 Linux handles its modules
> address range allocation), things would be different. After all the
> VA management is the important part here, while the backing
> memory allocation is just a trivial auxiliary operation.
> 
> Jan
>
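
The rework Konrad mentions would move the page-allocation loop (visible in
the hunk below) into the caller. A rough sketch, reusing the hypothetical
payload_vmap_cb from the earlier sketch and mirroring vmalloc()'s own
error path:

static void *payload_pages_alloc(unsigned int pages, mfn_t **mfn_out)
{
    mfn_t *mfn = xmalloc_array(mfn_t, pages);
    unsigned int i;
    void *va;

    if ( !mfn )
        return NULL;

    /* Same allocation loop vmalloc() uses, now owned by the caller. */
    for ( i = 0; i < pages; i++ )
    {
        struct page_info *pg = alloc_domheap_page(NULL, 0);

        if ( !pg )
            goto error;
        mfn[i] = _mfn(page_to_mfn(pg));
    }

    va = payload_vmap_cb(mfn, pages);
    if ( va )
    {
        *mfn_out = mfn;
        return va;
    }

 error:
    while ( i-- )
        free_domheap_page(mfn_to_page(mfn_x(mfn[i])));
    xfree(mfn);
    return NULL;
}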
Patch

diff --git a/xen/common/vmap.c b/xen/common/vmap.c
index 134eda0..08e7859 100644
--- a/xen/common/vmap.c
+++ b/xen/common/vmap.c
@@ -216,7 +216,7 @@ void vunmap(const void *va)
     vm_free(va);
 }
 
-void *vmalloc(size_t size)
+void *vmalloc_cb(size_t size, vmap_cb_t *vmap_cb, mfn_t **mfn_array)
 {
     mfn_t *mfn;
     size_t pages, i;
@@ -238,11 +238,15 @@ void *vmalloc(size_t size)
         mfn[i] = _mfn(page_to_mfn(pg));
     }
 
-    va = vmap(mfn, pages);
+    va = vmap_cb ? vmap_cb(mfn, pages) : vmap(mfn, pages);
     if ( va == NULL )
         goto error;
 
-    xfree(mfn);
+    if ( mfn_array )
+        *mfn_array = mfn;
+    else
+        xfree(mfn);
+
     return va;
 
  error:
@@ -252,6 +256,11 @@ void *vmalloc(size_t size)
     return NULL;
 }
 
+void *vmalloc(size_t size)
+{
+    return vmalloc_cb(size, NULL, NULL);
+}
+
 void *vzalloc(size_t size)
 {
     void *p = vmalloc(size);
@@ -266,16 +275,15 @@ void *vzalloc(size_t size)
     return p;
 }
 
-void vfree(void *va)
+void vfree_cb(void *va, unsigned int pages, vfree_cb_t *vfree_cb_fnc)
 {
-    unsigned int i, pages;
+    unsigned int i;
     struct page_info *pg;
     PAGE_LIST_HEAD(pg_list);
 
     if ( !va )
         return;
 
-    pages = vm_size(va);
     ASSERT(pages);
 
     for ( i = 0; i < pages; i++ )
@@ -285,9 +293,20 @@ void vfree(void *va)
         ASSERT(page);
         page_list_add(page, &pg_list);
     }
-    vunmap(va);
+    if ( !vfree_cb_fnc )
+        vunmap(va);
+    else
+        vfree_cb_fnc(va, pages);
 
     while ( (pg = page_list_remove_head(&pg_list)) != NULL )
         free_domheap_page(pg);
 }
+
+void vfree(void *va)
+{
+    if ( !va )
+        return;
+
+    vfree_cb(va, vm_size(va), NULL);
+}
 #endif
diff --git a/xen/include/xen/vmap.h b/xen/include/xen/vmap.h
index 5671ac8..02cadb3 100644
--- a/xen/include/xen/vmap.h
+++ b/xen/include/xen/vmap.h
@@ -12,9 +12,23 @@ void *__vmap(const mfn_t *mfn, unsigned int granularity,
 void *vmap(const mfn_t *mfn, unsigned int nr);
 void vunmap(const void *);
 void *vmalloc(size_t size);
+
+/*
+ * Callback for vmalloc_cb to use when vmap-ing.
+ */
+typedef void *(vmap_cb_t)(const mfn_t *mfn, unsigned int pages);
+void *vmalloc_cb(size_t size, vmap_cb_t *vmap_cb, mfn_t **);
+
 void *vzalloc(size_t size);
 void vfree(void *va);
 
+/*
+ * Callback for vfree to use an equivalent of vmap_cb_t
+ * when tearing down.
+ */
+typedef void (vfree_cb_t)(void *va, unsigned int pages);
+void vfree_cb(void *va, unsigned int pages, vfree_cb_t *vfree_cb_fnc);
+
 void __iomem *ioremap(paddr_t, size_t);
 
 static inline void iounmap(void __iomem *va)
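
For completeness, a hypothetical caller of the new pair, showing why
vfree_cb() takes the page count explicitly: vm_size() returns 0 for a
virtual address that did not come from the vmap pool, per the thread
above. destroy_payload_mapping() is an imaginary stand-in for the
caller's own teardown code:

/* Matches vfree_cb_t; undoes whatever the caller's vmap_cb set up. */
static void payload_vfree_cb(void *va, unsigned int pages)
{
    destroy_payload_mapping(va, pages);   /* hypothetical helper */
}

static void payload_free(void *va, unsigned int pages, mfn_t *mfn)
{
    /* The caller must remember 'pages'; vm_size(va) cannot recover it.
     * vfree_cb() itself frees the backing domheap pages. */
    vfree_cb(va, pages, payload_vfree_cb);

    /* Free the MFN array squirrelled away by vmalloc_cb(). */
    xfree(mfn);
}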