diff mbox

[v12,05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

Message ID 20171017132432.24093-6-paul.durrant@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Paul Durrant Oct. 17, 2017, 1:24 p.m. UTC
Certain memory resources associated with a guest are not necessarily
present in the guest P2M.

This patch adds the boilerplate for new memory op to allow such a resource
to be priv-mapped directly, by either a PV or HVM tools domain.

NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,
      I have no means to test it on an ARM platform and so cannot verify
      that it functions correctly.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Julien Grall <julien.grall@arm.com>

v12:
 - Addressed more comments form Jan.
 - Removed #ifdef CONFIG_X86 from common code and instead introduced a
   stub set_foreign_p2m_entry() in asm-arm/p2m.h returning -EOPNOTSUPP.
 - Restricted mechanism for querying implementation limit on nr_frames
   and simplified compat code.

v11:
 - Addressed more comments from Jan.

v9:
 - Addressed more comments from Jan.

v8:
 - Move the code into common as requested by Jan.
 - Make the gmfn_list handle a 64-bit type to avoid limiting the MFN
   range for a 32-bit tools domain.
 - Add missing pad.
 - Add compat code.
 - Make this patch deal with purely boilerplate.
 - Drop George's A-b and Wei's R-b because the changes are non-trivial,
   and update Cc list now the boilerplate is common.

v5:
 - Switched __copy_to/from_guest_offset() to copy_to/from_guest_offset().
---
 tools/flask/policy/modules/xen.if   |  4 +-
 xen/arch/x86/mm/p2m.c               |  3 +-
 xen/common/compat/memory.c          | 95 +++++++++++++++++++++++++++++++++++++
 xen/common/memory.c                 | 94 ++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/p2m.h           |  6 +++
 xen/include/asm-x86/p2m.h           |  3 ++
 xen/include/public/memory.h         | 43 ++++++++++++++++-
 xen/include/xlat.lst                |  1 +
 xen/include/xsm/dummy.h             |  6 +++
 xen/include/xsm/xsm.h               |  6 +++
 xen/xsm/dummy.c                     |  1 +
 xen/xsm/flask/hooks.c               |  6 +++
 xen/xsm/flask/policy/access_vectors |  2 +
 13 files changed, 266 insertions(+), 4 deletions(-)

Comments

Daniel De Graaf Oct. 17, 2017, 2:45 p.m. UTC | #1
On 10/17/2017 09:24 AM, Paul Durrant wrote:
> Certain memory resources associated with a guest are not necessarily
> present in the guest P2M.
> 
> This patch adds the boilerplate for new memory op to allow such a resource
> to be priv-mapped directly, by either a PV or HVM tools domain.
> 
> NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,
>        I have no means to test it on an ARM platform and so cannot verify
>        that it functions correctly.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Julien Grall Oct. 19, 2017, 12:22 p.m. UTC | #2
Hi,

On 17/10/17 14:24, Paul Durrant wrote:
> Certain memory resources associated with a guest are not necessarily
> present in the guest P2M.
> 
> This patch adds the boilerplate for new memory op to allow such a resource
> to be priv-mapped directly, by either a PV or HVM tools domain.
> 
> NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,
>        I have no means to test it on an ARM platform and so cannot verify
>        that it functions correctly.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---

[...]

> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index ad987e0f29..cdd2e030cf 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -965,6 +965,95 @@ static long xatp_permission_check(struct domain *d, unsigned int space)

[...]

> +    if ( rc )
> +        goto out;
> +
> +    if ( !paging_mode_translate(currd) )
> +    {
> +        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> +            rc = -EFAULT;
> +    }
> +    else
> +    {
> +        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
> +        unsigned int i;
> +
> +        rc = -EFAULT;
> +        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> +            goto out;
> +
> +        for ( i = 0; i < xmar.nr_frames; i++ )
> +        {
> +            rc = set_foreign_p2m_entry(currd, gfn_list[i],
> +                                       _mfn(mfn_list[i]));

Something looks a bit odd to me here. When I read foreign mapping, I 
directly associate to mapping from a foreign domain.

On Arm, we will always get a reference on that page to prevent it 
disappearing if the foreign domain is destroyed but the mapping is still 
present.

This reference will either be put with an unmapped hypercall or while 
teardown the domain.

Per my understanding, this MFN does not belong to any domain (or at 
least currd). Right? So there is no way to get/put a reference on that 
page. So I am unconvinced that this is very safe.

Also looking at the x86 side, I can't find such reference in the foreign 
path in p2m_add_foreign. Did I miss anything?

Note that x86 does not handle p2m teardown with foreign map at the 
moment (see p2m_add_foreign).

You are by-passing this check and I can't see how this would be safe for 
the x86 side too.

> +            if ( rc )
> +            {
> +                /*
> +                 * Make sure rc is -EIO for any interation other than
> +                 * the first.
> +                 */
> +                rc = (i != 0) ? -EIO : rc;
> +                break;
> +            }
> +        }
> +    }
> +
> + out:
> +    rcu_unlock_domain(d);
> +    return rc;
> +}
> +
>   long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>   {
>       struct domain *d, *curr_d = current->domain;
> @@ -1406,6 +1495,11 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>       }
>   #endif
>   
> +    case XENMEM_acquire_resource:
> +        rc = acquire_resource(
> +            guest_handle_cast(arg, xen_mem_acquire_resource_t));
> +        break;
> +
>       default:
>           rc = arch_memory_op(cmd, arg);
>           break;
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index faadcfe8fe..a5caa747ce 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -346,6 +346,12 @@ static inline gfn_t gfn_next_boundary(gfn_t gfn, unsigned int order)
>       return gfn_add(gfn, 1UL << order);
>   }
>   
> +static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,

Please modifify the prototype to use gfn_t.

> +                                        mfn_t mfn)
> +{
> +    return -EOPNOTSUPP;
> +} > +
>   #endif /* _XEN_P2M_H */
>   

[...]

> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 29386df98b..18118ea5c6 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -599,6 +599,47 @@ struct xen_reserved_device_memory_map {
>   typedef struct xen_reserved_device_memory_map xen_reserved_device_memory_map_t;
>   DEFINE_XEN_GUEST_HANDLE(xen_reserved_device_memory_map_t);
>   
> +/*
> + * Get the pages for a particular guest resource, so that they can be
> + * mapped directly by a tools domain.
> + */
> +#define XENMEM_acquire_resource 28
> +struct xen_mem_acquire_resource {
> +    /* IN - the domain whose resource is to be mapped */
> +    domid_t domid;
> +    /* IN - the type of resource */
> +    uint16_t type;
> +    /*
> +     * IN - a type-specific resource identifier, which must be zero
> +     *      unless stated otherwise.
> +     */
> +    uint32_t id;
> +    /* IN/OUT - As an IN parameter number of frames of the resource

Coding style:

/*
  *

> +     *          to be mapped. However, if the specified value is 0 and
> +     *          frame_list is NULL then this field will be set to the
> +     *          maximum value supported by the implementation on return.
> +     */
> +    uint32_t nr_frames;
> +    uint32_t pad;
> +    /* IN - the index of the initial frame to be mapped. This parameter

Ditto

> +     *      is ignored if nr_frames is 0.
> +     */
> +    uint64_aligned_t frame;
> +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list

Ditto
> +     *          will be populated with the MFNs of the resource.
> +     *          If the tools domain is HVM then it is expected that, on
> +     *          entry, frame_list will be populated with a list of GFNs
> +     *          that will be mapped to the MFNs of the resource.
> +     *          If -EIO is returned then the frame_list has only been
> +     *          partially mapped and it is up to the caller to unmap all
> +     *          the GFNs.
> +     *          This parameter may be NULL if nr_frames is 0.
> +     */
> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
> +};
> +typedef struct xen_mem_acquire_resource xen_mem_acquire_resource_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_mem_acquire_resource_t);
> +
>   #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
>   
>   /*

Cheers,
Paul Durrant Oct. 19, 2017, 12:57 p.m. UTC | #3
> -----Original Message-----

> From: Julien Grall [mailto:julien.grall@linaro.org]

> Sent: 19 October 2017 13:23

> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org

> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu

> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>;

> George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper

> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Tim

> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Jan Beulich

> <jbeulich@suse.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Roger

> Pau Monne <roger.pau@citrix.com>

> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> HYPERVISOR_memory_op to acquire guest resources

> 

> Hi,

> 

> On 17/10/17 14:24, Paul Durrant wrote:

> > Certain memory resources associated with a guest are not necessarily

> > present in the guest P2M.

> >

> > This patch adds the boilerplate for new memory op to allow such a

> resource

> > to be priv-mapped directly, by either a PV or HVM tools domain.

> >

> > NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,

> >        I have no means to test it on an ARM platform and so cannot verify

> >        that it functions correctly.

> >

> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

> > ---

> 

> [...]

> 

> > diff --git a/xen/common/memory.c b/xen/common/memory.c

> > index ad987e0f29..cdd2e030cf 100644

> > --- a/xen/common/memory.c

> > +++ b/xen/common/memory.c

> > @@ -965,6 +965,95 @@ static long xatp_permission_check(struct domain

> *d, unsigned int space)

> 

> [...]

> 

> > +    if ( rc )

> > +        goto out;

> > +

> > +    if ( !paging_mode_translate(currd) )

> > +    {

> > +        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )

> > +            rc = -EFAULT;

> > +    }

> > +    else

> > +    {

> > +        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];

> > +        unsigned int i;

> > +

> > +        rc = -EFAULT;

> > +        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )

> > +            goto out;

> > +

> > +        for ( i = 0; i < xmar.nr_frames; i++ )

> > +        {

> > +            rc = set_foreign_p2m_entry(currd, gfn_list[i],

> > +                                       _mfn(mfn_list[i]));

> 

> Something looks a bit odd to me here. When I read foreign mapping, I

> directly associate to mapping from a foreign domain.

> 

> On Arm, we will always get a reference on that page to prevent it

> disappearing if the foreign domain is destroyed but the mapping is still

> present.

> 

> This reference will either be put with an unmapped hypercall or while

> teardown the domain.

> 

> Per my understanding, this MFN does not belong to any domain (or at

> least currd). Right?


No. The mfns do belong to the target domain.

> So there is no way to get/put a reference on that

> page. So I am unconvinced that this is very safe.

> 

> Also looking at the x86 side, I can't find such reference in the foreign

> path in p2m_add_foreign. Did I miss anything?


No, I don't think there is any reference counting there... but this is no different to priv mapping. I'm not trying to fix the mapping infrastructure at this point.

> 

> Note that x86 does not handle p2m teardown with foreign map at the

> moment (see p2m_add_foreign).

> 

> You are by-passing this check and I can't see how this would be safe for

> the x86 side too.

> 


I don't follow. What check am I by-passing that is covered when priv mapping?

> > +            if ( rc )

> > +            {

> > +                /*

> > +                 * Make sure rc is -EIO for any interation other than

> > +                 * the first.

> > +                 */

> > +                rc = (i != 0) ? -EIO : rc;

> > +                break;

> > +            }

> > +        }

> > +    }

> > +

> > + out:

> > +    rcu_unlock_domain(d);

> > +    return rc;

> > +}

> > +

> >   long do_memory_op(unsigned long cmd,

> XEN_GUEST_HANDLE_PARAM(void) arg)

> >   {

> >       struct domain *d, *curr_d = current->domain;

> > @@ -1406,6 +1495,11 @@ long do_memory_op(unsigned long cmd,

> XEN_GUEST_HANDLE_PARAM(void) arg)

> >       }

> >   #endif

> >

> > +    case XENMEM_acquire_resource:

> > +        rc = acquire_resource(

> > +            guest_handle_cast(arg, xen_mem_acquire_resource_t));

> > +        break;

> > +

> >       default:

> >           rc = arch_memory_op(cmd, arg);

> >           break;

> > diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h

> > index faadcfe8fe..a5caa747ce 100644

> > --- a/xen/include/asm-arm/p2m.h

> > +++ b/xen/include/asm-arm/p2m.h

> > @@ -346,6 +346,12 @@ static inline gfn_t gfn_next_boundary(gfn_t gfn,

> unsigned int order)

> >       return gfn_add(gfn, 1UL << order);

> >   }

> >

> > +static inline int set_foreign_p2m_entry(struct domain *d, unsigned long

> gfn,

> 

> Please modifify the prototype to use gfn_t.

> 


<sigh> I only put this stub in to match x86 so I can avoid the #ifdefs that Jan objects to. Which other bits of the universe do I need to fix?

> > +                                        mfn_t mfn)

> > +{

> > +    return -EOPNOTSUPP;

> > +} > +

> >   #endif /* _XEN_P2M_H */

> >

> 

> [...]

> 

> > diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h

> > index 29386df98b..18118ea5c6 100644

> > --- a/xen/include/public/memory.h

> > +++ b/xen/include/public/memory.h

> > @@ -599,6 +599,47 @@ struct xen_reserved_device_memory_map {

> >   typedef struct xen_reserved_device_memory_map

> xen_reserved_device_memory_map_t;

> >   DEFINE_XEN_GUEST_HANDLE(xen_reserved_device_memory_map_t);

> >

> > +/*

> > + * Get the pages for a particular guest resource, so that they can be

> > + * mapped directly by a tools domain.

> > + */

> > +#define XENMEM_acquire_resource 28

> > +struct xen_mem_acquire_resource {

> > +    /* IN - the domain whose resource is to be mapped */

> > +    domid_t domid;

> > +    /* IN - the type of resource */

> > +    uint16_t type;

> > +    /*

> > +     * IN - a type-specific resource identifier, which must be zero

> > +     *      unless stated otherwise.

> > +     */

> > +    uint32_t id;

> > +    /* IN/OUT - As an IN parameter number of frames of the resource

> 

> Coding style:

> 

> /*

>   *


Ok.

> 

> > +     *          to be mapped. However, if the specified value is 0 and

> > +     *          frame_list is NULL then this field will be set to the

> > +     *          maximum value supported by the implementation on return.

> > +     */

> > +    uint32_t nr_frames;

> > +    uint32_t pad;

> > +    /* IN - the index of the initial frame to be mapped. This parameter

> 

> Ditto

> 


Ok.

> > +     *      is ignored if nr_frames is 0.

> > +     */

> > +    uint64_aligned_t frame;

> > +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list

> 

> Ditto

> > +     *          will be populated with the MFNs of the resource.

> > +     *          If the tools domain is HVM then it is expected that, on

> > +     *          entry, frame_list will be populated with a list of GFNs

> > +     *          that will be mapped to the MFNs of the resource.

> > +     *          If -EIO is returned then the frame_list has only been

> > +     *          partially mapped and it is up to the caller to unmap all

> > +     *          the GFNs.

> > +     *          This parameter may be NULL if nr_frames is 0.

> > +     */

> > +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;

> > +};

> > +typedef struct xen_mem_acquire_resource

> xen_mem_acquire_resource_t;

> > +DEFINE_XEN_GUEST_HANDLE(xen_mem_acquire_resource_t);

> > +

> >   #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */

> >

> >   /*

>


Sorry to be getting frustrated with this, but I'm wondering how many more colours I need to paint this bike-shed.

  Paul


> Cheers,

> 

> --

> Julien Grall
Julien Grall Oct. 19, 2017, 1:29 p.m. UTC | #4
Hi,

On 10/19/2017 01:57 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall [mailto:julien.grall@linaro.org]
>> Sent: 19 October 2017 13:23
>> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
>> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
>> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>;
>> George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Tim
>> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Jan Beulich
>> <jbeulich@suse.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Roger
>> Pau Monne <roger.pau@citrix.com>
>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
>> HYPERVISOR_memory_op to acquire guest resources
>>
>> Hi,
>>
>> On 17/10/17 14:24, Paul Durrant wrote:
>>> Certain memory resources associated with a guest are not necessarily
>>> present in the guest P2M.
>>>
>>> This patch adds the boilerplate for new memory op to allow such a
>> resource
>>> to be priv-mapped directly, by either a PV or HVM tools domain.
>>>
>>> NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,
>>>         I have no means to test it on an ARM platform and so cannot verify
>>>         that it functions correctly.
>>>
>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>> ---
>>
>> [...]
>>
>>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>>> index ad987e0f29..cdd2e030cf 100644
>>> --- a/xen/common/memory.c
>>> +++ b/xen/common/memory.c
>>> @@ -965,6 +965,95 @@ static long xatp_permission_check(struct domain
>> *d, unsigned int space)
>>
>> [...]
>>
>>> +    if ( rc )
>>> +        goto out;
>>> +
>>> +    if ( !paging_mode_translate(currd) )
>>> +    {
>>> +        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
>>> +            rc = -EFAULT;
>>> +    }
>>> +    else
>>> +    {
>>> +        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
>>> +        unsigned int i;
>>> +
>>> +        rc = -EFAULT;
>>> +        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
>>> +            goto out;
>>> +
>>> +        for ( i = 0; i < xmar.nr_frames; i++ )
>>> +        {
>>> +            rc = set_foreign_p2m_entry(currd, gfn_list[i],
>>> +                                       _mfn(mfn_list[i]));
>>
>> Something looks a bit odd to me here. When I read foreign mapping, I
>> directly associate to mapping from a foreign domain.
>>
>> On Arm, we will always get a reference on that page to prevent it
>> disappearing if the foreign domain is destroyed but the mapping is still
>> present.
>>
>> This reference will either be put with an unmapped hypercall or while
>> teardown the domain.
>>
>> Per my understanding, this MFN does not belong to any domain (or at
>> least currd). Right?
>  
> No. The mfns do belong to the target domain.

To be fully safe, you need to take a reference on each page you mapped. 
So who is going to get a reference on them? Who is going to drop that?

> 
>> So there is no way to get/put a reference on that
>> page. So I am unconvinced that this is very safe.
>>
>> Also looking at the x86 side, I can't find such reference in the foreign
>> path in p2m_add_foreign. Did I miss anything?
> 
> No, I don't think there is any reference counting there... but this is no different to priv mapping. I'm not trying to fix the mapping infrastructure at this point.
> 
>>
>> Note that x86 does not handle p2m teardown with foreign map at the
>> moment (see p2m_add_foreign).
>>
>> You are by-passing this check and I can't see how this would be safe for
>> the x86 side too.
>>
> 
> I don't follow. What check am I by-passing that is covered when priv mapping?

     /*
      * hvm fixme: until support is added to p2m teardown code to 
cleanup any
      * foreign entries, limit this to hardware domain only.
      */

How this is safe with your new solution? That looks like a regression...

[...]

>>> +     *          will be populated with the MFNs of the resource.
>>> +     *          If the tools domain is HVM then it is expected that, on
>>> +     *          entry, frame_list will be populated with a list of GFNs
>>> +     *          that will be mapped to the MFNs of the resource.
>>> +     *          If -EIO is returned then the frame_list has only been
>>> +     *          partially mapped and it is up to the caller to unmap all
>>> +     *          the GFNs.
>>> +     *          This parameter may be NULL if nr_frames is 0.
>>> +     */
>>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
>>> +};
>>> +typedef struct xen_mem_acquire_resource
>> xen_mem_acquire_resource_t;
>>> +DEFINE_XEN_GUEST_HANDLE(xen_mem_acquire_resource_t);
>>> +
>>>    #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
>>>
>>>    /*
>>
> 
> Sorry to be getting frustrated with this, but I'm wondering how many more colours I need to paint this bike-shed.

I don't know how x86 looks like and maybe this is fine for Andrew and 
Jan. But for Arm, it does not look correct.

To give you an idea, my first thought to implement your newly wrongly 
named function was to just call p2m_set_entry with p2m_map_foreign. But 
from this discussion it would look plain wrong.

So this means the interface is not clear enough.

Cheers,
Paul Durrant Oct. 19, 2017, 1:35 p.m. UTC | #5
> -----Original Message-----

> From: Julien Grall [mailto:julien.grall@linaro.org]

> Sent: 19 October 2017 14:29

> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org

> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu

> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>;

> George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper

> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Tim

> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Jan Beulich

> <jbeulich@suse.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Roger

> Pau Monne <roger.pau@citrix.com>

> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> HYPERVISOR_memory_op to acquire guest resources

> 

> Hi,

> 

> On 10/19/2017 01:57 PM, Paul Durrant wrote:

> >> -----Original Message-----

> >> From: Julien Grall [mailto:julien.grall@linaro.org]

> >> Sent: 19 October 2017 13:23

> >> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-

> devel@lists.xenproject.org

> >> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu

> >> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk

> <konrad.wilk@oracle.com>;

> >> George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper

> >> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;

> Tim

> >> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Jan

> Beulich

> >> <jbeulich@suse.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Roger

> >> Pau Monne <roger.pau@citrix.com>

> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> >> HYPERVISOR_memory_op to acquire guest resources

> >>

> >> Hi,

> >>

> >> On 17/10/17 14:24, Paul Durrant wrote:

> >>> Certain memory resources associated with a guest are not necessarily

> >>> present in the guest P2M.

> >>>

> >>> This patch adds the boilerplate for new memory op to allow such a

> >> resource

> >>> to be priv-mapped directly, by either a PV or HVM tools domain.

> >>>

> >>> NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,

> >>>         I have no means to test it on an ARM platform and so cannot verify

> >>>         that it functions correctly.

> >>>

> >>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

> >>> ---

> >>

> >> [...]

> >>

> >>> diff --git a/xen/common/memory.c b/xen/common/memory.c

> >>> index ad987e0f29..cdd2e030cf 100644

> >>> --- a/xen/common/memory.c

> >>> +++ b/xen/common/memory.c

> >>> @@ -965,6 +965,95 @@ static long xatp_permission_check(struct

> domain

> >> *d, unsigned int space)

> >>

> >> [...]

> >>

> >>> +    if ( rc )

> >>> +        goto out;

> >>> +

> >>> +    if ( !paging_mode_translate(currd) )

> >>> +    {

> >>> +        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )

> >>> +            rc = -EFAULT;

> >>> +    }

> >>> +    else

> >>> +    {

> >>> +        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];

> >>> +        unsigned int i;

> >>> +

> >>> +        rc = -EFAULT;

> >>> +        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )

> >>> +            goto out;

> >>> +

> >>> +        for ( i = 0; i < xmar.nr_frames; i++ )

> >>> +        {

> >>> +            rc = set_foreign_p2m_entry(currd, gfn_list[i],

> >>> +                                       _mfn(mfn_list[i]));

> >>

> >> Something looks a bit odd to me here. When I read foreign mapping, I

> >> directly associate to mapping from a foreign domain.

> >>

> >> On Arm, we will always get a reference on that page to prevent it

> >> disappearing if the foreign domain is destroyed but the mapping is still

> >> present.

> >>

> >> This reference will either be put with an unmapped hypercall or while

> >> teardown the domain.

> >>

> >> Per my understanding, this MFN does not belong to any domain (or at

> >> least currd). Right?

> >

> > No. The mfns do belong to the target domain.

> 

> To be fully safe, you need to take a reference on each page you mapped.

> So who is going to get a reference on them? Who is going to drop that?

> 


Yes, that's true but it's also true of priv mapping AIUI. I think the correct fix is to deal with this in set_p2m_foreign_entry() so that it is fixed for both cases. I don't think it is something that ought to be addressed here... unless I'm missing something.

> >

> >> So there is no way to get/put a reference on that

> >> page. So I am unconvinced that this is very safe.

> >>

> >> Also looking at the x86 side, I can't find such reference in the foreign

> >> path in p2m_add_foreign. Did I miss anything?

> >

> > No, I don't think there is any reference counting there... but this is no

> different to priv mapping. I'm not trying to fix the mapping infrastructure at

> this point.

> >

> >>

> >> Note that x86 does not handle p2m teardown with foreign map at the

> >> moment (see p2m_add_foreign).

> >>

> >> You are by-passing this check and I can't see how this would be safe for

> >> the x86 side too.

> >>

> >

> > I don't follow. What check am I by-passing that is covered when priv

> mapping?

> 

>      /*

>       * hvm fixme: until support is added to p2m teardown code to

> cleanup any

>       * foreign entries, limit this to hardware domain only.

>       */

> 

> How this is safe with your new solution? That looks like a regression...


Well, the new hypercall is tools-only but I can add the extra check for the hardware domain although it's probably redundant in practice.

> 

> [...]

> 

> >>> +     *          will be populated with the MFNs of the resource.

> >>> +     *          If the tools domain is HVM then it is expected that, on

> >>> +     *          entry, frame_list will be populated with a list of GFNs

> >>> +     *          that will be mapped to the MFNs of the resource.

> >>> +     *          If -EIO is returned then the frame_list has only been

> >>> +     *          partially mapped and it is up to the caller to unmap all

> >>> +     *          the GFNs.

> >>> +     *          This parameter may be NULL if nr_frames is 0.

> >>> +     */

> >>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;

> >>> +};

> >>> +typedef struct xen_mem_acquire_resource

> >> xen_mem_acquire_resource_t;

> >>> +DEFINE_XEN_GUEST_HANDLE(xen_mem_acquire_resource_t);

> >>> +

> >>>    #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */

> >>>

> >>>    /*

> >>

> >

> > Sorry to be getting frustrated with this, but I'm wondering how many more

> colours I need to paint this bike-shed.

> 

> I don't know how x86 looks like and maybe this is fine for Andrew and

> Jan. But for Arm, it does not look correct.

> 

> To give you an idea, my first thought to implement your newly wrongly

> named function was to just call p2m_set_entry with p2m_map_foreign. But

> from this discussion it would look plain wrong.

> 

> So this means the interface is not clear enough.


I'd prefer to make the whole thing x86-only since that's the only platform on which I can test it, and indeed the code used to be x86-only. Jan objected to this so all I'm trying to achieve is that it builds for ARM. Please can you and Jan reach agreement on where the code should live and how, if at all, it should be #ifdef-ed?

  Paul

> 

> Cheers,

> 

> --

> Julien Grall
Julien Grall Oct. 19, 2017, 2:12 p.m. UTC | #6
Hi,

On 19/10/17 14:35, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall [mailto:julien.grall@linaro.org]
>> Sent: 19 October 2017 14:29
>> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
>> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
>> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>;
>> George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Tim
>> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Jan Beulich
>> <jbeulich@suse.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Roger
>> Pau Monne <roger.pau@citrix.com>
>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
>> HYPERVISOR_memory_op to acquire guest resources
>>
>> Hi,
>>
>> On 10/19/2017 01:57 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Julien Grall [mailto:julien.grall@linaro.org]
>>>> Sent: 19 October 2017 13:23
>>>> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-
>> devel@lists.xenproject.org
>>>> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
>>>> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk
>> <konrad.wilk@oracle.com>;
>>>> George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper
>>>> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
>> Tim
>>>> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Jan
>> Beulich
>>>> <jbeulich@suse.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Roger
>>>> Pau Monne <roger.pau@citrix.com>
>>>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
>>>> HYPERVISOR_memory_op to acquire guest resources
>>>>
>>>> Hi,
>>>>
>>>> On 17/10/17 14:24, Paul Durrant wrote:
>>>>> Certain memory resources associated with a guest are not necessarily
>>>>> present in the guest P2M.
>>>>>
>>>>> This patch adds the boilerplate for new memory op to allow such a
>>>> resource
>>>>> to be priv-mapped directly, by either a PV or HVM tools domain.
>>>>>
>>>>> NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,
>>>>>          I have no means to test it on an ARM platform and so cannot verify
>>>>>          that it functions correctly.
>>>>>
>>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>>> ---
>>>>
>>>> [...]
>>>>
>>>>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>>>>> index ad987e0f29..cdd2e030cf 100644
>>>>> --- a/xen/common/memory.c
>>>>> +++ b/xen/common/memory.c
>>>>> @@ -965,6 +965,95 @@ static long xatp_permission_check(struct
>> domain
>>>> *d, unsigned int space)
>>>>
>>>> [...]
>>>>
>>>>> +    if ( rc )
>>>>> +        goto out;
>>>>> +
>>>>> +    if ( !paging_mode_translate(currd) )
>>>>> +    {
>>>>> +        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
>>>>> +            rc = -EFAULT;
>>>>> +    }
>>>>> +    else
>>>>> +    {
>>>>> +        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
>>>>> +        unsigned int i;
>>>>> +
>>>>> +        rc = -EFAULT;
>>>>> +        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
>>>>> +            goto out;
>>>>> +
>>>>> +        for ( i = 0; i < xmar.nr_frames; i++ )
>>>>> +        {
>>>>> +            rc = set_foreign_p2m_entry(currd, gfn_list[i],
>>>>> +                                       _mfn(mfn_list[i]));
>>>>
>>>> Something looks a bit odd to me here. When I read foreign mapping, I
>>>> directly associate to mapping from a foreign domain.
>>>>
>>>> On Arm, we will always get a reference on that page to prevent it
>>>> disappearing if the foreign domain is destroyed but the mapping is still
>>>> present.
>>>>
>>>> This reference will either be put with an unmapped hypercall or while
>>>> teardown the domain.
>>>>
>>>> Per my understanding, this MFN does not belong to any domain (or at
>>>> least currd). Right?
>>>
>>> No. The mfns do belong to the target domain.
>>
>> To be fully safe, you need to take a reference on each page you mapped.
>> So who is going to get a reference on them? Who is going to drop that?
>>
> 
> Yes, that's true but it's also true of priv mapping AIUI. I think the correct fix is to deal with this in set_p2m_foreign_entry() so that it is fixed for both cases. I don't think it is something that ought to be addressed here... unless I'm missing something.

For x86 maybe. For Arm, foreign mapping are also exposed to guest and we 
get a reference every time. It is probably an oversight on the x86 side.

> 
>>>
>>>> So there is no way to get/put a reference on that
>>>> page. So I am unconvinced that this is very safe.
>>>>
>>>> Also looking at the x86 side, I can't find such reference in the foreign
>>>> path in p2m_add_foreign. Did I miss anything?
>>>
>>> No, I don't think there is any reference counting there... but this is no
>> different to priv mapping. I'm not trying to fix the mapping infrastructure at
>> this point.
>>>
>>>>
>>>> Note that x86 does not handle p2m teardown with foreign map at the
>>>> moment (see p2m_add_foreign).
>>>>
>>>> You are by-passing this check and I can't see how this would be safe for
>>>> the x86 side too.
>>>>
>>>
>>> I don't follow. What check am I by-passing that is covered when priv
>> mapping?
>>
>>       /*
>>        * hvm fixme: until support is added to p2m teardown code to
>> cleanup any
>>        * foreign entries, limit this to hardware domain only.
>>        */
>>
>> How this is safe with your new solution? That looks like a regression...
> 
> Well, the new hypercall is tools-only but I can add the extra check for the hardware domain although it's probably redundant in practice.

Well a foreign domain can be nasty with you. Like removing the mapping 
from itself resulting to free the memory... So you may end up 
writing/read wrong mapping or even worth in another domain.

But that's may not impact this new hypercall.

> 
>>
>> [...]
>>
>>>>> +     *          will be populated with the MFNs of the resource.
>>>>> +     *          If the tools domain is HVM then it is expected that, on
>>>>> +     *          entry, frame_list will be populated with a list of GFNs
>>>>> +     *          that will be mapped to the MFNs of the resource.
>>>>> +     *          If -EIO is returned then the frame_list has only been
>>>>> +     *          partially mapped and it is up to the caller to unmap all
>>>>> +     *          the GFNs.
>>>>> +     *          This parameter may be NULL if nr_frames is 0.
>>>>> +     */
>>>>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
>>>>> +};
>>>>> +typedef struct xen_mem_acquire_resource
>>>> xen_mem_acquire_resource_t;
>>>>> +DEFINE_XEN_GUEST_HANDLE(xen_mem_acquire_resource_t);
>>>>> +
>>>>>     #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
>>>>>
>>>>>     /*
>>>>
>>>
>>> Sorry to be getting frustrated with this, but I'm wondering how many more
>> colours I need to paint this bike-shed.
>>
>> I don't know how x86 looks like and maybe this is fine for Andrew and
>> Jan. But for Arm, it does not look correct.
>>
>> To give you an idea, my first thought to implement your newly wrongly
>> named function was to just call p2m_set_entry with p2m_map_foreign. But
>> from this discussion it would look plain wrong.
>>
>> So this means the interface is not clear enough.
> 
> I'd prefer to make the whole thing x86-only since that's the only platform on which I can test it, and indeed the code used to be x86-only. Jan objected to this so all I'm trying to achieve is that it builds for ARM. Please can you and Jan reach agreement on where the code should live and how, if at all, it should be #ifdef-ed?

I am quite surprised of "it is tools-only" so it is fine to not protect 
it even if it is x86 only. That's probably going to bite us in the future.

Cheers,
Paul Durrant Oct. 19, 2017, 2:49 p.m. UTC | #7
> -----Original Message-----

[snip]
> >

> > I'd prefer to make the whole thing x86-only since that's the only platform

> on which I can test it, and indeed the code used to be x86-only. Jan objected

> to this so all I'm trying to achieve is that it builds for ARM. Please can you and

> Jan reach agreement on where the code should live and how, if at all, it

> should be #ifdef-ed?

> 

> I am quite surprised of "it is tools-only" so it is fine to not protect

> it even if it is x86 only. That's probably going to bite us in the future.

> 


So, this appears to have reached an impasse. I don't know how to proceed without having to also fix priv mapping for x86, which is a yak far too large for me to shave at the moment.

  Paul
Jan Beulich Oct. 19, 2017, 3:11 p.m. UTC | #8
>>> On 19.10.17 at 16:49, <Paul.Durrant@citrix.com> wrote:
>> > I'd prefer to make the whole thing x86-only since that's the only platform
>> on which I can test it, and indeed the code used to be x86-only. Jan objected
>> to this so all I'm trying to achieve is that it builds for ARM. Please can you and
>> Jan reach agreement on where the code should live and how, if at all, it
>> should be #ifdef-ed?
>> 
>> I am quite surprised of "it is tools-only" so it is fine to not protect
>> it even if it is x86 only. That's probably going to bite us in the future.
>> 
> 
> So, this appears to have reached an impasse. I don't know how to proceed 
> without having to also fix priv mapping for x86, which is a yak far too large 
> for me to shave at the moment.

Julien,

why is it that you are making refcounting on p2m insertion / removal
a requirement for this series? We all know it's not there right now
(and similarly also not for the IOMMU, which might affect ARM as well
unless you always use shared page tables), and it's - as Paul validly
says - unrelated to his series.

Jan
Julien Grall Oct. 19, 2017, 3:37 p.m. UTC | #9
Hi,

On 19/10/17 16:11, Jan Beulich wrote:
>>>> On 19.10.17 at 16:49, <Paul.Durrant@citrix.com> wrote:
>>>> I'd prefer to make the whole thing x86-only since that's the only platform
>>> on which I can test it, and indeed the code used to be x86-only. Jan objected
>>> to this so all I'm trying to achieve is that it builds for ARM. Please can you and
>>> Jan reach agreement on where the code should live and how, if at all, it
>>> should be #ifdef-ed?
>>>
>>> I am quite surprised of "it is tools-only" so it is fine to not protect
>>> it even if it is x86 only. That's probably going to bite us in the future.
>>>
>>
>> So, this appears to have reached an impasse. I don't know how to proceed
>> without having to also fix priv mapping for x86, which is a yak far too large
>> for me to shave at the moment.
> 
> Julien,
> 
> why is it that you are making refcounting on p2m insertion / removal
> a requirement for this series? We all know it's not there right now
> (and similarly also not for the IOMMU, which might affect ARM as well
> unless you always use shared page tables), and it's - as Paul validly
> says - unrelated to his series.

Well, we do at least have refcounting for foreign mapping on Arm. So if 
a foreign domain remove the mapping, the page will stay allocated until 
the last mapping is removed. For IOMMU, page tables are for the moment 
always shared.

If you don't want to fix x86 now, then it is fine. But I would 
appreciate if you don't spread that on Arm.

To give you a bit more context, I was ready to implement the Arm version 
of set_foreign_p2m_entry (it is quite trivial) to append at the end of 
this series. But given that refcounting is not done, I am more reluctant 
to do that.

Anyway, I don't plan to block common code. But I will block any 
implementation of set_foreign_p2m_entry (other than returning not 
implemented) on Arm until someone step up and fix the refcounting.

Cheers,
Jan Beulich Oct. 19, 2017, 3:47 p.m. UTC | #10
>>> On 19.10.17 at 17:37, <julien.grall@linaro.org> wrote:
> Hi,
> 
> On 19/10/17 16:11, Jan Beulich wrote:
>>>>> On 19.10.17 at 16:49, <Paul.Durrant@citrix.com> wrote:
>>>>> I'd prefer to make the whole thing x86-only since that's the only platform
>>>> on which I can test it, and indeed the code used to be x86-only. Jan 
> objected
>>>> to this so all I'm trying to achieve is that it builds for ARM. Please can 
> you and
>>>> Jan reach agreement on where the code should live and how, if at all, it
>>>> should be #ifdef-ed?
>>>>
>>>> I am quite surprised of "it is tools-only" so it is fine to not protect
>>>> it even if it is x86 only. That's probably going to bite us in the future.
>>>>
>>>
>>> So, this appears to have reached an impasse. I don't know how to proceed
>>> without having to also fix priv mapping for x86, which is a yak far too 
> large
>>> for me to shave at the moment.
>> 
>> Julien,
>> 
>> why is it that you are making refcounting on p2m insertion / removal
>> a requirement for this series? We all know it's not there right now
>> (and similarly also not for the IOMMU, which might affect ARM as well
>> unless you always use shared page tables), and it's - as Paul validly
>> says - unrelated to his series.
> 
> Well, we do at least have refcounting for foreign mapping on Arm. So if 
> a foreign domain remove the mapping, the page will stay allocated until 
> the last mapping is removed. For IOMMU, page tables are for the moment 
> always shared.
> 
> If you don't want to fix x86 now, then it is fine. But I would 
> appreciate if you don't spread that on Arm.
> 
> To give you a bit more context, I was ready to implement the Arm version 
> of set_foreign_p2m_entry (it is quite trivial) to append at the end of 
> this series. But given that refcounting is not done, I am more reluctant 
> to do that.

I don't understand: The refcounting is to be done by ARM-specific
code anyway, i.e. by the implementation of set_foreign_p2m_entry(),
not its caller. At least that's what I would have expected.

Jan
Julien Grall Oct. 19, 2017, 4:06 p.m. UTC | #11
Hi,

On 19/10/17 16:47, Jan Beulich wrote:
>>>> On 19.10.17 at 17:37, <julien.grall@linaro.org> wrote:
>> Hi,
>>
>> On 19/10/17 16:11, Jan Beulich wrote:
>>>>>> On 19.10.17 at 16:49, <Paul.Durrant@citrix.com> wrote:
>>>>>> I'd prefer to make the whole thing x86-only since that's the only platform
>>>>> on which I can test it, and indeed the code used to be x86-only. Jan
>> objected
>>>>> to this so all I'm trying to achieve is that it builds for ARM. Please can
>> you and
>>>>> Jan reach agreement on where the code should live and how, if at all, it
>>>>> should be #ifdef-ed?
>>>>>
>>>>> I am quite surprised of "it is tools-only" so it is fine to not protect
>>>>> it even if it is x86 only. That's probably going to bite us in the future.
>>>>>
>>>>
>>>> So, this appears to have reached an impasse. I don't know how to proceed
>>>> without having to also fix priv mapping for x86, which is a yak far too
>> large
>>>> for me to shave at the moment.
>>>
>>> Julien,
>>>
>>> why is it that you are making refcounting on p2m insertion / removal
>>> a requirement for this series? We all know it's not there right now
>>> (and similarly also not for the IOMMU, which might affect ARM as well
>>> unless you always use shared page tables), and it's - as Paul validly
>>> says - unrelated to his series.
>>
>> Well, we do at least have refcounting for foreign mapping on Arm. So if
>> a foreign domain remove the mapping, the page will stay allocated until
>> the last mapping is removed. For IOMMU, page tables are for the moment
>> always shared.
>>
>> If you don't want to fix x86 now, then it is fine. But I would
>> appreciate if you don't spread that on Arm.
>>
>> To give you a bit more context, I was ready to implement the Arm version
>> of set_foreign_p2m_entry (it is quite trivial) to append at the end of
>> this series. But given that refcounting is not done, I am more reluctant
>> to do that.
> 
> I don't understand: The refcounting is to be done by ARM-specific
> code anyway, i.e. by the implementation of set_foreign_p2m_entry(),
> not its caller. At least that's what I would have expected.

I thought I said it before, but it looks like not. Assuming the MFN is 
always baked by a domain, the prototype would likely need to be extended 
and take the foreign domain.

If it is not the case, we would need to find another way to do refcounting.

Cheers,
Julien Grall Oct. 19, 2017, 4:21 p.m. UTC | #12
Hi,

On 19/10/17 17:06, Julien Grall wrote:
> On 19/10/17 16:47, Jan Beulich wrote:
>>>>> On 19.10.17 at 17:37, <julien.grall@linaro.org> wrote:
>>> Hi,
>>>
>>> On 19/10/17 16:11, Jan Beulich wrote:
>>>>>>> On 19.10.17 at 16:49, <Paul.Durrant@citrix.com> wrote:
>>>>>>> I'd prefer to make the whole thing x86-only since that's the only 
>>>>>>> platform
>>>>>> on which I can test it, and indeed the code used to be x86-only. Jan
>>> objected
>>>>>> to this so all I'm trying to achieve is that it builds for ARM. 
>>>>>> Please can
>>> you and
>>>>>> Jan reach agreement on where the code should live and how, if at 
>>>>>> all, it
>>>>>> should be #ifdef-ed?
>>>>>>
>>>>>> I am quite surprised of "it is tools-only" so it is fine to not 
>>>>>> protect
>>>>>> it even if it is x86 only. That's probably going to bite us in the 
>>>>>> future.
>>>>>>
>>>>>
>>>>> So, this appears to have reached an impasse. I don't know how to 
>>>>> proceed
>>>>> without having to also fix priv mapping for x86, which is a yak far 
>>>>> too
>>> large
>>>>> for me to shave at the moment.
>>>>
>>>> Julien,
>>>>
>>>> why is it that you are making refcounting on p2m insertion / removal
>>>> a requirement for this series? We all know it's not there right now
>>>> (and similarly also not for the IOMMU, which might affect ARM as well
>>>> unless you always use shared page tables), and it's - as Paul validly
>>>> says - unrelated to his series.
>>>
>>> Well, we do at least have refcounting for foreign mapping on Arm. So if
>>> a foreign domain remove the mapping, the page will stay allocated until
>>> the last mapping is removed. For IOMMU, page tables are for the moment
>>> always shared.
>>>
>>> If you don't want to fix x86 now, then it is fine. But I would
>>> appreciate if you don't spread that on Arm.
>>>
>>> To give you a bit more context, I was ready to implement the Arm version
>>> of set_foreign_p2m_entry (it is quite trivial) to append at the end of
>>> this series. But given that refcounting is not done, I am more reluctant
>>> to do that.
>>
>> I don't understand: The refcounting is to be done by ARM-specific
>> code anyway, i.e. by the implementation of set_foreign_p2m_entry(),
>> not its caller. At least that's what I would have expected.
> 
> I thought I said it before, but it looks like not. Assuming the MFN is 
> always baked by a domain, the prototype would likely need to be extended 
> and take the foreign domain.
> 
> If it is not the case, we would need to find another way to do refcounting.

Looking a bit more at the resource you can acquire from this hypercall. 
Some of them are allocated using alloc_xenheap_page() so not assigned to 
a domain.

So I am not sure how you can expect a function set_foreign_p2m_entry to 
take reference in that case.

Cheers,
Jan Beulich Oct. 20, 2017, 6:17 a.m. UTC | #13
>>> On 19.10.17 at 18:06, <julien.grall@linaro.org> wrote:
> On 19/10/17 16:47, Jan Beulich wrote:
>> I don't understand: The refcounting is to be done by ARM-specific
>> code anyway, i.e. by the implementation of set_foreign_p2m_entry(),
>> not its caller. At least that's what I would have expected.
> 
> I thought I said it before, but it looks like not. Assuming the MFN is 
> always baked by a domain, the prototype would likely need to be extended 
> and take the foreign domain.
> 
> If it is not the case, we would need to find another way to do refcounting.

Well, adding another parameter can't be that bad of a problem to solve.

Jan
Jan Beulich Oct. 20, 2017, 6:24 a.m. UTC | #14
>>> On 19.10.17 at 18:21, <julien.grall@linaro.org> wrote:
> Looking a bit more at the resource you can acquire from this hypercall. 
> Some of them are allocated using alloc_xenheap_page() so not assigned to 
> a domain.
> 
> So I am not sure how you can expect a function set_foreign_p2m_entry to 
> take reference in that case.

Hmm, with the domain parameter added, DOMID_XEN there (for
Xen heap pages) could identify no references to be taken, if that
was really the intended behavior in that case. However, even for
Xen heap pages life time tracking ought to be done - it is for a
reason that share_xen_page_with_guest() assigns the target
domain as the owner of such pages, as that allows get_page() to
succeed for them.

Jan
Paul Durrant Oct. 20, 2017, 8:26 a.m. UTC | #15
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 20 October 2017 07:25
> To: Julien Grall <julien.grall@linaro.org>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Paul
> Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad Rzeszutek
> Wilk <konrad.wilk@oracle.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>;
> Tim (Xen.org) <tim@xen.org>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> >>> On 19.10.17 at 18:21, <julien.grall@linaro.org> wrote:
> > Looking a bit more at the resource you can acquire from this hypercall.
> > Some of them are allocated using alloc_xenheap_page() so not assigned to
> > a domain.
> >
> > So I am not sure how you can expect a function set_foreign_p2m_entry to
> > take reference in that case.
> 
> Hmm, with the domain parameter added, DOMID_XEN there (for
> Xen heap pages) could identify no references to be taken, if that
> was really the intended behavior in that case. However, even for
> Xen heap pages life time tracking ought to be done - it is for a
> reason that share_xen_page_with_guest() assigns the target
> domain as the owner of such pages, as that allows get_page() to
> succeed for them.
> 

So, nothing I'm doing here is making anything worse, right? Grant tables are assigned to the guest, and IOREQ server pages are allocated with alloc_domheap_page() so nothing is anonymous.

  Paul

> Jan
Julien Grall Oct. 20, 2017, 10 a.m. UTC | #16
Hi Paul,

On 20/10/17 09:26, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 20 October 2017 07:25
>> To: Julien Grall <julien.grall@linaro.org>
>> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; George Dunlap
>> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Paul
>> Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne
>> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano Stabellini
>> <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad Rzeszutek
>> Wilk <konrad.wilk@oracle.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>;
>> Tim (Xen.org) <tim@xen.org>
>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
>> HYPERVISOR_memory_op to acquire guest resources
>>
>>>>> On 19.10.17 at 18:21, <julien.grall@linaro.org> wrote:
>>> Looking a bit more at the resource you can acquire from this hypercall.
>>> Some of them are allocated using alloc_xenheap_page() so not assigned to
>>> a domain.
>>>
>>> So I am not sure how you can expect a function set_foreign_p2m_entry to
>>> take reference in that case.
>>
>> Hmm, with the domain parameter added, DOMID_XEN there (for
>> Xen heap pages) could identify no references to be taken, if that
>> was really the intended behavior in that case. However, even for
>> Xen heap pages life time tracking ought to be done - it is for a
>> reason that share_xen_page_with_guest() assigns the target
>> domain as the owner of such pages, as that allows get_page() to
>> succeed for them.
>>
> 
> So, nothing I'm doing here is making anything worse, right? Grant tables are assigned to the guest, and IOREQ server pages are allocated with alloc_domheap_page() so nothing is anonymous.

I don't think grant tables is assigned to the guest today. They are 
allocated using xenheap_pages() and I can't find 
share_xen_page_with_guest().

Anyway, I discussed with Stefano about it. set_foreign_p2m_entry is 
going to be left unimplemented on Arm until someone as time to implement 
correctly the function.

Cheers,
Paul Durrant Oct. 20, 2017, 10:10 a.m. UTC | #17
> -----Original Message-----

> From: Julien Grall [mailto:julien.grall@linaro.org]

> Sent: 20 October 2017 11:00

> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'

> <JBeulich@suse.com>

> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> <Andrew.Cooper3@citrix.com>; George Dunlap

> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Roger

> Pau Monne <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano

> Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad

> Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>

> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> HYPERVISOR_memory_op to acquire guest resources

> 

> Hi Paul,

> 

> On 20/10/17 09:26, Paul Durrant wrote:

> >> -----Original Message-----

> >> From: Jan Beulich [mailto:JBeulich@suse.com]

> >> Sent: 20 October 2017 07:25

> >> To: Julien Grall <julien.grall@linaro.org>

> >> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> >> <Andrew.Cooper3@citrix.com>; George Dunlap

> >> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Paul

> >> Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne

> >> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano Stabellini

> >> <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad

> Rzeszutek

> >> Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> <dgdegra@tycho.nsa.gov>;

> >> Tim (Xen.org) <tim@xen.org>

> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> >> HYPERVISOR_memory_op to acquire guest resources

> >>

> >>>>> On 19.10.17 at 18:21, <julien.grall@linaro.org> wrote:

> >>> Looking a bit more at the resource you can acquire from this hypercall.

> >>> Some of them are allocated using alloc_xenheap_page() so not assigned

> to

> >>> a domain.

> >>>

> >>> So I am not sure how you can expect a function set_foreign_p2m_entry

> to

> >>> take reference in that case.

> >>

> >> Hmm, with the domain parameter added, DOMID_XEN there (for

> >> Xen heap pages) could identify no references to be taken, if that

> >> was really the intended behavior in that case. However, even for

> >> Xen heap pages life time tracking ought to be done - it is for a

> >> reason that share_xen_page_with_guest() assigns the target

> >> domain as the owner of such pages, as that allows get_page() to

> >> succeed for them.

> >>

> >


Hi Julien,

> > So, nothing I'm doing here is making anything worse, right? Grant tables are

> assigned to the guest, and IOREQ server pages are allocated with

> alloc_domheap_page() so nothing is anonymous.

> 

> I don't think grant tables is assigned to the guest today. They are

> allocated using xenheap_pages() and I can't find

> share_xen_page_with_guest().


The guest would not be able to map them if they were not assigned in some way!
See the code block at http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/common/grant_table.c;hb=HEAD#l1716
It calls gnttab_create_shared_page() which is what calls through to share_xen_page_with_guest().

> 

> Anyway, I discussed with Stefano about it. set_foreign_p2m_entry is

> going to be left unimplemented on Arm until someone as time to implement

> correctly the function.

> 


That makes sense. Do you still have any issues with this patch apart from the cosmetic ones you spotted in the header?

Cheers,

  Paul

> Cheers,

> 

> --

> Julien Grall
Julien Grall Oct. 23, 2017, 6:04 p.m. UTC | #18
On 20/10/17 11:10, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall [mailto:julien.grall@linaro.org]
>> Sent: 20 October 2017 11:00
>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>> <JBeulich@suse.com>
>> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; George Dunlap
>> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Roger
>> Pau Monne <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano
>> Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad
>> Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf
>> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>
>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
>> HYPERVISOR_memory_op to acquire guest resources
>>
>> Hi Paul,
>>
>> On 20/10/17 09:26, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>> Sent: 20 October 2017 07:25
>>>> To: Julien Grall <julien.grall@linaro.org>
>>>> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
>>>> <Andrew.Cooper3@citrix.com>; George Dunlap
>>>> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Paul
>>>> Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne
>>>> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano Stabellini
>>>> <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad
>> Rzeszutek
>>>> Wilk <konrad.wilk@oracle.com>; Daniel De Graaf
>> <dgdegra@tycho.nsa.gov>;
>>>> Tim (Xen.org) <tim@xen.org>
>>>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
>>>> HYPERVISOR_memory_op to acquire guest resources
>>>>
>>>>>>> On 19.10.17 at 18:21, <julien.grall@linaro.org> wrote:
>>>>> Looking a bit more at the resource you can acquire from this hypercall.
>>>>> Some of them are allocated using alloc_xenheap_page() so not assigned
>> to
>>>>> a domain.
>>>>>
>>>>> So I am not sure how you can expect a function set_foreign_p2m_entry
>> to
>>>>> take reference in that case.
>>>>
>>>> Hmm, with the domain parameter added, DOMID_XEN there (for
>>>> Xen heap pages) could identify no references to be taken, if that
>>>> was really the intended behavior in that case. However, even for
>>>> Xen heap pages life time tracking ought to be done - it is for a
>>>> reason that share_xen_page_with_guest() assigns the target
>>>> domain as the owner of such pages, as that allows get_page() to
>>>> succeed for them.
>>>>
>>>
> 
> Hi Julien,
> 
>>> So, nothing I'm doing here is making anything worse, right? Grant tables are
>> assigned to the guest, and IOREQ server pages are allocated with
>> alloc_domheap_page() so nothing is anonymous.
>>
>> I don't think grant tables is assigned to the guest today. They are
>> allocated using xenheap_pages() and I can't find
>> share_xen_page_with_guest().
> 
> The guest would not be able to map them if they were not assigned in some way!

Do you mean for PV? For HVM/PVH, we don't check whether the page is 
assigned (see gnttab_map_frame).

> See the code block at http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/common/grant_table.c;hb=HEAD#l1716
> It calls gnttab_create_shared_page() which is what calls through to share_xen_page_with_guest().

Thank you for the link, I will have a look.

> 
>>
>> Anyway, I discussed with Stefano about it. set_foreign_p2m_entry is
>> going to be left unimplemented on Arm until someone as time to implement
>> correctly the function.
>>
> 
> That makes sense. Do you still have any issues with this patch apart from the cosmetic ones you spotted in the header?

No. Although, may I request to add a comment in the ARM helpers about 
the reference counting?

Cheers,
Paul Durrant Oct. 25, 2017, 8:40 a.m. UTC | #19
> -----Original Message-----

> From: Julien Grall [mailto:julien.grall@linaro.org]

> Sent: 23 October 2017 20:04

> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'

> <JBeulich@suse.com>

> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> <Andrew.Cooper3@citrix.com>; George Dunlap

> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Roger

> Pau Monne <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano

> Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad

> Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>

> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> HYPERVISOR_memory_op to acquire guest resources

> 

> 

> 

> On 20/10/17 11:10, Paul Durrant wrote:

> >> -----Original Message-----

> >> From: Julien Grall [mailto:julien.grall@linaro.org]

> >> Sent: 20 October 2017 11:00

> >> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'

> >> <JBeulich@suse.com>

> >> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> >> <Andrew.Cooper3@citrix.com>; George Dunlap

> >> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;

> Roger

> >> Pau Monne <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>;

> Stefano

> >> Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org;

> Konrad

> >> Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> >> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>

> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> >> HYPERVISOR_memory_op to acquire guest resources

> >>

> >> Hi Paul,

> >>

> >> On 20/10/17 09:26, Paul Durrant wrote:

> >>>> -----Original Message-----

> >>>> From: Jan Beulich [mailto:JBeulich@suse.com]

> >>>> Sent: 20 October 2017 07:25

> >>>> To: Julien Grall <julien.grall@linaro.org>

> >>>> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> >>>> <Andrew.Cooper3@citrix.com>; George Dunlap

> >>>> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;

> Paul

> >>>> Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne

> >>>> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Stefano

> Stabellini

> >>>> <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Konrad

> >> Rzeszutek

> >>>> Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> >> <dgdegra@tycho.nsa.gov>;

> >>>> Tim (Xen.org) <tim@xen.org>

> >>>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> >>>> HYPERVISOR_memory_op to acquire guest resources

> >>>>

> >>>>>>> On 19.10.17 at 18:21, <julien.grall@linaro.org> wrote:

> >>>>> Looking a bit more at the resource you can acquire from this hypercall.

> >>>>> Some of them are allocated using alloc_xenheap_page() so not

> assigned

> >> to

> >>>>> a domain.

> >>>>>

> >>>>> So I am not sure how you can expect a function

> set_foreign_p2m_entry

> >> to

> >>>>> take reference in that case.

> >>>>

> >>>> Hmm, with the domain parameter added, DOMID_XEN there (for

> >>>> Xen heap pages) could identify no references to be taken, if that

> >>>> was really the intended behavior in that case. However, even for

> >>>> Xen heap pages life time tracking ought to be done - it is for a

> >>>> reason that share_xen_page_with_guest() assigns the target

> >>>> domain as the owner of such pages, as that allows get_page() to

> >>>> succeed for them.

> >>>>

> >>>

> >

> > Hi Julien,

> >

> >>> So, nothing I'm doing here is making anything worse, right? Grant tables

> are

> >> assigned to the guest, and IOREQ server pages are allocated with

> >> alloc_domheap_page() so nothing is anonymous.

> >>

> >> I don't think grant tables is assigned to the guest today. They are

> >> allocated using xenheap_pages() and I can't find

> >> share_xen_page_with_guest().

> >

> > The guest would not be able to map them if they were not assigned in

> some way!

> 

> Do you mean for PV? For HVM/PVH, we don't check whether the page is

> assigned (see gnttab_map_frame).


Not there, but I though there were checks in guest_physmap_add_page() where the mfn passed back from gnttab_map_frame() is actually consumed. It's all quite convoluted though so I'm not sure.

> 

> > See the code block at

> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/common/grant_tab

> le.c;hb=HEAD#l1716

> > It calls gnttab_create_shared_page() which is what calls through to

> share_xen_page_with_guest().

> 

> Thank you for the link, I will have a look.

> 

> >

> >>

> >> Anyway, I discussed with Stefano about it. set_foreign_p2m_entry is

> >> going to be left unimplemented on Arm until someone as time to

> implement

> >> correctly the function.

> >>

> >

> > That makes sense. Do you still have any issues with this patch apart from

> the cosmetic ones you spotted in the header?

> 

> No. Although, may I request to add a comment in the ARM helpers about

> the reference counting?

> 


Sure. Thanks,

  Paul

> Cheers,

> 

> --

> Julien Grall
Jan Beulich Oct. 26, 2017, 3:26 p.m. UTC | #20
>>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:
> @@ -535,6 +588,48 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                  rc = -EFAULT;
>              break;
>  
> +        case XENMEM_acquire_resource:
> +        {
> +            const xen_ulong_t *xen_frame_list =
> +                (xen_ulong_t *)(nat.mar + 1);
> +            compat_ulong_t *compat_frame_list =
> +                (compat_ulong_t *)(nat.mar + 1);
> +
> +            if ( cmp.mar.nr_frames == 0 )

Doesn't this need to be compat_handle_is_null(cmp.mar.frame_list), or
a combination of both?

> +            {
> +                DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
> +
> +                if ( __copy_field_to_guest(
> +                         guest_handle_cast(compat,
> +                                           compat_mem_acquire_resource_t),
> +                         &cmp.mar, nr_frames) )
> +                    return -EFAULT;
> +            }
> +            else
> +            {
> +                /*
> +                 * NOTE: the smaller compat array overwrites the native
> +                 *       array.
> +                 */

I think I had already asked for a respective BUILD_BUG_ON().

> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -965,6 +965,95 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
>      return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
>  }
>  
> +static int acquire_resource(
> +    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
> +{
> +    struct domain *d, *currd = current->domain;
> +    xen_mem_acquire_resource_t xmar;
> +    unsigned long mfn_list[2];
> +    int rc;
> +
> +    if ( copy_from_guest(&xmar, arg, 1) )
> +        return -EFAULT;
> +
> +    if ( xmar.pad != 0 )
> +        return -EINVAL;
> +
> +    if ( guest_handle_is_null(xmar.frame_list) )
> +    {
> +        /* Special case for querying implementation limit */
> +        if ( xmar.nr_frames == 0 )

Perhaps invert the condition to reduce ...

> +        {
> +            xmar.nr_frames = ARRAY_SIZE(mfn_list);
> +
> +            if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
> +                return -EFAULT;
> +
> +            return 0;
> +        }

... overall indentation?

> +        return -EINVAL;
> +    }
> +
> +    if ( xmar.nr_frames == 0 )
> +        return -EINVAL;

Why? (Almost?) everywhere else zero counts are simply no-ops, which
result in success returns.

> +    if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
> +        return -E2BIG;
> +
> +    d = rcu_lock_domain_by_any_id(xmar.domid);

This being a tools only interface, why "by_any_id" instead of
"remote_domain_by_id"? In particular ...

> +    if ( d == NULL )
> +        return -ESRCH;
> +
> +    rc = xsm_domain_resource_map(XSM_DM_PRIV, d);

... an unprivileged dm domain should probably not be permitted to
invoke this on itself.

> +    if ( rc )
> +        goto out;
> +
> +    switch ( xmar.type )
> +    {
> +    default:
> +        rc = -EOPNOTSUPP;
> +        break;
> +    }
> +
> +    if ( rc )
> +        goto out;
> +
> +    if ( !paging_mode_translate(currd) )
> +    {
> +        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> +            rc = -EFAULT;
> +    }
> +    else
> +    {
> +        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
> +        unsigned int i;
> +
> +        rc = -EFAULT;
> +        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> +            goto out;
> +
> +        for ( i = 0; i < xmar.nr_frames; i++ )
> +        {
> +            rc = set_foreign_p2m_entry(currd, gfn_list[i],
> +                                       _mfn(mfn_list[i]));
> +            if ( rc )
> +            {
> +                /*
> +                 * Make sure rc is -EIO for any interation other than
> +                 * the first.

"iteration", but why is this important in the first place?

> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -599,6 +599,47 @@ struct xen_reserved_device_memory_map {
>  typedef struct xen_reserved_device_memory_map 
> xen_reserved_device_memory_map_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_reserved_device_memory_map_t);
>  
> +/*
> + * Get the pages for a particular guest resource, so that they can be
> + * mapped directly by a tools domain.
> + */
> +#define XENMEM_acquire_resource 28
> +struct xen_mem_acquire_resource {
> +    /* IN - the domain whose resource is to be mapped */
> +    domid_t domid;
> +    /* IN - the type of resource */
> +    uint16_t type;
> +    /*
> +     * IN - a type-specific resource identifier, which must be zero
> +     *      unless stated otherwise.
> +     */
> +    uint32_t id;
> +    /* IN/OUT - As an IN parameter number of frames of the resource
> +     *          to be mapped. However, if the specified value is 0 and
> +     *          frame_list is NULL then this field will be set to the
> +     *          maximum value supported by the implementation on return.
> +     */
> +    uint32_t nr_frames;
> +    uint32_t pad;
> +    /* IN - the index of the initial frame to be mapped. This parameter
> +     *      is ignored if nr_frames is 0.
> +     */
> +    uint64_aligned_t frame;
> +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list
> +     *          will be populated with the MFNs of the resource.
> +     *          If the tools domain is HVM then it is expected that, on
> +     *          entry, frame_list will be populated with a list of GFNs
> +     *          that will be mapped to the MFNs of the resource.
> +     *          If -EIO is returned then the frame_list has only been
> +     *          partially mapped and it is up to the caller to unmap all
> +     *          the GFNs.
> +     *          This parameter may be NULL if nr_frames is 0.
> +     */
> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;

This is still xen_ulong_t, which I can live with, but then you shouldn't
copy into / out of arrays of other types in acquire_resource() (the
more that this is common code, and iirc xen_ulong_t and
unsigned long aren't the same thing on ARM32).

Jan
Julien Grall Oct. 26, 2017, 3:32 p.m. UTC | #21
On 26/10/17 16:26, Jan Beulich wrote:
>>>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:
>> +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list
>> +     *          will be populated with the MFNs of the resource.
>> +     *          If the tools domain is HVM then it is expected that, on
>> +     *          entry, frame_list will be populated with a list of GFNs
>> +     *          that will be mapped to the MFNs of the resource.
>> +     *          If -EIO is returned then the frame_list has only been
>> +     *          partially mapped and it is up to the caller to unmap all
>> +     *          the GFNs.
>> +     *          This parameter may be NULL if nr_frames is 0.
>> +     */
>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
> 
> This is still xen_ulong_t, which I can live with, but then you shouldn't
> copy into / out of arrays of other types in acquire_resource() (the
> more that this is common code, and iirc xen_ulong_t and
> unsigned long aren't the same thing on ARM32).

xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't 
we use xen_pfn_t here?

Cheers,
Jan Beulich Oct. 26, 2017, 3:39 p.m. UTC | #22
>>> On 26.10.17 at 17:32, <julien.grall@linaro.org> wrote:
> On 26/10/17 16:26, Jan Beulich wrote:
>>>>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:
>>> +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list
>>> +     *          will be populated with the MFNs of the resource.
>>> +     *          If the tools domain is HVM then it is expected that, on
>>> +     *          entry, frame_list will be populated with a list of GFNs
>>> +     *          that will be mapped to the MFNs of the resource.
>>> +     *          If -EIO is returned then the frame_list has only been
>>> +     *          partially mapped and it is up to the caller to unmap all
>>> +     *          the GFNs.
>>> +     *          This parameter may be NULL if nr_frames is 0.
>>> +     */
>>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
>> 
>> This is still xen_ulong_t, which I can live with, but then you shouldn't
>> copy into / out of arrays of other types in acquire_resource() (the
>> more that this is common code, and iirc xen_ulong_t and
>> unsigned long aren't the same thing on ARM32).
> 
> xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't 
> we use xen_pfn_t here?

I had put this question up earlier, but iirc Paul didn't like it.

Jan
Julien Grall Oct. 27, 2017, 10:46 a.m. UTC | #23
Hi,

On 26/10/17 16:39, Jan Beulich wrote:
>>>> On 26.10.17 at 17:32, <julien.grall@linaro.org> wrote:
>> On 26/10/17 16:26, Jan Beulich wrote:
>>>>>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:
>>>> +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list
>>>> +     *          will be populated with the MFNs of the resource.
>>>> +     *          If the tools domain is HVM then it is expected that, on
>>>> +     *          entry, frame_list will be populated with a list of GFNs
>>>> +     *          that will be mapped to the MFNs of the resource.
>>>> +     *          If -EIO is returned then the frame_list has only been
>>>> +     *          partially mapped and it is up to the caller to unmap all
>>>> +     *          the GFNs.
>>>> +     *          This parameter may be NULL if nr_frames is 0.
>>>> +     */
>>>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
>>>
>>> This is still xen_ulong_t, which I can live with, but then you shouldn't
>>> copy into / out of arrays of other types in acquire_resource() (the
>>> more that this is common code, and iirc xen_ulong_t and
>>> unsigned long aren't the same thing on ARM32).
>>
>> xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't
>> we use xen_pfn_t here?
> 
> I had put this question up earlier, but iirc Paul didn't like it.

I'd like to understand why Paul doesn't like it. We should never assume 
that a frame fit in xen_ulong_t. xen_pfn_t was exactly introduced for 
that purpose.

Cheers,
Paul Durrant Oct. 27, 2017, 3:19 p.m. UTC | #24
> -----Original Message-----

> From: Julien Grall [mailto:julien.grall@linaro.org]

> Sent: 27 October 2017 12:46

> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant

> <Paul.Durrant@citrix.com>

> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George

> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;

> Stefano Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org;

> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>

> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> HYPERVISOR_memory_op to acquire guest resources

> 

> Hi,

> 

> On 26/10/17 16:39, Jan Beulich wrote:

> >>>> On 26.10.17 at 17:32, <julien.grall@linaro.org> wrote:

> >> On 26/10/17 16:26, Jan Beulich wrote:

> >>>>>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:

> >>>> +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list

> >>>> +     *          will be populated with the MFNs of the resource.

> >>>> +     *          If the tools domain is HVM then it is expected that, on

> >>>> +     *          entry, frame_list will be populated with a list of GFNs

> >>>> +     *          that will be mapped to the MFNs of the resource.

> >>>> +     *          If -EIO is returned then the frame_list has only been

> >>>> +     *          partially mapped and it is up to the caller to unmap all

> >>>> +     *          the GFNs.

> >>>> +     *          This parameter may be NULL if nr_frames is 0.

> >>>> +     */

> >>>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;

> >>>

> >>> This is still xen_ulong_t, which I can live with, but then you shouldn't

> >>> copy into / out of arrays of other types in acquire_resource() (the

> >>> more that this is common code, and iirc xen_ulong_t and

> >>> unsigned long aren't the same thing on ARM32).

> >>

> >> xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't

> >> we use xen_pfn_t here?

> >

> > I had put this question up earlier, but iirc Paul didn't like it.

> 

> I'd like to understand why Paul doesn't like it. We should never assume

> that a frame fit in xen_ulong_t. xen_pfn_t was exactly introduced for

> that purpose.


My reservation is whether xen_pfn_t is intended to hold either gfns or mfns, since this hypercall uses the same array for both. If it suitable then I am happy to change it, but Andrew led me to believe otherwise.

  Paul

> 

> Cheers,

> 

> --

> Julien Grall
Paul Durrant Oct. 30, 2017, 12:05 p.m. UTC | #25
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 October 2017 16:27
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf
> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to
> acquire guest resources
> 
> >>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:
> > @@ -535,6 +588,48 @@ int compat_memory_op(unsigned int cmd,
> XEN_GUEST_HANDLE_PARAM(void) compat)
> >                  rc = -EFAULT;
> >              break;
> >
> > +        case XENMEM_acquire_resource:
> > +        {
> > +            const xen_ulong_t *xen_frame_list =
> > +                (xen_ulong_t *)(nat.mar + 1);
> > +            compat_ulong_t *compat_frame_list =
> > +                (compat_ulong_t *)(nat.mar + 1);
> > +
> > +            if ( cmp.mar.nr_frames == 0 )
> 
> Doesn't this need to be compat_handle_is_null(cmp.mar.frame_list), or
> a combination of both?

Sorry, yes this was a hang-over from the old scheme.

> 
> > +            {
> > +
> DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
> > +
> > +                if ( __copy_field_to_guest(
> > +                         guest_handle_cast(compat,
> > +                                           compat_mem_acquire_resource_t),
> > +                         &cmp.mar, nr_frames) )
> > +                    return -EFAULT;
> > +            }
> > +            else
> > +            {
> > +                /*
> > +                 * NOTE: the smaller compat array overwrites the native
> > +                 *       array.
> > +                 */
> 
> I think I had already asked for a respective BUILD_BUG_ON().

You asked for the comment. I can't find where you asked for a BUILD_BUG_ON() but I can certainly add one.

> 
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -965,6 +965,95 @@ static long xatp_permission_check(struct domain
> *d, unsigned int space)
> >      return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
> >  }
> >
> > +static int acquire_resource(
> > +    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
> > +{
> > +    struct domain *d, *currd = current->domain;
> > +    xen_mem_acquire_resource_t xmar;
> > +    unsigned long mfn_list[2];
> > +    int rc;
> > +
> > +    if ( copy_from_guest(&xmar, arg, 1) )
> > +        return -EFAULT;
> > +
> > +    if ( xmar.pad != 0 )
> > +        return -EINVAL;
> > +
> > +    if ( guest_handle_is_null(xmar.frame_list) )
> > +    {
> > +        /* Special case for querying implementation limit */
> > +        if ( xmar.nr_frames == 0 )
> 
> Perhaps invert the condition to reduce ...
> 
> > +        {
> > +            xmar.nr_frames = ARRAY_SIZE(mfn_list);
> > +
> > +            if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
> > +                return -EFAULT;
> > +
> > +            return 0;
> > +        }
> 
> ... overall indentation?
> 
> > +        return -EINVAL;
> > +    }
> > +
> > +    if ( xmar.nr_frames == 0 )
> > +        return -EINVAL;
> 
> Why? (Almost?) everywhere else zero counts are simply no-ops, which
> result in success returns.

Ok, I'll drop the check.

> 
> > +    if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
> > +        return -E2BIG;
> > +
> > +    d = rcu_lock_domain_by_any_id(xmar.domid);
> 
> This being a tools only interface, why "by_any_id" instead of
> "remote_domain_by_id"? In particular ...
> 
> > +    if ( d == NULL )
> > +        return -ESRCH;
> > +
> > +    rc = xsm_domain_resource_map(XSM_DM_PRIV, d);
> 
> ... an unprivileged dm domain should probably not be permitted to
> invoke this on itself.

True.

> 
> > +    if ( rc )
> > +        goto out;
> > +
> > +    switch ( xmar.type )
> > +    {
> > +    default:
> > +        rc = -EOPNOTSUPP;
> > +        break;
> > +    }
> > +
> > +    if ( rc )
> > +        goto out;
> > +
> > +    if ( !paging_mode_translate(currd) )
> > +    {
> > +        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> > +            rc = -EFAULT;
> > +    }
> > +    else
> > +    {
> > +        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
> > +        unsigned int i;
> > +
> > +        rc = -EFAULT;
> > +        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> > +            goto out;
> > +
> > +        for ( i = 0; i < xmar.nr_frames; i++ )
> > +        {
> > +            rc = set_foreign_p2m_entry(currd, gfn_list[i],
> > +                                       _mfn(mfn_list[i]));
> > +            if ( rc )
> > +            {
> > +                /*
> > +                 * Make sure rc is -EIO for any interation other than
> > +                 * the first.
> 
> "iteration", but why is this important in the first place?

The header explains:

"If -EIO is returned then the frame_list has only been partially mapped and it is up to the caller to unmap all the GFNs."

Particularly, on ARM, set_foreign_p2m_entry() will always return -EOPNOTSUPP so I want to make sure that is returned.

> 
> > --- a/xen/include/public/memory.h
> > +++ b/xen/include/public/memory.h
> > @@ -599,6 +599,47 @@ struct xen_reserved_device_memory_map {
> >  typedef struct xen_reserved_device_memory_map
> > xen_reserved_device_memory_map_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_reserved_device_memory_map_t);
> >
> > +/*
> > + * Get the pages for a particular guest resource, so that they can be
> > + * mapped directly by a tools domain.
> > + */
> > +#define XENMEM_acquire_resource 28
> > +struct xen_mem_acquire_resource {
> > +    /* IN - the domain whose resource is to be mapped */
> > +    domid_t domid;
> > +    /* IN - the type of resource */
> > +    uint16_t type;
> > +    /*
> > +     * IN - a type-specific resource identifier, which must be zero
> > +     *      unless stated otherwise.
> > +     */
> > +    uint32_t id;
> > +    /* IN/OUT - As an IN parameter number of frames of the resource
> > +     *          to be mapped. However, if the specified value is 0 and
> > +     *          frame_list is NULL then this field will be set to the
> > +     *          maximum value supported by the implementation on return.
> > +     */
> > +    uint32_t nr_frames;
> > +    uint32_t pad;
> > +    /* IN - the index of the initial frame to be mapped. This parameter
> > +     *      is ignored if nr_frames is 0.
> > +     */
> > +    uint64_aligned_t frame;
> > +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list
> > +     *          will be populated with the MFNs of the resource.
> > +     *          If the tools domain is HVM then it is expected that, on
> > +     *          entry, frame_list will be populated with a list of GFNs
> > +     *          that will be mapped to the MFNs of the resource.
> > +     *          If -EIO is returned then the frame_list has only been
> > +     *          partially mapped and it is up to the caller to unmap all
> > +     *          the GFNs.
> > +     *          This parameter may be NULL if nr_frames is 0.
> > +     */
> > +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
> 
> This is still xen_ulong_t, which I can live with, but then you shouldn't
> copy into / out of arrays of other types in acquire_resource() (the
> more that this is common code, and iirc xen_ulong_t and
> unsigned long aren't the same thing on ARM32).

Given the weight of opinion, I'll change this to xen_pfn_t.

  Paul

> 
> Jan
Julien Grall Oct. 30, 2017, 12:08 p.m. UTC | #26
Hi Paul,

On 27/10/17 16:19, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall [mailto:julien.grall@linaro.org]
>> Sent: 27 October 2017 12:46
>> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
>> <Paul.Durrant@citrix.com>
>> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
>> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
>> Stefano Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org;
>> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf
>> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>
>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
>> HYPERVISOR_memory_op to acquire guest resources
>>
>> Hi,
>>
>> On 26/10/17 16:39, Jan Beulich wrote:
>>>>>> On 26.10.17 at 17:32, <julien.grall@linaro.org> wrote:
>>>> On 26/10/17 16:26, Jan Beulich wrote:
>>>>>>>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:
>>>>>> +    /* IN/OUT - If the tools domain is PV then, upon return, frame_list
>>>>>> +     *          will be populated with the MFNs of the resource.
>>>>>> +     *          If the tools domain is HVM then it is expected that, on
>>>>>> +     *          entry, frame_list will be populated with a list of GFNs
>>>>>> +     *          that will be mapped to the MFNs of the resource.
>>>>>> +     *          If -EIO is returned then the frame_list has only been
>>>>>> +     *          partially mapped and it is up to the caller to unmap all
>>>>>> +     *          the GFNs.
>>>>>> +     *          This parameter may be NULL if nr_frames is 0.
>>>>>> +     */
>>>>>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
>>>>>
>>>>> This is still xen_ulong_t, which I can live with, but then you shouldn't
>>>>> copy into / out of arrays of other types in acquire_resource() (the
>>>>> more that this is common code, and iirc xen_ulong_t and
>>>>> unsigned long aren't the same thing on ARM32).
>>>>
>>>> xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't
>>>> we use xen_pfn_t here?
>>>
>>> I had put this question up earlier, but iirc Paul didn't like it.
>>
>> I'd like to understand why Paul doesn't like it. We should never assume
>> that a frame fit in xen_ulong_t. xen_pfn_t was exactly introduced for
>> that purpose.
> 
> My reservation is whether xen_pfn_t is intended to hold either gfns or mfns, since this hypercall uses the same array for both. If it suitable then I am happy to change it, but Andrew led me to believe otherwise.

Looking at the public hearders, xen_pfn_t is been used for both MFN (see 
xenpf_add_memtype) and GFN (see gnttab_setup_table).

So I think it would be fine to do the same here.

Cheers,
Paul Durrant Oct. 30, 2017, 1:10 p.m. UTC | #27
> -----Original Message-----

> From: Julien Grall [mailto:julien.grall@linaro.org]

> Sent: 30 October 2017 12:09

> To: Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich

> <JBeulich@suse.com>

> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George

> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;

> Stefano Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org;

> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>

> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> HYPERVISOR_memory_op to acquire guest resources

> 

> Hi Paul,

> 

> On 27/10/17 16:19, Paul Durrant wrote:

> >> -----Original Message-----

> >> From: Julien Grall [mailto:julien.grall@linaro.org]

> >> Sent: 27 October 2017 12:46

> >> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant

> >> <Paul.Durrant@citrix.com>

> >> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper

> >> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George

> >> Dunlap <George.Dunlap@citrix.com>; Ian Jackson

> <Ian.Jackson@citrix.com>;

> >> Stefano Stabellini <sstabellini@kernel.org>; xen-

> devel@lists.xenproject.org;

> >> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Daniel De Graaf

> >> <dgdegra@tycho.nsa.gov>; Tim (Xen.org) <tim@xen.org>

> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add

> >> HYPERVISOR_memory_op to acquire guest resources

> >>

> >> Hi,

> >>

> >> On 26/10/17 16:39, Jan Beulich wrote:

> >>>>>> On 26.10.17 at 17:32, <julien.grall@linaro.org> wrote:

> >>>> On 26/10/17 16:26, Jan Beulich wrote:

> >>>>>>>> On 17.10.17 at 15:24, <paul.durrant@citrix.com> wrote:

> >>>>>> +    /* IN/OUT - If the tools domain is PV then, upon return,

> frame_list

> >>>>>> +     *          will be populated with the MFNs of the resource.

> >>>>>> +     *          If the tools domain is HVM then it is expected that, on

> >>>>>> +     *          entry, frame_list will be populated with a list of GFNs

> >>>>>> +     *          that will be mapped to the MFNs of the resource.

> >>>>>> +     *          If -EIO is returned then the frame_list has only been

> >>>>>> +     *          partially mapped and it is up to the caller to unmap all

> >>>>>> +     *          the GFNs.

> >>>>>> +     *          This parameter may be NULL if nr_frames is 0.

> >>>>>> +     */

> >>>>>> +    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;

> >>>>>

> >>>>> This is still xen_ulong_t, which I can live with, but then you shouldn't

> >>>>> copy into / out of arrays of other types in acquire_resource() (the

> >>>>> more that this is common code, and iirc xen_ulong_t and

> >>>>> unsigned long aren't the same thing on ARM32).

> >>>>

> >>>> xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't

> >>>> we use xen_pfn_t here?

> >>>

> >>> I had put this question up earlier, but iirc Paul didn't like it.

> >>

> >> I'd like to understand why Paul doesn't like it. We should never assume

> >> that a frame fit in xen_ulong_t. xen_pfn_t was exactly introduced for

> >> that purpose.

> >

> > My reservation is whether xen_pfn_t is intended to hold either gfns or

> mfns, since this hypercall uses the same array for both. If it suitable then I am

> happy to change it, but Andrew led me to believe otherwise.

> 

> Looking at the public hearders, xen_pfn_t is been used for both MFN (see

> xenpf_add_memtype) and GFN (see gnttab_setup_table).

> 

> So I think it would be fine to do the same here.


Yes, I'm going to change it in the next version.

  Cheers,

    Paul

> Cheers,

> 

> --

> Julien Grall
diff mbox

Patch

diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index 55437496f6..07cba8a15d 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -52,7 +52,8 @@  define(`create_domain_common', `
 			settime setdomainhandle getvcpucontext set_misc_info };
 	allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
 			set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-			psr_cmt_op psr_cat_op soft_reset set_gnttab_limits };
+			psr_cmt_op psr_cat_op soft_reset set_gnttab_limits
+			resource_map };
 	allow $1 $2:security check_context;
 	allow $1 $2:shadow enable;
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
@@ -152,6 +153,7 @@  define(`device_model', `
 	allow $1 $2_target:domain { getdomaininfo shutdown };
 	allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack };
 	allow $1 $2_target:hvm { getparam setparam hvmctl cacheattr dm };
+	allow $1 $2_target:domain2 resource_map;
 ')
 
 # make_device_model(priv, dm_dom, hvm_dom)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c72a3cdebb..71bb9b4f93 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1132,8 +1132,7 @@  static int set_typed_p2m_entry(struct domain *d, unsigned long gfn_l,
 }
 
 /* Set foreign mfn in the given guest's p2m table. */
-static int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
-                                 mfn_t mfn)
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
     return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
                                p2m_get_hostp2m(d)->default_access);
diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 35bb259808..7f2e2e3107 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -71,6 +71,7 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
             struct xen_remove_from_physmap *xrfp;
             struct xen_vnuma_topology_info *vnuma;
             struct xen_mem_access_op *mao;
+            struct xen_mem_acquire_resource *mar;
         } nat;
         union {
             struct compat_memory_reservation rsrv;
@@ -79,6 +80,7 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
             struct compat_add_to_physmap_batch atpb;
             struct compat_vnuma_topology_info vnuma;
             struct compat_mem_access_op mao;
+            struct compat_mem_acquire_resource mar;
         } cmp;
 
         set_xen_guest_handle(nat.hnd, COMPAT_ARG_XLAT_VIRT_BASE);
@@ -395,6 +397,57 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
         }
 #endif
 
+        case XENMEM_acquire_resource:
+        {
+            xen_ulong_t *xen_frame_list;
+            unsigned int max_nr_frames;
+
+            if ( copy_from_guest(&cmp.mar, compat, 1) )
+                return -EFAULT;
+
+            /*
+             * The number of frames handled is currently limited to a
+             * small number by the underlying implementation, so the
+             * scratch space should be sufficient for bouncing the
+             * frame addresses.
+             */
+            max_nr_frames = (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
+                sizeof(*xen_frame_list);
+
+            if ( cmp.mar.nr_frames > max_nr_frames )
+                return -E2BIG;
+
+            if ( compat_handle_is_null(cmp.mar.frame_list) )
+                xen_frame_list = NULL;
+            else
+            {
+                xen_frame_list = (xen_ulong_t *)(nat.mar + 1);
+
+                if ( !compat_handle_okay(cmp.mar.frame_list,
+                                         cmp.mar.nr_frames) )
+                    return -EFAULT;
+
+                for ( i = 0; i < cmp.mar.nr_frames; i++ )
+                {
+                    compat_ulong_t frame;
+
+                    if ( __copy_from_compat_offset(
+                             &frame, cmp.mar.frame_list, i, 1) )
+                        return -EFAULT;
+
+                    xen_frame_list[i] = frame;
+                }
+            }
+
+#define XLAT_mem_acquire_resource_HNDL_frame_list(_d_, _s_) \
+            set_xen_guest_handle((_d_)->frame_list, xen_frame_list)
+
+            XLAT_mem_acquire_resource(nat.mar, &cmp.mar);
+
+#undef XLAT_mem_acquire_resource_HNDL_frame_list
+
+            break;
+        }
         default:
             return compat_arch_memory_op(cmd, compat);
         }
@@ -535,6 +588,48 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                 rc = -EFAULT;
             break;
 
+        case XENMEM_acquire_resource:
+        {
+            const xen_ulong_t *xen_frame_list =
+                (xen_ulong_t *)(nat.mar + 1);
+            compat_ulong_t *compat_frame_list =
+                (compat_ulong_t *)(nat.mar + 1);
+
+            if ( cmp.mar.nr_frames == 0 )
+            {
+                DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
+
+                if ( __copy_field_to_guest(
+                         guest_handle_cast(compat,
+                                           compat_mem_acquire_resource_t),
+                         &cmp.mar, nr_frames) )
+                    return -EFAULT;
+            }
+            else
+            {
+                /*
+                 * NOTE: the smaller compat array overwrites the native
+                 *       array.
+                 */
+                for ( i = 0; i < cmp.mar.nr_frames; i++ )
+                {
+                    compat_ulong_t frame = xen_frame_list[i];
+
+                    if ( frame != xen_frame_list[i] )
+                        return -ERANGE;
+
+                    compat_frame_list[i] = frame;
+                }
+
+                if ( __copy_to_compat_offset(cmp.mar.frame_list, 0,
+                                             compat_frame_list,
+                                             cmp.mar.nr_frames) )
+                    return -EFAULT;
+            }
+
+            break;
+        }
+
         default:
             domain_crash(current->domain);
             split = 0;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index ad987e0f29..cdd2e030cf 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -965,6 +965,95 @@  static long xatp_permission_check(struct domain *d, unsigned int space)
     return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
 }
 
+static int acquire_resource(
+    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
+{
+    struct domain *d, *currd = current->domain;
+    xen_mem_acquire_resource_t xmar;
+    unsigned long mfn_list[2];
+    int rc;
+
+    if ( copy_from_guest(&xmar, arg, 1) )
+        return -EFAULT;
+
+    if ( xmar.pad != 0 )
+        return -EINVAL;
+
+    if ( guest_handle_is_null(xmar.frame_list) )
+    {
+        /* Special case for querying implementation limit */
+        if ( xmar.nr_frames == 0 )
+        {
+            xmar.nr_frames = ARRAY_SIZE(mfn_list);
+
+            if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
+                return -EFAULT;
+
+            return 0;
+        }
+
+        return -EINVAL;
+    }
+
+    if ( xmar.nr_frames == 0 )
+        return -EINVAL;
+
+    if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
+        return -E2BIG;
+
+    d = rcu_lock_domain_by_any_id(xmar.domid);
+    if ( d == NULL )
+        return -ESRCH;
+
+    rc = xsm_domain_resource_map(XSM_DM_PRIV, d);
+    if ( rc )
+        goto out;
+
+    switch ( xmar.type )
+    {
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    if ( rc )
+        goto out;
+
+    if ( !paging_mode_translate(currd) )
+    {
+        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
+            rc = -EFAULT;
+    }
+    else
+    {
+        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
+        unsigned int i;
+
+        rc = -EFAULT;
+        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
+            goto out;
+
+        for ( i = 0; i < xmar.nr_frames; i++ )
+        {
+            rc = set_foreign_p2m_entry(currd, gfn_list[i],
+                                       _mfn(mfn_list[i]));
+            if ( rc )
+            {
+                /*
+                 * Make sure rc is -EIO for any interation other than
+                 * the first.
+                 */
+                rc = (i != 0) ? -EIO : rc;
+                break;
+            }
+        }
+    }
+
+ out:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
 long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct domain *d, *curr_d = current->domain;
@@ -1406,6 +1495,11 @@  long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     }
 #endif
 
+    case XENMEM_acquire_resource:
+        rc = acquire_resource(
+            guest_handle_cast(arg, xen_mem_acquire_resource_t));
+        break;
+
     default:
         rc = arch_memory_op(cmd, arg);
         break;
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index faadcfe8fe..a5caa747ce 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -346,6 +346,12 @@  static inline gfn_t gfn_next_boundary(gfn_t gfn, unsigned int order)
     return gfn_add(gfn, 1UL << order);
 }
 
+static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
+                                        mfn_t mfn)
+{
+    return -EOPNOTSUPP;
+}
+
 #endif /* _XEN_P2M_H */
 
 /*
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 17b1d0c8d3..44f7ec088c 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -620,6 +620,9 @@  void p2m_memory_type_changed(struct domain *d);
 int p2m_is_logdirty_range(struct p2m_domain *, unsigned long start,
                           unsigned long end);
 
+/* Set foreign entry in the p2m table (for priv-mapping) */
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
+
 /* Set mmio addresses in the p2m table (for pass-through) */
 int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
                        unsigned int order, p2m_access_t access);
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 29386df98b..18118ea5c6 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -599,6 +599,47 @@  struct xen_reserved_device_memory_map {
 typedef struct xen_reserved_device_memory_map xen_reserved_device_memory_map_t;
 DEFINE_XEN_GUEST_HANDLE(xen_reserved_device_memory_map_t);
 
+/*
+ * Get the pages for a particular guest resource, so that they can be
+ * mapped directly by a tools domain.
+ */
+#define XENMEM_acquire_resource 28
+struct xen_mem_acquire_resource {
+    /* IN - the domain whose resource is to be mapped */
+    domid_t domid;
+    /* IN - the type of resource */
+    uint16_t type;
+    /*
+     * IN - a type-specific resource identifier, which must be zero
+     *      unless stated otherwise.
+     */
+    uint32_t id;
+    /* IN/OUT - As an IN parameter number of frames of the resource
+     *          to be mapped. However, if the specified value is 0 and
+     *          frame_list is NULL then this field will be set to the
+     *          maximum value supported by the implementation on return.
+     */
+    uint32_t nr_frames;
+    uint32_t pad;
+    /* IN - the index of the initial frame to be mapped. This parameter
+     *      is ignored if nr_frames is 0.
+     */
+    uint64_aligned_t frame;
+    /* IN/OUT - If the tools domain is PV then, upon return, frame_list
+     *          will be populated with the MFNs of the resource.
+     *          If the tools domain is HVM then it is expected that, on
+     *          entry, frame_list will be populated with a list of GFNs
+     *          that will be mapped to the MFNs of the resource.
+     *          If -EIO is returned then the frame_list has only been
+     *          partially mapped and it is up to the caller to unmap all
+     *          the GFNs.
+     *          This parameter may be NULL if nr_frames is 0.
+     */
+    XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
+};
+typedef struct xen_mem_acquire_resource xen_mem_acquire_resource_t;
+DEFINE_XEN_GUEST_HANDLE(xen_mem_acquire_resource_t);
+
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
 /*
@@ -650,7 +691,7 @@  struct xen_vnuma_topology_info {
 typedef struct xen_vnuma_topology_info xen_vnuma_topology_info_t;
 DEFINE_XEN_GUEST_HANDLE(xen_vnuma_topology_info_t);
 
-/* Next available subop number is 28 */
+/* Next available subop number is 29 */
 
 #endif /* __XEN_PUBLIC_MEMORY_H__ */
 
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 0f17000ea7..5835872334 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -83,6 +83,7 @@ 
 !	memory_map			memory.h
 !	memory_reservation		memory.h
 !	mem_access_op			memory.h
+!	mem_acquire_resource		memory.h
 !	pod_target			memory.h
 !	remove_from_physmap		memory.h
 !	reserved_device_memory_map	memory.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index ba89ea4bc1..8f04d59a3e 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -724,3 +724,9 @@  static XSM_INLINE int xsm_xen_version (XSM_DEFAULT_ARG uint32_t op)
         return xsm_default_action(XSM_PRIV, current->domain, NULL);
     }
 }
+
+static XSM_INLINE int xsm_domain_resource_map(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_DM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7f7feffc68..d0db860ae0 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -180,6 +180,7 @@  struct xsm_operations {
     int (*dm_op) (struct domain *d);
 #endif
     int (*xen_version) (uint32_t cmd);
+    int (*domain_resource_map) (struct domain *d);
 };
 
 #ifdef CONFIG_XSM
@@ -692,6 +693,11 @@  static inline int xsm_xen_version (xsm_default_t def, uint32_t op)
     return xsm_ops->xen_version(op);
 }
 
+static inline int xsm_domain_resource_map(xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->domain_resource_map(d);
+}
+
 #endif /* XSM_NO_WRAPPERS */
 
 #ifdef CONFIG_MULTIBOOT
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 479b103614..6e751199ee 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -157,4 +157,5 @@  void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, dm_op);
 #endif
     set_to_dummy_if_null(ops, xen_version);
+    set_to_dummy_if_null(ops, domain_resource_map);
 }
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 7b005af834..7e9efcd865 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1718,6 +1718,11 @@  static int flask_xen_version (uint32_t op)
     }
 }
 
+static int flask_domain_resource_map(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__RESOURCE_MAP);
+}
+
 long do_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 int compat_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 
@@ -1851,6 +1856,7 @@  static struct xsm_operations flask_ops = {
     .dm_op = flask_dm_op,
 #endif
     .xen_version = flask_xen_version,
+    .domain_resource_map = flask_domain_resource_map,
 };
 
 void __init flask_init(const void *policy_buffer, size_t policy_size)
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 3a2d863b8f..341ade1f7d 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -250,6 +250,8 @@  class domain2
     psr_cat_op
 # XEN_DOMCTL_set_gnttab_limits
     set_gnttab_limits
+# XENMEM_resource_map
+    resource_map
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains