diff mbox series

[V3,01/23] x86/ioreq: Prepare IOREQ feature for making it common

Message ID 1606732298-22107-2-git-send-email-olekstysh@gmail.com (mailing list archive)
State New, archived
Headers show
Series [V3,01/23] x86/ioreq: Prepare IOREQ feature for making it common | expand

Commit Message

Oleksandr Nov. 30, 2020, 10:31 a.m. UTC
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

As a lot of x86 code can be re-used on Arm later on, this
patch makes some preparation to x86/hvm/ioreq.c before moving
to the common code. This way we will get a verbatim copy
for a code movement in subsequent patch.

This patch mostly introduces specific hooks to abstract arch
specific materials taking into the account the requirment to leave
the "legacy" mechanism of mapping magic pages for the IOREQ servers
x86 specific and not expose it to the common code.

These hooks are named according to the more consistent new naming
scheme right away (including dropping the "hvm" prefixes and infixes):
- IOREQ server functions should start with "ioreq_server_"
- IOREQ functions should start with "ioreq_"
other functions will be renamed in subsequent patches.

It worth mentioning that a code which checks the return value of
p2m_set_ioreq_server() in hvm_map_mem_type_to_ioreq_server() was
folded into arch_ioreq_server_map_mem_type() for the clear split.
So the p2m_change_entry_type_global() is called with ioreq_server
lock held.

Also re-order #include-s alphabetically.

This support is going to be used on Arm to be able run device
emulator outside of Xen hypervisor.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch, was split from:
     "[RFC PATCH V1 01/12] hvm/ioreq: Make x86's IOREQ feature common"
   - fold the check of p->type into hvm_get_ioreq_server_range_type()
     and make it return success/failure
   - remove relocate_portio_handler() call from arch_hvm_ioreq_destroy()
     in arch/x86/hvm/ioreq.c
   - introduce arch_hvm_destroy_ioreq_server()/arch_handle_hvm_io_completion()

Changes V1 -> V2:
   - update patch description
   - make arch functions inline and put them into arch header
     to achieve a truly rename by the subsequent patch
   - return void in arch_hvm_destroy_ioreq_server()
   - return bool in arch_hvm_ioreq_destroy()
   - bring relocate_portio_handler() back to arch_hvm_ioreq_destroy()
   - rename IOREQ_IO* to IOREQ_STATUS*
   - remove *handle* from arch_handle_hvm_io_completion()
   - re-order #include-s alphabetically
   - rename hvm_get_ioreq_server_range_type() to hvm_ioreq_server_get_type_addr()
     and add "const" to several arguments

Changes V2 -> V3:
   - update patch description
   - name new arch hooks according to the new naming scheme
   - don't make arch hooks inline, move them ioreq.c
   - make get_ioreq_server() local again
   - rework the whole patch taking into the account that "legacy" interface
     should remain x86 specific (additional arch hooks, etc)
   - update the code to be able to use hvm_map_mem_type_to_ioreq_server()
     in the common code (an extra arch hook, etc)
   - don’t include <asm/hvm/emulate.h> from arch header
   - add "arch" prefix to hvm_ioreq_server_get_type_addr()
   - move IOREQ_STATUS_* #define-s introduction to the separate patch
   - move HANDLE_BUFIOREQ to the arch header
   - just return relocate_portio_handler() from arch_ioreq_server_destroy_all()
   - misc adjustments proposed by Jan (adding const, unsigned int instead of uint32_t)
---
---
 xen/arch/x86/hvm/ioreq.c        | 174 ++++++++++++++++++++++++++--------------
 xen/include/asm-x86/hvm/ioreq.h |  19 +++++
 2 files changed, 133 insertions(+), 60 deletions(-)

Comments

Alex Bennée Dec. 1, 2020, 11:03 a.m. UTC | #1
Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> As a lot of x86 code can be re-used on Arm later on, this
> patch makes some preparation to x86/hvm/ioreq.c before moving
> to the common code. This way we will get a verbatim copy
<snip>
>
> It worth mentioning that a code which checks the return value of
> p2m_set_ioreq_server() in hvm_map_mem_type_to_ioreq_server() was
> folded into arch_ioreq_server_map_mem_type() for the clear split.
> So the p2m_change_entry_type_global() is called with ioreq_server
> lock held.
<snip>
>  
> +/* Called with ioreq_server lock held */
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct hvm_ioreq_server *s,
> +                                   uint32_t flags)
> +{
> +    int rc = p2m_set_ioreq_server(d, flags, s);
> +
> +    if ( rc == 0 && flags == 0 )
> +    {
> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +        if ( read_atomic(&p2m->ioreq.entry_count) )
> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> +    }
> +
> +    return rc;
> +}
> +
>  /*
>   * Map or unmap an ioreq server to specific memory type. For now, only
>   * HVMMEM_ioreq_server is supported, and in the future new types can be
> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>      if ( s->emulator != current->domain )
>          goto out;
>  
> -    rc = p2m_set_ioreq_server(d, flags, s);
> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>  
>   out:
>      spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>  
> -    if ( rc == 0 && flags == 0 )
> -    {
> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -
> -        if ( read_atomic(&p2m->ioreq.entry_count) )
> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> -    }
> -

It should be noted that p2m holds it's own lock but I'm unfamiliar with
Xen's locking architecture. Is there anything that prevents another vCPU
accessing a page that is also being used my ioreq on the first vCPU?

Assuming that deadlock isn't a possibility to my relatively untrained
eye this looks good to me:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Oleksandr Dec. 1, 2020, 6:53 p.m. UTC | #2
On 01.12.20 13:03, Alex Bennée wrote:

Hi Alex

> Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
>
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> As a lot of x86 code can be re-used on Arm later on, this
>> patch makes some preparation to x86/hvm/ioreq.c before moving
>> to the common code. This way we will get a verbatim copy
> <snip>
>> It worth mentioning that a code which checks the return value of
>> p2m_set_ioreq_server() in hvm_map_mem_type_to_ioreq_server() was
>> folded into arch_ioreq_server_map_mem_type() for the clear split.
>> So the p2m_change_entry_type_global() is called with ioreq_server
>> lock held.
> <snip>
>>   
>> +/* Called with ioreq_server lock held */
>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>> +                                   struct hvm_ioreq_server *s,
>> +                                   uint32_t flags)
>> +{
>> +    int rc = p2m_set_ioreq_server(d, flags, s);
>> +
>> +    if ( rc == 0 && flags == 0 )
>> +    {
>> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +        if ( read_atomic(&p2m->ioreq.entry_count) )
>> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>>   /*
>>    * Map or unmap an ioreq server to specific memory type. For now, only
>>    * HVMMEM_ioreq_server is supported, and in the future new types can be
>> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>       if ( s->emulator != current->domain )
>>           goto out;
>>   
>> -    rc = p2m_set_ioreq_server(d, flags, s);
>> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>>   
>>    out:
>>       spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>   
>> -    if ( rc == 0 && flags == 0 )
>> -    {
>> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> -
>> -        if ( read_atomic(&p2m->ioreq.entry_count) )
>> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>> -    }
>> -
> It should be noted that p2m holds it's own lock but I'm unfamiliar with
> Xen's locking architecture. Is there anything that prevents another vCPU
> accessing a page that is also being used my ioreq on the first vCPU?
I am not sure that I would be able to provide reasonable explanations here.
All what I understand is that p2m_change_entry_type_global() x86 
specific (we don't have p2m_ioreq_server concept on Arm) and should 
remain as such (not exposed to the common code).
IIRC, I raised a question during V2 review whether we could have ioreq 
server lock around the call to p2m_change_entry_type_global() and didn't 
get objections. I may mistake, but looks like the lock being used
in p2m_change_entry_type_global() is yet another lock for protecting 
page table operations, so unlikely we could get into the trouble calling 
this function with the ioreq server lock held.


>
> Assuming that deadlock isn't a possibility to my relatively untrained
> eye this looks good to me:
>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Thank you.
Alex Bennée Dec. 1, 2020, 7:36 p.m. UTC | #3
Oleksandr <olekstysh@gmail.com> writes:

> On 01.12.20 13:03, Alex Bennée wrote:
>
> Hi Alex
>
>> Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
>>
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> As a lot of x86 code can be re-used on Arm later on, this
>>> patch makes some preparation to x86/hvm/ioreq.c before moving
>>> to the common code. This way we will get a verbatim copy
>> <snip>
>>> It worth mentioning that a code which checks the return value of
>>> p2m_set_ioreq_server() in hvm_map_mem_type_to_ioreq_server() was
>>> folded into arch_ioreq_server_map_mem_type() for the clear split.
>>> So the p2m_change_entry_type_global() is called with ioreq_server
>>> lock held.
>> <snip>
>>>   
>>> +/* Called with ioreq_server lock held */
>>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>>> +                                   struct hvm_ioreq_server *s,
>>> +                                   uint32_t flags)
>>> +{
>>> +    int rc = p2m_set_ioreq_server(d, flags, s);
>>> +
>>> +    if ( rc == 0 && flags == 0 )
>>> +    {
>>> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> +
>>> +        if ( read_atomic(&p2m->ioreq.entry_count) )
>>> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>> +    }
>>> +
>>> +    return rc;
>>> +}
>>> +
>>>   /*
>>>    * Map or unmap an ioreq server to specific memory type. For now, only
>>>    * HVMMEM_ioreq_server is supported, and in the future new types can be
>>> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>>       if ( s->emulator != current->domain )
>>>           goto out;
>>>   
>>> -    rc = p2m_set_ioreq_server(d, flags, s;)
>>> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>>>   
>>>    out:
>>>       spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>>   
>>> -    if ( rc == 0 && flags == 0 )
>>> -    {
>>> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> -
>>> -        if ( read_atomic(&p2m->ioreq.entry_count) )
>>> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>> -    }
>>> -
>> It should be noted that p2m holds it's own lock but I'm unfamiliar with
>> Xen's locking architecture. Is there anything that prevents another vCPU
>> accessing a page that is also being used my ioreq on the first vCPU?
> I am not sure that I would be able to provide reasonable explanations here.
> All what I understand is that p2m_change_entry_type_global() x86 
> specific (we don't have p2m_ioreq_server concept on Arm) and should 
> remain as such (not exposed to the common code).
> IIRC, I raised a question during V2 review whether we could have ioreq 
> server lock around the call to p2m_change_entry_type_global() and didn't 
> get objections. I may mistake, but looks like the lock being used
> in p2m_change_entry_type_global() is yet another lock for protecting 
> page table operations, so unlikely we could get into the trouble calling 
> this function with the ioreq server lock held.

The p2m lock code looks designed to be recursive so I could only
envision a problem where a page somehow racing to lock under the ioreq
lock which I don't think is possible. However reasoning about locking is
hard if your not familiar - it's one reason we added Promela/Spin [1] models to
QEMU for our various locking regimes.


[1] http://spinroot.com/spin/whatispin.html
[2] https://git.qemu.org/?p=qemu.git;a=tree;f=docs/spin;h=cc168025131676429a560ca70d7234a56f958092;hb=HEAD

>
>
>>
>> Assuming that deadlock isn't a possibility to my relatively untrained
>> eye this looks good to me:
>>
>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>
> Thank you.
Jan Beulich Dec. 2, 2020, 8 a.m. UTC | #4
On 01.12.2020 19:53, Oleksandr wrote:
> On 01.12.20 13:03, Alex Bennée wrote:
>> Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
>>> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>>       if ( s->emulator != current->domain )
>>>           goto out;
>>>   
>>> -    rc = p2m_set_ioreq_server(d, flags, s);
>>> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>>>   
>>>    out:
>>>       spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>>   
>>> -    if ( rc == 0 && flags == 0 )
>>> -    {
>>> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> -
>>> -        if ( read_atomic(&p2m->ioreq.entry_count) )
>>> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>> -    }
>>> -
>> It should be noted that p2m holds it's own lock but I'm unfamiliar with
>> Xen's locking architecture. Is there anything that prevents another vCPU
>> accessing a page that is also being used my ioreq on the first vCPU?
> I am not sure that I would be able to provide reasonable explanations here.
> All what I understand is that p2m_change_entry_type_global() x86 
> specific (we don't have p2m_ioreq_server concept on Arm) and should 
> remain as such (not exposed to the common code).
> IIRC, I raised a question during V2 review whether we could have ioreq 
> server lock around the call to p2m_change_entry_type_global() and didn't 
> get objections.

Not getting objections doesn't mean much. Personally I don't recall
such a question, but this doesn't mean much. The important thing
here is that you properly justify this change in the description (I
didn't look at this version of the patch as a whole yet, so quite
likely you actually do). This is because you need to guarantee that
you don't introduce any lock order violations by this. There also
should be an attempt to avoid future introduction of issues, by
adding lock nesting related comments in suitable places. Again,
quite likely you actually do so, and I will notice it once looking
at the patch as a whole.

All of this said, I think it should be tried hard to avoid
introducing this extra lock nesting, if there aren't other places
already where the same nesting of locks is in effect.

> I may mistake, but looks like the lock being used
> in p2m_change_entry_type_global() is yet another lock for protecting 
> page table operations, so unlikely we could get into the trouble calling 
> this function with the ioreq server lock held.

I'm afraid I don't understand the "yet another" here: The ioreq
server lock clearly serves an entirely different purpose.

Jan
Oleksandr Dec. 2, 2020, 11:19 a.m. UTC | #5
On 02.12.20 10:00, Jan Beulich wrote:

Hi Jan

> On 01.12.2020 19:53, Oleksandr wrote:
>> On 01.12.20 13:03, Alex Bennée wrote:
>>> Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
>>>> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>>>        if ( s->emulator != current->domain )
>>>>            goto out;
>>>>    
>>>> -    rc = p2m_set_ioreq_server(d, flags, s);
>>>> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>>>>    
>>>>     out:
>>>>        spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>>>    
>>>> -    if ( rc == 0 && flags == 0 )
>>>> -    {
>>>> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>> -
>>>> -        if ( read_atomic(&p2m->ioreq.entry_count) )
>>>> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>>> -    }
>>>> -
>>> It should be noted that p2m holds it's own lock but I'm unfamiliar with
>>> Xen's locking architecture. Is there anything that prevents another vCPU
>>> accessing a page that is also being used my ioreq on the first vCPU?
>> I am not sure that I would be able to provide reasonable explanations here.
>> All what I understand is that p2m_change_entry_type_global() x86
>> specific (we don't have p2m_ioreq_server concept on Arm) and should
>> remain as such (not exposed to the common code).
>> IIRC, I raised a question during V2 review whether we could have ioreq
>> server lock around the call to p2m_change_entry_type_global() and didn't
>> get objections.
> Not getting objections doesn't mean much. Personally I don't recall
> such a question, but this doesn't mean much.

Sorry for not being clear here. Discussion happened at [1] when I was 
asked to move hvm_map_mem_type_to_ioreq_server() to the common code.


>   The important thing
> here is that you properly justify this change in the description (I
> didn't look at this version of the patch as a whole yet, so quite
> likely you actually do). This is because you need to guarantee that
> you don't introduce any lock order violations by this.
Yes, almost all changes in this patch are mostly mechanical and leave 
things as they are.
The change with p2m_change_entry_type_global() requires an additional 
attention, so decided to put emphasis on touching that
in the description and add a comment in the code that it is called with 
ioreq_server lock held.


[1] 
https://patchwork.kernel.org/project/xen-devel/patch/1602780274-29141-2-git-send-email-olekstysh@gmail.com/#23734839
Jan Beulich Dec. 7, 2020, 11:13 a.m. UTC | #6
On 30.11.2020 11:31, Oleksandr Tyshchenko wrote:
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -17,15 +17,15 @@
>   */
>  
>  #include <xen/ctype.h>
> +#include <xen/domain.h>
> +#include <xen/event.h>
>  #include <xen/init.h>
> +#include <xen/irq.h>
>  #include <xen/lib.h>
> -#include <xen/trace.h>
> +#include <xen/paging.h>
>  #include <xen/sched.h>
> -#include <xen/irq.h>
>  #include <xen/softirq.h>
> -#include <xen/domain.h>
> -#include <xen/event.h>
> -#include <xen/paging.h>
> +#include <xen/trace.h>
>  #include <xen/vpci.h>

Seeing this consolidation (thanks!), have you been able to figure
out what xen/ctype.h is needed for here? It looks to me as if it
could be dropped at the same time.

> @@ -601,7 +610,7 @@ static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
>      return rc;
>  }
>  
> -static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
>  {
>      hvm_unmap_ioreq_gfn(s, true);
>      hvm_unmap_ioreq_gfn(s, false);

How is this now different from ...

> @@ -674,6 +683,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>      return rc;
>  }
>  
> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
> +{
> +    hvm_remove_ioreq_gfn(s, false);
> +    hvm_remove_ioreq_gfn(s, true);
> +}

... this? Imo if at all possible there should be no such duplication
(i.e. at least have this one simply call the earlier one).

> @@ -1080,6 +1105,24 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>      return rc;
>  }
>  
> +/* Called with ioreq_server lock held */
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct hvm_ioreq_server *s,
> +                                   uint32_t flags)
> +{
> +    int rc = p2m_set_ioreq_server(d, flags, s);
> +
> +    if ( rc == 0 && flags == 0 )
> +    {
> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +        if ( read_atomic(&p2m->ioreq.entry_count) )
> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> +    }
> +
> +    return rc;
> +}
> +
>  /*
>   * Map or unmap an ioreq server to specific memory type. For now, only
>   * HVMMEM_ioreq_server is supported, and in the future new types can be
> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>      if ( s->emulator != current->domain )
>          goto out;
>  
> -    rc = p2m_set_ioreq_server(d, flags, s);
> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>  
>   out:
>      spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>  
> -    if ( rc == 0 && flags == 0 )
> -    {
> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -
> -        if ( read_atomic(&p2m->ioreq.entry_count) )
> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> -    }
> -
>      return rc;
>  }

While you mention this change in the description, I'm still
missing justification as to why the change is safe to make. I
continue to think p2m_change_entry_type_global() would better
not be called inside the locked region, if at all possible.

> @@ -1239,33 +1279,28 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
>      spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>  }
>  
> -struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
> -                                                 ioreq_t *p)
> +int arch_ioreq_server_get_type_addr(const struct domain *d,
> +                                    const ioreq_t *p,
> +                                    uint8_t *type,
> +                                    uint64_t *addr)
>  {
> -    struct hvm_ioreq_server *s;
> -    uint32_t cf8;
> -    uint8_t type;
> -    uint64_t addr;
> -    unsigned int id;
> +    unsigned int cf8 = d->arch.hvm.pci_cf8;
>  
>      if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
> -        return NULL;
> -
> -    cf8 = d->arch.hvm.pci_cf8;
> +        return -EINVAL;

The caller cares about only a boolean. Either make the function
return bool, or (imo better, but others may like this less) have
it return "type" instead of using indirection, using e.g.
negative values to identify errors (which then could still be
errno ones if you wish).

> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -19,6 +19,25 @@
>  #ifndef __ASM_X86_HVM_IOREQ_H__
>  #define __ASM_X86_HVM_IOREQ_H__
>  
> +#define HANDLE_BUFIOREQ(s) \
> +    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
> +
> +bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
> +int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct hvm_ioreq_server *s,
> +                                   uint32_t flags);
> +bool arch_ioreq_server_destroy_all(struct domain *d);
> +int arch_ioreq_server_get_type_addr(const struct domain *d,
> +                                    const ioreq_t *p,
> +                                    uint8_t *type,
> +                                    uint64_t *addr);
> +void arch_ioreq_domain_init(struct domain *d);
> +
>  bool hvm_io_pending(struct vcpu *v);
>  bool handle_hvm_io_completion(struct vcpu *v);
>  bool is_ioreq_server_page(struct domain *d, const struct page_info *page);

What's the plan here? Introduce them into the x86 header just
to later move the entire block into the common one? Wouldn't
it make sense to introduce the common header here right away?
Or do you expect to convert some of the simpler ones to inline
functions later on?

Jan
Oleksandr Dec. 7, 2020, 3:27 p.m. UTC | #7
On 07.12.20 13:13, Jan Beulich wrote:

Hi Jan

> On 30.11.2020 11:31, Oleksandr Tyshchenko wrote:
>> --- a/xen/arch/x86/hvm/ioreq.c
>> +++ b/xen/arch/x86/hvm/ioreq.c
>> @@ -17,15 +17,15 @@
>>    */
>>   
>>   #include <xen/ctype.h>
>> +#include <xen/domain.h>
>> +#include <xen/event.h>
>>   #include <xen/init.h>
>> +#include <xen/irq.h>
>>   #include <xen/lib.h>
>> -#include <xen/trace.h>
>> +#include <xen/paging.h>
>>   #include <xen/sched.h>
>> -#include <xen/irq.h>
>>   #include <xen/softirq.h>
>> -#include <xen/domain.h>
>> -#include <xen/event.h>
>> -#include <xen/paging.h>
>> +#include <xen/trace.h>
>>   #include <xen/vpci.h>
> Seeing this consolidation (thanks!), have you been able to figure
> out what xen/ctype.h is needed for here? It looks to me as if it
> could be dropped at the same time.

Not yet, but will re-check.


>
>> @@ -601,7 +610,7 @@ static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
>>       return rc;
>>   }
>>   
>> -static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
>> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
>>   {
>>       hvm_unmap_ioreq_gfn(s, true);
>>       hvm_unmap_ioreq_gfn(s, false);
> How is this now different from ...
>
>> @@ -674,6 +683,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>>       return rc;
>>   }
>>   
>> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
>> +{
>> +    hvm_remove_ioreq_gfn(s, false);
>> +    hvm_remove_ioreq_gfn(s, true);
>> +}
> ... this? Imo if at all possible there should be no such duplication
> (i.e. at least have this one simply call the earlier one).

I am afraid, I don't see any duplication between mentioned functions. 
Would you mind explaining?


>
>> @@ -1080,6 +1105,24 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>>       return rc;
>>   }
>>   
>> +/* Called with ioreq_server lock held */
>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>> +                                   struct hvm_ioreq_server *s,
>> +                                   uint32_t flags)
>> +{
>> +    int rc = p2m_set_ioreq_server(d, flags, s);
>> +
>> +    if ( rc == 0 && flags == 0 )
>> +    {
>> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +        if ( read_atomic(&p2m->ioreq.entry_count) )
>> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>>   /*
>>    * Map or unmap an ioreq server to specific memory type. For now, only
>>    * HVMMEM_ioreq_server is supported, and in the future new types can be
>> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>       if ( s->emulator != current->domain )
>>           goto out;
>>   
>> -    rc = p2m_set_ioreq_server(d, flags, s);
>> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>>   
>>    out:
>>       spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>   
>> -    if ( rc == 0 && flags == 0 )
>> -    {
>> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> -
>> -        if ( read_atomic(&p2m->ioreq.entry_count) )
>> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>> -    }
>> -
>>       return rc;
>>   }
> While you mention this change in the description, I'm still
> missing justification as to why the change is safe to make. I
> continue to think p2m_change_entry_type_global() would better
> not be called inside the locked region, if at all possible.
Well. I am afraid, I don't have a 100% justification why the change is 
safe to make as well
as I don't see an obvious reason why it is not safe to make (at least I 
didn't find a possible deadlock scenario while investigating the code).
I raised a question earlier whether I can fold this check in, which 
implied calling p2m_change_entry_type_global() with ioreq_server lock held.


If there is a concern with calling this inside the locked region 
(unfortunately still unclear for me at the moment), I will try to find 
another way how to split hvm_map_mem_type_to_ioreq_server() without
potentially unsafe change here *and* exposing 
p2m_change_entry_type_global() to the common code. Right now, I don't 
have any ideas how this could be split other than
introducing one more hook here to deal with p2m_change_entry_type_global 
(probably arch_ioreq_server_map_mem_type_complete?), but I don't expect 
it to be accepted.
I appreciate any ideas on that.
>
>> @@ -1239,33 +1279,28 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
>>       spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>   }
>>   
>> -struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>> -                                                 ioreq_t *p)
>> +int arch_ioreq_server_get_type_addr(const struct domain *d,
>> +                                    const ioreq_t *p,
>> +                                    uint8_t *type,
>> +                                    uint64_t *addr)
>>   {
>> -    struct hvm_ioreq_server *s;
>> -    uint32_t cf8;
>> -    uint8_t type;
>> -    uint64_t addr;
>> -    unsigned int id;
>> +    unsigned int cf8 = d->arch.hvm.pci_cf8;
>>   
>>       if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
>> -        return NULL;
>> -
>> -    cf8 = d->arch.hvm.pci_cf8;
>> +        return -EINVAL;
> The caller cares about only a boolean. Either make the function
> return bool, or (imo better, but others may like this less) have
> it return "type" instead of using indirection, using e.g.
> negative values to identify errors (which then could still be
> errno ones if you wish).

Makes sense. I will probably make the function return bool. Even if 
return "type" we will still have an indirection for "addr".


>
>> --- a/xen/include/asm-x86/hvm/ioreq.h
>> +++ b/xen/include/asm-x86/hvm/ioreq.h
>> @@ -19,6 +19,25 @@
>>   #ifndef __ASM_X86_HVM_IOREQ_H__
>>   #define __ASM_X86_HVM_IOREQ_H__
>>   
>> +#define HANDLE_BUFIOREQ(s) \
>> +    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
>> +
>> +bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
>> +int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>> +                                   struct hvm_ioreq_server *s,
>> +                                   uint32_t flags);
>> +bool arch_ioreq_server_destroy_all(struct domain *d);
>> +int arch_ioreq_server_get_type_addr(const struct domain *d,
>> +                                    const ioreq_t *p,
>> +                                    uint8_t *type,
>> +                                    uint64_t *addr);
>> +void arch_ioreq_domain_init(struct domain *d);
>> +
>>   bool hvm_io_pending(struct vcpu *v);
>>   bool handle_hvm_io_completion(struct vcpu *v);
>>   bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
> What's the plan here? Introduce them into the x86 header just
> to later move the entire block into the common one? Wouldn't
> it make sense to introduce the common header here right away?
> Or do you expect to convert some of the simpler ones to inline
> functions later on?
The former. The subsequent patch is moving the the entire block(s) from 
here and from x86/hvm/ioreq.c to the common code in one go.
I thought it was a little bit odd to expose a header before exposing an 
implementation to the common code. Another reason is to minimize places 
that need touching by current patch.
After all, this is done within single series and without breakage in 
between. But, if introducing the common header right away will make 
patch more cleaner and correct I am absolutely OK and happy to update a 
patch. Shall I?
Jan Beulich Dec. 7, 2020, 4:29 p.m. UTC | #8
On 07.12.2020 16:27, Oleksandr wrote:
> On 07.12.20 13:13, Jan Beulich wrote:
>> On 30.11.2020 11:31, Oleksandr Tyshchenko wrote:
>>> @@ -601,7 +610,7 @@ static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
>>>       return rc;
>>>   }
>>>   
>>> -static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
>>> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
>>>   {
>>>       hvm_unmap_ioreq_gfn(s, true);
>>>       hvm_unmap_ioreq_gfn(s, false);
>> How is this now different from ...
>>
>>> @@ -674,6 +683,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>>>       return rc;
>>>   }
>>>   
>>> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
>>> +{
>>> +    hvm_remove_ioreq_gfn(s, false);
>>> +    hvm_remove_ioreq_gfn(s, true);
>>> +}
>> ... this? Imo if at all possible there should be no such duplication
>> (i.e. at least have this one simply call the earlier one).
> 
> I am afraid, I don't see any duplication between mentioned functions. 
> Would you mind explaining?

Ouch - somehow my eyes considered "unmap" == "remove". I'm sorry
for the noise.

>>> @@ -1080,6 +1105,24 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>>>       return rc;
>>>   }
>>>   
>>> +/* Called with ioreq_server lock held */
>>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>>> +                                   struct hvm_ioreq_server *s,
>>> +                                   uint32_t flags)
>>> +{
>>> +    int rc = p2m_set_ioreq_server(d, flags, s);
>>> +
>>> +    if ( rc == 0 && flags == 0 )
>>> +    {
>>> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> +
>>> +        if ( read_atomic(&p2m->ioreq.entry_count) )
>>> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>> +    }
>>> +
>>> +    return rc;
>>> +}
>>> +
>>>   /*
>>>    * Map or unmap an ioreq server to specific memory type. For now, only
>>>    * HVMMEM_ioreq_server is supported, and in the future new types can be
>>> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>>       if ( s->emulator != current->domain )
>>>           goto out;
>>>   
>>> -    rc = p2m_set_ioreq_server(d, flags, s);
>>> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>>>   
>>>    out:
>>>       spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>>   
>>> -    if ( rc == 0 && flags == 0 )
>>> -    {
>>> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> -
>>> -        if ( read_atomic(&p2m->ioreq.entry_count) )
>>> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>> -    }
>>> -
>>>       return rc;
>>>   }
>> While you mention this change in the description, I'm still
>> missing justification as to why the change is safe to make. I
>> continue to think p2m_change_entry_type_global() would better
>> not be called inside the locked region, if at all possible.
> Well. I am afraid, I don't have a 100% justification why the change is 
> safe to make as well
> as I don't see an obvious reason why it is not safe to make (at least I 
> didn't find a possible deadlock scenario while investigating the code).
> I raised a question earlier whether I can fold this check in, which 
> implied calling p2m_change_entry_type_global() with ioreq_server lock held.

I'm aware of the earlier discussion. But "didn't find" isn't good
enough in a case like this, and since it's likely hard to indeed
prove there's no deadlock possible, I think it's best to avoid
having to provide such a proof by avoiding the nesting.

> If there is a concern with calling this inside the locked region 
> (unfortunately still unclear for me at the moment), I will try to find 
> another way how to split hvm_map_mem_type_to_ioreq_server() without
> potentially unsafe change here *and* exposing 
> p2m_change_entry_type_global() to the common code. Right now, I don't 
> have any ideas how this could be split other than
> introducing one more hook here to deal with p2m_change_entry_type_global 
> (probably arch_ioreq_server_map_mem_type_complete?), but I don't expect 
> it to be accepted.
> I appreciate any ideas on that.

Is there a reason why the simplest solution (two independent
arch_*() calls) won't do? If so, what are the constraints?
Can the first one e.g. somehow indicate what needs to happen
after the lock was dropped? But the two calls look independent
right now, so I don't see any complicating factors.

>>> --- a/xen/include/asm-x86/hvm/ioreq.h
>>> +++ b/xen/include/asm-x86/hvm/ioreq.h
>>> @@ -19,6 +19,25 @@
>>>   #ifndef __ASM_X86_HVM_IOREQ_H__
>>>   #define __ASM_X86_HVM_IOREQ_H__
>>>   
>>> +#define HANDLE_BUFIOREQ(s) \
>>> +    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
>>> +
>>> +bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
>>> +int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
>>> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
>>> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
>>> +void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
>>> +void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
>>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>>> +                                   struct hvm_ioreq_server *s,
>>> +                                   uint32_t flags);
>>> +bool arch_ioreq_server_destroy_all(struct domain *d);
>>> +int arch_ioreq_server_get_type_addr(const struct domain *d,
>>> +                                    const ioreq_t *p,
>>> +                                    uint8_t *type,
>>> +                                    uint64_t *addr);
>>> +void arch_ioreq_domain_init(struct domain *d);
>>> +
>>>   bool hvm_io_pending(struct vcpu *v);
>>>   bool handle_hvm_io_completion(struct vcpu *v);
>>>   bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
>> What's the plan here? Introduce them into the x86 header just
>> to later move the entire block into the common one? Wouldn't
>> it make sense to introduce the common header here right away?
>> Or do you expect to convert some of the simpler ones to inline
>> functions later on?
> The former. The subsequent patch is moving the the entire block(s) from 
> here and from x86/hvm/ioreq.c to the common code in one go.

I think I saw it move the _other_ pieces there, and this block
left here. (FAOD my comment is about the arch_*() declarations
you add, not the patch context in view.)

> I thought it was a little bit odd to expose a header before exposing an 
> implementation to the common code. Another reason is to minimize places 
> that need touching by current patch.

By exposing arch_*() declarations you don't give the impression
of exposing any "implementation". These are helpers the
implementation is to invoke; I'm fine with you moving the
declarations of the functions actually constituting this
component's external interface only once you also move the
implementation to common code.

Jan
Oleksandr Dec. 7, 2020, 5:21 p.m. UTC | #9
Hi Jan


>>>> @@ -1080,6 +1105,24 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>>>>        return rc;
>>>>    }
>>>>    
>>>> +/* Called with ioreq_server lock held */
>>>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>>>> +                                   struct hvm_ioreq_server *s,
>>>> +                                   uint32_t flags)
>>>> +{
>>>> +    int rc = p2m_set_ioreq_server(d, flags, s);
>>>> +
>>>> +    if ( rc == 0 && flags == 0 )
>>>> +    {
>>>> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>> +
>>>> +        if ( read_atomic(&p2m->ioreq.entry_count) )
>>>> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>>> +    }
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>>    /*
>>>>     * Map or unmap an ioreq server to specific memory type. For now, only
>>>>     * HVMMEM_ioreq_server is supported, and in the future new types can be
>>>> @@ -1112,19 +1155,11 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>>>        if ( s->emulator != current->domain )
>>>>            goto out;
>>>>    
>>>> -    rc = p2m_set_ioreq_server(d, flags, s);
>>>> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
>>>>    
>>>>     out:
>>>>        spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>>>>    
>>>> -    if ( rc == 0 && flags == 0 )
>>>> -    {
>>>> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>> -
>>>> -        if ( read_atomic(&p2m->ioreq.entry_count) )
>>>> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>>> -    }
>>>> -
>>>>        return rc;
>>>>    }
>>> While you mention this change in the description, I'm still
>>> missing justification as to why the change is safe to make. I
>>> continue to think p2m_change_entry_type_global() would better
>>> not be called inside the locked region, if at all possible.
>> Well. I am afraid, I don't have a 100% justification why the change is
>> safe to make as well
>> as I don't see an obvious reason why it is not safe to make (at least I
>> didn't find a possible deadlock scenario while investigating the code).
>> I raised a question earlier whether I can fold this check in, which
>> implied calling p2m_change_entry_type_global() with ioreq_server lock held.
> I'm aware of the earlier discussion. But "didn't find" isn't good
> enough in a case like this, and since it's likely hard to indeed
> prove there's no deadlock possible, I think it's best to avoid
> having to provide such a proof by avoiding the nesting.

Agree here.


>
>> If there is a concern with calling this inside the locked region
>> (unfortunately still unclear for me at the moment), I will try to find
>> another way how to split hvm_map_mem_type_to_ioreq_server() without
>> potentially unsafe change here *and* exposing
>> p2m_change_entry_type_global() to the common code. Right now, I don't
>> have any ideas how this could be split other than
>> introducing one more hook here to deal with p2m_change_entry_type_global
>> (probably arch_ioreq_server_map_mem_type_complete?), but I don't expect
>> it to be accepted.
>> I appreciate any ideas on that.
> Is there a reason why the simplest solution (two independent
> arch_*() calls) won't do? If so, what are the constraints?

There is no reason.


> Can the first one e.g. somehow indicate what needs to happen
> after the lock was dropped?

I think, yes.


> But the two calls look independent
> right now, so I don't see any complicating factors.

ok, will go "two independent arch hooks" route then.

Thank you for the idea.
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 1cc27df..e3dfb49 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -17,15 +17,15 @@ 
  */
 
 #include <xen/ctype.h>
+#include <xen/domain.h>
+#include <xen/event.h>
 #include <xen/init.h>
+#include <xen/irq.h>
 #include <xen/lib.h>
-#include <xen/trace.h>
+#include <xen/paging.h>
 #include <xen/sched.h>
-#include <xen/irq.h>
 #include <xen/softirq.h>
-#include <xen/domain.h>
-#include <xen/event.h>
-#include <xen/paging.h>
+#include <xen/trace.h>
 #include <xen/vpci.h>
 
 #include <asm/hvm/emulate.h>
@@ -170,6 +170,29 @@  static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
     return true;
 }
 
+bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion)
+{
+    switch ( io_completion )
+    {
+    case HVMIO_realmode_completion:
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
+        vmx_realmode_emulate_one(&ctxt);
+        hvm_emulate_writeback(&ctxt);
+
+        break;
+    }
+
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
+
+    return true;
+}
+
 bool handle_hvm_io_completion(struct vcpu *v)
 {
     struct domain *d = v->domain;
@@ -209,19 +232,8 @@  bool handle_hvm_io_completion(struct vcpu *v)
         return handle_pio(vio->io_req.addr, vio->io_req.size,
                           vio->io_req.dir);
 
-    case HVMIO_realmode_completion:
-    {
-        struct hvm_emulate_ctxt ctxt;
-
-        hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
-        vmx_realmode_emulate_one(&ctxt);
-        hvm_emulate_writeback(&ctxt);
-
-        break;
-    }
     default:
-        ASSERT_UNREACHABLE();
-        break;
+        return arch_vcpu_ioreq_completion(io_completion);
     }
 
     return true;
@@ -477,9 +489,6 @@  static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
     }
 }
 
-#define HANDLE_BUFIOREQ(s) \
-    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
-
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
                                      struct vcpu *v)
 {
@@ -586,7 +595,7 @@  static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
+int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s)
 {
     int rc;
 
@@ -601,7 +610,7 @@  static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
     return rc;
 }
 
-static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
+void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
 {
     hvm_unmap_ioreq_gfn(s, true);
     hvm_unmap_ioreq_gfn(s, false);
@@ -674,6 +683,12 @@  static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
     return rc;
 }
 
+void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
+{
+    hvm_remove_ioreq_gfn(s, false);
+    hvm_remove_ioreq_gfn(s, true);
+}
+
 static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
 {
     struct hvm_ioreq_vcpu *sv;
@@ -683,8 +698,7 @@  static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
     if ( s->enabled )
         goto done;
 
-    hvm_remove_ioreq_gfn(s, false);
-    hvm_remove_ioreq_gfn(s, true);
+    arch_ioreq_server_enable(s);
 
     s->enabled = true;
 
@@ -697,6 +711,12 @@  static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
+void arch_ioreq_server_disable(struct hvm_ioreq_server *s)
+{
+    hvm_add_ioreq_gfn(s, true);
+    hvm_add_ioreq_gfn(s, false);
+}
+
 static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
 {
     spin_lock(&s->lock);
@@ -704,8 +724,7 @@  static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
     if ( !s->enabled )
         goto done;
 
-    hvm_add_ioreq_gfn(s, true);
-    hvm_add_ioreq_gfn(s, false);
+    arch_ioreq_server_disable(s);
 
     s->enabled = false;
 
@@ -750,7 +769,7 @@  static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 
  fail_add:
     hvm_ioreq_server_remove_all_vcpus(s);
-    hvm_ioreq_server_unmap_pages(s);
+    arch_ioreq_server_unmap_pages(s);
 
     hvm_ioreq_server_free_rangesets(s);
 
@@ -764,7 +783,7 @@  static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
     hvm_ioreq_server_remove_all_vcpus(s);
 
     /*
-     * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
+     * NOTE: It is safe to call both arch_ioreq_server_unmap_pages() and
      *       hvm_ioreq_server_free_pages() in that order.
      *       This is because the former will do nothing if the pages
      *       are not mapped, leaving the page to be freed by the latter.
@@ -772,7 +791,7 @@  static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
      *       the page_info pointer to NULL, meaning the latter will do
      *       nothing.
      */
-    hvm_ioreq_server_unmap_pages(s);
+    arch_ioreq_server_unmap_pages(s);
     hvm_ioreq_server_free_pages(s);
 
     hvm_ioreq_server_free_rangesets(s);
@@ -836,6 +855,12 @@  int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
     return rc;
 }
 
+/* Called when target domain is paused */
+void arch_ioreq_server_destroy(struct hvm_ioreq_server *s)
+{
+    p2m_set_ioreq_server(s->target, 0, s);
+}
+
 int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 {
     struct hvm_ioreq_server *s;
@@ -855,7 +880,7 @@  int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
     domain_pause(d);
 
-    p2m_set_ioreq_server(d, 0, s);
+    arch_ioreq_server_destroy(s);
 
     hvm_ioreq_server_disable(s);
 
@@ -900,7 +925,7 @@  int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
 
     if ( ioreq_gfn || bufioreq_gfn )
     {
-        rc = hvm_ioreq_server_map_pages(s);
+        rc = arch_ioreq_server_map_pages(s);
         if ( rc )
             goto out;
     }
@@ -1080,6 +1105,24 @@  int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
+/* Called with ioreq_server lock held */
+int arch_ioreq_server_map_mem_type(struct domain *d,
+                                   struct hvm_ioreq_server *s,
+                                   uint32_t flags)
+{
+    int rc = p2m_set_ioreq_server(d, flags, s);
+
+    if ( rc == 0 && flags == 0 )
+    {
+        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        if ( read_atomic(&p2m->ioreq.entry_count) )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
+    }
+
+    return rc;
+}
+
 /*
  * Map or unmap an ioreq server to specific memory type. For now, only
  * HVMMEM_ioreq_server is supported, and in the future new types can be
@@ -1112,19 +1155,11 @@  int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     if ( s->emulator != current->domain )
         goto out;
 
-    rc = p2m_set_ioreq_server(d, flags, s);
+    rc = arch_ioreq_server_map_mem_type(d, s, flags);
 
  out:
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 
-    if ( rc == 0 && flags == 0 )
-    {
-        struct p2m_domain *p2m = p2m_get_hostp2m(d);
-
-        if ( read_atomic(&p2m->ioreq.entry_count) )
-            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
-    }
-
     return rc;
 }
 
@@ -1210,12 +1245,17 @@  void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 }
 
+bool arch_ioreq_server_destroy_all(struct domain *d)
+{
+    return relocate_portio_handler(d, 0xcf8, 0xcf8, 4);
+}
+
 void hvm_destroy_all_ioreq_servers(struct domain *d)
 {
     struct hvm_ioreq_server *s;
     unsigned int id;
 
-    if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) )
+    if ( !arch_ioreq_server_destroy_all(d) )
         return;
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -1239,33 +1279,28 @@  void hvm_destroy_all_ioreq_servers(struct domain *d)
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 }
 
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p)
+int arch_ioreq_server_get_type_addr(const struct domain *d,
+                                    const ioreq_t *p,
+                                    uint8_t *type,
+                                    uint64_t *addr)
 {
-    struct hvm_ioreq_server *s;
-    uint32_t cf8;
-    uint8_t type;
-    uint64_t addr;
-    unsigned int id;
+    unsigned int cf8 = d->arch.hvm.pci_cf8;
 
     if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
-        return NULL;
-
-    cf8 = d->arch.hvm.pci_cf8;
+        return -EINVAL;
 
     if ( p->type == IOREQ_TYPE_PIO &&
          (p->addr & ~3) == 0xcfc &&
          CF8_ENABLED(cf8) )
     {
-        uint32_t x86_fam;
+        unsigned int x86_fam, reg;
         pci_sbdf_t sbdf;
-        unsigned int reg;
 
         reg = hvm_pci_decode_addr(cf8, p->addr, &sbdf);
 
         /* PCI config data cycle */
-        type = XEN_DMOP_IO_RANGE_PCI;
-        addr = ((uint64_t)sbdf.sbdf << 32) | reg;
+        *type = XEN_DMOP_IO_RANGE_PCI;
+        *addr = ((uint64_t)sbdf.sbdf << 32) | reg;
         /* AMD extended configuration space access? */
         if ( CF8_ADDR_HI(cf8) &&
              d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
@@ -1277,16 +1312,30 @@  struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
 
             if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
                  (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
-                addr |= CF8_ADDR_HI(cf8);
+                *addr |= CF8_ADDR_HI(cf8);
         }
     }
     else
     {
-        type = (p->type == IOREQ_TYPE_PIO) ?
-                XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
-        addr = p->addr;
+        *type = (p->type == IOREQ_TYPE_PIO) ?
+                 XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
+        *addr = p->addr;
     }
 
+    return 0;
+}
+
+struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
+                                                 ioreq_t *p)
+{
+    struct hvm_ioreq_server *s;
+    uint8_t type;
+    uint64_t addr;
+    unsigned int id;
+
+    if ( arch_ioreq_server_get_type_addr(d, p, &type, &addr) )
+        return NULL;
+
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct rangeset *r;
@@ -1515,11 +1564,16 @@  static int hvm_access_cf8(
     return X86EMUL_UNHANDLEABLE;
 }
 
+void arch_ioreq_domain_init(struct domain *d)
+{
+    register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
+}
+
 void hvm_ioreq_init(struct domain *d)
 {
     spin_lock_init(&d->arch.hvm.ioreq_server.lock);
 
-    register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
+    arch_ioreq_domain_init(d);
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index e2588e9..cc79285 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -19,6 +19,25 @@ 
 #ifndef __ASM_X86_HVM_IOREQ_H__
 #define __ASM_X86_HVM_IOREQ_H__
 
+#define HANDLE_BUFIOREQ(s) \
+    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
+
+bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
+int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
+void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
+void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
+void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
+void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
+int arch_ioreq_server_map_mem_type(struct domain *d,
+                                   struct hvm_ioreq_server *s,
+                                   uint32_t flags);
+bool arch_ioreq_server_destroy_all(struct domain *d);
+int arch_ioreq_server_get_type_addr(const struct domain *d,
+                                    const ioreq_t *p,
+                                    uint8_t *type,
+                                    uint64_t *addr);
+void arch_ioreq_domain_init(struct domain *d);
+
 bool hvm_io_pending(struct vcpu *v);
 bool handle_hvm_io_completion(struct vcpu *v);
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page);