diff mbox series

[7/9] vpci/header: program p2m with guest BAR view

Message ID 20210903100831.177748-8-andr2000@gmail.com (mailing list archive)
State Superseded
Headers show
Series PCI devices passthrough on Arm, part 3 | expand

Commit Message

Oleksandr Andrushchenko Sept. 3, 2021, 10:08 a.m. UTC
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Take into account guest's BAR view and program its p2m accordingly:
gfn is guest's view of the BAR and mfn is the physical BAR value as set
up by the host bridge in the hardware domain.
This way hardware doamin sees physical BAR values and guest sees
emulated ones.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
 xen/drivers/vpci/header.c | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

Comments

Jan Beulich Sept. 6, 2021, 2:51 p.m. UTC | #1
On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
> @@ -37,12 +41,28 @@ static int map_range(unsigned long s, unsigned long e, void *data,
>                       unsigned long *c)
>  {
>      const struct map_data *map = data;
> +    gfn_t start_gfn;
>      int rc;
>  
>      for ( ; ; )
>      {
>          unsigned long size = e - s + 1;
>  
> +        /*
> +         * Any BAR may have holes in its memory we want to map, e.g.
> +         * we don't want to map MSI regions which may be a part of that BAR,
> +         * e.g. when a single BAR is used for both MMIO and MSI.
> +         * In this case MSI regions are subtracted from the mapping, but
> +         * map->start_gfn still points to the very beginning of the BAR.
> +         * So if there is a hole present then we need to adjust start_gfn
> +         * to reflect the fact of that substraction.
> +         */
> +        start_gfn = gfn_add(map->start_gfn, s - mfn_x(map->start_mfn));

I may be missing something, but don't you need to adjust "size" then
as well? And don't you need to account for the "hole" not being at
the start? (As an aside - do you mean "MSI-X regions" everywhere you
say just "MSI" above?)

Jan
Oleksandr Andrushchenko Sept. 9, 2021, 6:13 a.m. UTC | #2
On 06.09.21 17:51, Jan Beulich wrote:
> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>> @@ -37,12 +41,28 @@ static int map_range(unsigned long s, unsigned long e, void *data,
>>                        unsigned long *c)
>>   {
>>       const struct map_data *map = data;
>> +    gfn_t start_gfn;
>>       int rc;
>>   
>>       for ( ; ; )
>>       {
>>           unsigned long size = e - s + 1;
>>   
>> +        /*
>> +         * Any BAR may have holes in its memory we want to map, e.g.
>> +         * we don't want to map MSI regions which may be a part of that BAR,
>> +         * e.g. when a single BAR is used for both MMIO and MSI.
>> +         * In this case MSI regions are subtracted from the mapping, but
>> +         * map->start_gfn still points to the very beginning of the BAR.
>> +         * So if there is a hole present then we need to adjust start_gfn
>> +         * to reflect the fact of that substraction.
>> +         */
>> +        start_gfn = gfn_add(map->start_gfn, s - mfn_x(map->start_mfn));
> I may be missing something, but don't you need to adjust "size" then
> as well?

No, as range sets get consumed we have e and s updated accordingly,

so each time size represents the right value.

>   And don't you need to account for the "hole" not being at
> the start?

We only have MMIO ranges here and all the ranges have their start set

appropriately

>   (As an aside - do you mean "MSI-X regions" everywhere you
> say just "MSI" above?)
Yes, I mean MSI-X: will update
>
> Jan
>
Jan Beulich Sept. 9, 2021, 8:26 a.m. UTC | #3
On 09.09.2021 08:13, Oleksandr Andrushchenko wrote:
> 
> On 06.09.21 17:51, Jan Beulich wrote:
>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>> @@ -37,12 +41,28 @@ static int map_range(unsigned long s, unsigned long e, void *data,
>>>                        unsigned long *c)
>>>   {
>>>       const struct map_data *map = data;
>>> +    gfn_t start_gfn;
>>>       int rc;
>>>   
>>>       for ( ; ; )
>>>       {
>>>           unsigned long size = e - s + 1;
>>>   
>>> +        /*
>>> +         * Any BAR may have holes in its memory we want to map, e.g.
>>> +         * we don't want to map MSI regions which may be a part of that BAR,
>>> +         * e.g. when a single BAR is used for both MMIO and MSI.
>>> +         * In this case MSI regions are subtracted from the mapping, but
>>> +         * map->start_gfn still points to the very beginning of the BAR.
>>> +         * So if there is a hole present then we need to adjust start_gfn
>>> +         * to reflect the fact of that substraction.
>>> +         */
>>> +        start_gfn = gfn_add(map->start_gfn, s - mfn_x(map->start_mfn));
>> I may be missing something, but don't you need to adjust "size" then
>> as well?
> 
> No, as range sets get consumed we have e and s updated accordingly,
> so each time size represents the right value.

It feels like something's wrong with the rangeset construction then:
Either it represents _all_ holes (including degenerate ones at the
start of end of a range), or none of them.

Jan
Oleksandr Andrushchenko Sept. 9, 2021, 9:16 a.m. UTC | #4
On 09.09.21 11:26, Jan Beulich wrote:
> On 09.09.2021 08:13, Oleksandr Andrushchenko wrote:
>> On 06.09.21 17:51, Jan Beulich wrote:
>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>> @@ -37,12 +41,28 @@ static int map_range(unsigned long s, unsigned long e, void *data,
>>>>                         unsigned long *c)
>>>>    {
>>>>        const struct map_data *map = data;
>>>> +    gfn_t start_gfn;
>>>>        int rc;
>>>>    
>>>>        for ( ; ; )
>>>>        {
>>>>            unsigned long size = e - s + 1;
>>>>    
>>>> +        /*
>>>> +         * Any BAR may have holes in its memory we want to map, e.g.
>>>> +         * we don't want to map MSI regions which may be a part of that BAR,
>>>> +         * e.g. when a single BAR is used for both MMIO and MSI.
>>>> +         * In this case MSI regions are subtracted from the mapping, but
>>>> +         * map->start_gfn still points to the very beginning of the BAR.
>>>> +         * So if there is a hole present then we need to adjust start_gfn
>>>> +         * to reflect the fact of that substraction.
>>>> +         */
>>>> +        start_gfn = gfn_add(map->start_gfn, s - mfn_x(map->start_mfn));
>>> I may be missing something, but don't you need to adjust "size" then
>>> as well?
>> No, as range sets get consumed we have e and s updated accordingly,
>> so each time size represents the right value.
> It feels like something's wrong with the rangeset construction then:
> Either it represents _all_ holes (including degenerate ones at the
> start of end of a range), or none of them.

The resulting range set only has the MMIOs in it. While constructing the range set

we cut off MSI-X out of it (make holes). But finally it only has the ranges that we

need to map/unmap.

>
> Jan
>
Thank you,

Oleksandr
Jan Beulich Sept. 9, 2021, 9:40 a.m. UTC | #5
On 09.09.2021 11:16, Oleksandr Andrushchenko wrote:
> 
> On 09.09.21 11:26, Jan Beulich wrote:
>> On 09.09.2021 08:13, Oleksandr Andrushchenko wrote:
>>> On 06.09.21 17:51, Jan Beulich wrote:
>>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>>> @@ -37,12 +41,28 @@ static int map_range(unsigned long s, unsigned long e, void *data,
>>>>>                         unsigned long *c)
>>>>>    {
>>>>>        const struct map_data *map = data;
>>>>> +    gfn_t start_gfn;
>>>>>        int rc;
>>>>>    
>>>>>        for ( ; ; )
>>>>>        {
>>>>>            unsigned long size = e - s + 1;
>>>>>    
>>>>> +        /*
>>>>> +         * Any BAR may have holes in its memory we want to map, e.g.
>>>>> +         * we don't want to map MSI regions which may be a part of that BAR,
>>>>> +         * e.g. when a single BAR is used for both MMIO and MSI.
>>>>> +         * In this case MSI regions are subtracted from the mapping, but
>>>>> +         * map->start_gfn still points to the very beginning of the BAR.
>>>>> +         * So if there is a hole present then we need to adjust start_gfn
>>>>> +         * to reflect the fact of that substraction.
>>>>> +         */
>>>>> +        start_gfn = gfn_add(map->start_gfn, s - mfn_x(map->start_mfn));
>>>> I may be missing something, but don't you need to adjust "size" then
>>>> as well?
>>> No, as range sets get consumed we have e and s updated accordingly,
>>> so each time size represents the right value.
>> It feels like something's wrong with the rangeset construction then:
>> Either it represents _all_ holes (including degenerate ones at the
>> start of end of a range), or none of them.
> 
> The resulting range set only has the MMIOs in it. While constructing the range set
> we cut off MSI-X out of it (make holes). But finally it only has the ranges that we
> need to map/unmap.

And then why is there a need to adjust start_gfn?

Jan
Oleksandr Andrushchenko Sept. 9, 2021, 9:53 a.m. UTC | #6
On 09.09.21 12:40, Jan Beulich wrote:
> On 09.09.2021 11:16, Oleksandr Andrushchenko wrote:
>> On 09.09.21 11:26, Jan Beulich wrote:
>>> On 09.09.2021 08:13, Oleksandr Andrushchenko wrote:
>>>> On 06.09.21 17:51, Jan Beulich wrote:
>>>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>>>> @@ -37,12 +41,28 @@ static int map_range(unsigned long s, unsigned long e, void *data,
>>>>>>                          unsigned long *c)
>>>>>>     {
>>>>>>         const struct map_data *map = data;
>>>>>> +    gfn_t start_gfn;
>>>>>>         int rc;
>>>>>>     
>>>>>>         for ( ; ; )
>>>>>>         {
>>>>>>             unsigned long size = e - s + 1;
>>>>>>     
>>>>>> +        /*
>>>>>> +         * Any BAR may have holes in its memory we want to map, e.g.
>>>>>> +         * we don't want to map MSI regions which may be a part of that BAR,
>>>>>> +         * e.g. when a single BAR is used for both MMIO and MSI.
>>>>>> +         * In this case MSI regions are subtracted from the mapping, but
>>>>>> +         * map->start_gfn still points to the very beginning of the BAR.
>>>>>> +         * So if there is a hole present then we need to adjust start_gfn
>>>>>> +         * to reflect the fact of that substraction.
>>>>>> +         */
>>>>>> +        start_gfn = gfn_add(map->start_gfn, s - mfn_x(map->start_mfn));
>>>>> I may be missing something, but don't you need to adjust "size" then
>>>>> as well?
>>>> No, as range sets get consumed we have e and s updated accordingly,
>>>> so each time size represents the right value.
>>> It feels like something's wrong with the rangeset construction then:
>>> Either it represents _all_ holes (including degenerate ones at the
>>> start of end of a range), or none of them.
>> The resulting range set only has the MMIOs in it. While constructing the range set
>> we cut off MSI-X out of it (make holes). But finally it only has the ranges that we
>> need to map/unmap.
> And then why is there a need to adjust start_gfn?

Because of the holes: the range set's private data can only hold BARs start MFN

and start GFN. It doesn't have a list of start_{mfn|gfn} which describe each range,

but only the start_{mfn|gfn} of the whole range set, e.g. where the BAR starts

So, because of the holes we need to adjust the starting addresses:

0. MMIO0 <- we pass start_mfn and start_gfn pointing to the BAR start

1. MSI-X <- hole

2. MMIO1 <- need to adjust start_mfn and start_gfn with respect to the hole above

>
> Jan
>
Thank you,

Oleksandr
diff mbox series

Patch

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 7f54199a3894..7416ef1e1e06 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -30,6 +30,10 @@ 
 
 struct map_data {
     struct domain *d;
+    /* Start address of the BAR as seen by the guest. */
+    gfn_t start_gfn;
+    /* Physical start address of the BAR. */
+    mfn_t start_mfn;
     bool map;
 };
 
@@ -37,12 +41,28 @@  static int map_range(unsigned long s, unsigned long e, void *data,
                      unsigned long *c)
 {
     const struct map_data *map = data;
+    gfn_t start_gfn;
     int rc;
 
     for ( ; ; )
     {
         unsigned long size = e - s + 1;
 
+        /*
+         * Any BAR may have holes in its memory we want to map, e.g.
+         * we don't want to map MSI regions which may be a part of that BAR,
+         * e.g. when a single BAR is used for both MMIO and MSI.
+         * In this case MSI regions are subtracted from the mapping, but
+         * map->start_gfn still points to the very beginning of the BAR.
+         * So if there is a hole present then we need to adjust start_gfn
+         * to reflect the fact of that substraction.
+         */
+        start_gfn = gfn_add(map->start_gfn, s - mfn_x(map->start_mfn));
+
+        printk(XENLOG_G_DEBUG
+               "%smap [%lx, %lx] -> %#"PRI_gfn" for d%d\n",
+               map->map ? "" : "un", s, e, gfn_x(start_gfn),
+               map->d->domain_id);
         /*
          * ARM TODOs:
          * - On ARM whether the memory is prefetchable or not should be passed
@@ -52,8 +72,10 @@  static int map_range(unsigned long s, unsigned long e, void *data,
          * - {un}map_mmio_regions doesn't support preemption.
          */
 
-        rc = map->map ? map_mmio_regions(map->d, _gfn(s), size, _mfn(s))
-                      : unmap_mmio_regions(map->d, _gfn(s), size, _mfn(s));
+        rc = map->map ? map_mmio_regions(map->d, start_gfn,
+                                         size, _mfn(s))
+                      : unmap_mmio_regions(map->d, start_gfn,
+                                           size, _mfn(s));
         if ( rc == 0 )
         {
             *c += size;
@@ -69,6 +91,7 @@  static int map_range(unsigned long s, unsigned long e, void *data,
         ASSERT(rc < size);
         *c += rc;
         s += rc;
+        gfn_add(map->start_gfn, rc);
         if ( general_preempt_check() )
                 return -ERESTART;
     }
@@ -149,6 +172,10 @@  bool vpci_process_pending(struct vcpu *v)
             if ( !bar->mem )
                 continue;
 
+            data.start_gfn = is_hardware_domain(v->vpci.pdev->domain) ?
+                _gfn(PFN_DOWN(bar->addr)) :
+                _gfn(PFN_DOWN(bar->guest_addr));
+            data.start_mfn = _mfn(PFN_DOWN(bar->addr));
             rc = rangeset_consume_ranges(bar->mem, map_range, &data);
 
             if ( rc == -ERESTART )
@@ -194,6 +221,10 @@  static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
         if ( !bar->mem )
             continue;
 
+        data.start_gfn = is_hardware_domain(d) ?
+            _gfn(PFN_DOWN(bar->addr)) :
+            _gfn(PFN_DOWN(bar->guest_addr));
+        data.start_mfn = _mfn(PFN_DOWN(bar->addr));
         while ( (rc = rangeset_consume_ranges(bar->mem, map_range,
                                               &data)) == -ERESTART )
             process_pending_softirqs();