diff mbox series

[v2] xen/arm: fix gnttab_need_iommu_mapping

Message ID 20210208184932.23468-1-sstabellini@kernel.org (mailing list archive)
State New, archived
Headers show
Series [v2] xen/arm: fix gnttab_need_iommu_mapping | expand

Commit Message

Stefano Stabellini Feb. 8, 2021, 6:49 p.m. UTC
Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
The offending chunk is:

 #define gnttab_need_iommu_mapping(d)                    \
-    (is_domain_direct_mapped(d) && need_iommu(d))
+    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))

On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
directly mapped and IOMMU is enabled for the domain, like the old check
did, but the new check is always false.

In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
need_sync is set as:

    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
        hd->need_sync = !iommu_use_hap_pt(d);

iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
P2M. It is true on ARM. need_sync means that you have a separate IOMMU
page-table and it needs to be updated for every change. need_sync is set
to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
which is wrong.

As a consequence, when using PV network from a domU on a system where
IOMMU is on from Dom0, I get:

(XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
[   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK

The fix is to go back to something along the lines of the old
implementation of gnttab_need_iommu_mapping.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Fixes: 91d4eca7add
Backport: 4.12+

---

Given the severity of the bug, I would like to request this patch to be
backported to 4.12 too, even if 4.12 is security-fixes only since Oct
2020.

For the 4.12 backport, we can use iommu_enabled() instead of
is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.

Changes in v2:
- improve commit message
- add is_iommu_enabled(d) to the check
---
 xen/include/asm-arm/grant_table.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Julien Grall Feb. 8, 2021, 8:02 p.m. UTC | #1
(+ Jan and Ian for RM/stable decision)

Hi Jan,

On 08/02/2021 18:49, Stefano Stabellini wrote:
> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> The offending chunk is:
> 
>   #define gnttab_need_iommu_mapping(d)                    \
> -    (is_domain_direct_mapped(d) && need_iommu(d))
> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> 
> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
> directly mapped and IOMMU is enabled for the domain, like the old check
> did, but the new check is always false.
> 
> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
> need_sync is set as:
> 
>      if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>          hd->need_sync = !iommu_use_hap_pt(d);
> 
> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
> page-table and it needs to be updated for every change. need_sync is set
> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
> which is wrong.
> 
> As a consequence, when using PV network from a domU on a system where
> IOMMU is on from Dom0, I get:
> 
> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> 
> The fix is to go back to something along the lines of the old
> implementation of gnttab_need_iommu_mapping.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> Fixes: 91d4eca7add

The format for fixes tag is:

Fixes: sha ("<commit title>")

For staging fix:

Reviewed-by: Julien Grall <jgrall@amazon.com>

@Ian, I think this wants to go in 4.15. Without it, Xen may receive an 
IOMMU fault for DMA transaction using granted page.

> Backport: 4.12+
> 
> ---
> 
> Given the severity of the bug, I would like to request this patch to be
> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> 2020.

I would agree that the bug is bad, but it is not clear to me why this 
would be warrant for an exception for backporting. Can you outline 
what's the worse that can happen?

Correct me if I am wrong, if one can hit this error, then it should be 
pretty reliable. Therefore, anyone wanted to use 4.12 in production 
should have seen if the error on there setup by now (4.12 has been out 
for nearly two years). If not, then they are most likely not affected.

Any new users of Xen should use the latest stable rather than starting 
with an old version.

Other than the seriousness of the bug, I think there is also a fairness 
concern.

So far our rules says there is only security support backport allowed. 
If we start granting exception, then we need a way to prevent abuse of 
it. To take an extreme example, why one couldn't ask backport for 4.2?

That said, I vaguely remember this topic was brought up a few time on 
security@. So maybe it is time to have a new discussion about stable tree.

Cheers,
Stefano Stabellini Feb. 8, 2021, 8:24 p.m. UTC | #2
On Mon, 8 Feb 2021, Julien Grall wrote:
> (+ Jan and Ian for RM/stable decision)
> 
> Hi Jan,
> 
> On 08/02/2021 18:49, Stefano Stabellini wrote:
> > Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> > The offending chunk is:
> > 
> >   #define gnttab_need_iommu_mapping(d)                    \
> > -    (is_domain_direct_mapped(d) && need_iommu(d))
> > +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > 
> > On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
> > directly mapped and IOMMU is enabled for the domain, like the old check
> > did, but the new check is always false.
> > 
> > In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
> > need_sync is set as:
> > 
> >      if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> >          hd->need_sync = !iommu_use_hap_pt(d);
> > 
> > iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
> > P2M. It is true on ARM. need_sync means that you have a separate IOMMU
> > page-table and it needs to be updated for every change. need_sync is set
> > to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
> > which is wrong.
> > 
> > As a consequence, when using PV network from a domU on a system where
> > IOMMU is on from Dom0, I get:
> > 
> > (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402,
> > iova=0x8424cb148, fsynr=0xb0001, cb=0
> > [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> > 
> > The fix is to go back to something along the lines of the old
> > implementation of gnttab_need_iommu_mapping.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > Fixes: 91d4eca7add
> 
> The format for fixes tag is:
> 
> Fixes: sha ("<commit title>")

Oops. Can be fixed on commit.


> For staging fix:
> 
> Reviewed-by: Julien Grall <jgrall@amazon.com>

Thank you!


> @Ian, I think this wants to go in 4.15. Without it, Xen may receive an IOMMU
> fault for DMA transaction using granted page.
> 
> > Backport: 4.12+
> > 
> > ---
> > 
> > Given the severity of the bug, I would like to request this patch to be
> > backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> > 2020.
> 
> I would agree that the bug is bad, but it is not clear to me why this would be
> warrant for an exception for backporting. Can you outline what's the worse
> that can happen?
> 
> Correct me if I am wrong, if one can hit this error, then it should be pretty
> reliable. Therefore, anyone wanted to use 4.12 in production should have seen
> if the error on there setup by now (4.12 has been out for nearly two years).
> If not, then they are most likely not affected.
> 
> Any new users of Xen should use the latest stable rather than starting with an
> old version.

Yes, the bug reproduces reliably but it takes more than a smoke test to
find it. That's why it wasn't found by OSSTest and also our internal
CI-loop at Xilinx.

Users can be very slow at upgrading, so I am worried that 4.12 might still
be picked by somebody, especially given that it is still security
supported for a while.


> Other than the seriousness of the bug, I think there is also a fairness
> concern.
> 
> So far our rules says there is only security support backport allowed. If we
> start granting exception, then we need a way to prevent abuse of it. To take
> an extreme example, why one couldn't ask backport for 4.2?
> 
> That said, I vaguely remember this topic was brought up a few time on
> security@. So maybe it is time to have a new discussion about stable tree.

I wouldn't consider a backport for a tree that is closed even for
security backports. So in your example, I'd say no to a backport to 4.2
or 4.10.

I think there is a valid question for trees that are still open to
security fixes but not general backports.

For these cases, I would just follow a simple rule of thumb:
- is the submitter willing to provide the backport?
- is the backport low-risk?
- is the underlying bug important?

If the answer to all is "yes" then I'd go with it.


In this case, given that the fix is a one-liner, and obviously correct,
I think it is worth considering.
Julien Grall Feb. 8, 2021, 9:23 p.m. UTC | #3
On Mon, 8 Feb 2021 at 20:24, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > @Ian, I think this wants to go in 4.15. Without it, Xen may receive an IOMMU
> > fault for DMA transaction using granted page.
> >
> > > Backport: 4.12+
> > >
> > > ---
> > >
> > > Given the severity of the bug, I would like to request this patch to be
> > > backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> > > 2020.
> >
> > I would agree that the bug is bad, but it is not clear to me why this would be
> > warrant for an exception for backporting. Can you outline what's the worse
> > that can happen?
> >
> > Correct me if I am wrong, if one can hit this error, then it should be pretty
> > reliable. Therefore, anyone wanted to use 4.12 in production should have seen
> > if the error on there setup by now (4.12 has been out for nearly two years).
> > If not, then they are most likely not affected.
> >
> > Any new users of Xen should use the latest stable rather than starting with an
> > old version.
>
> Yes, the bug reproduces reliably but it takes more than a smoke test to
> find it. That's why it wasn't found by OSSTest and also our internal
> CI-loop at Xilinx.

Ok. So a user should be able to catch it during testing, is that correct?

>
> Users can be very slow at upgrading, so I am worried that 4.12 might still
> be picked by somebody, especially given that it is still security
> supported for a while.

Don't tell me about upgrading Xen... ;) But I am a bit confused, are
you worried about existing users or new users?

>
> > Other than the seriousness of the bug, I think there is also a fairness
> > concern.
> >
> > So far our rules says there is only security support backport allowed. If we
> > start granting exception, then we need a way to prevent abuse of it. To take
> > an extreme example, why one couldn't ask backport for 4.2?
> >
> > That said, I vaguely remember this topic was brought up a few time on
> > security@. So maybe it is time to have a new discussion about stable tree.
>
> I wouldn't consider a backport for a tree that is closed even for
> security backports. So in your example, I'd say no to a backport to 4.2
> or 4.10.
>
> I think there is a valid question for trees that are still open to
> security fixes but not general backports.
>
> For these cases, I would just follow a simple rule of thumb:

Aren't those rules already used for stable trees?

> - is the submitter willing to provide the backport?
> - is the backport low-risk?
> - is the underlying bug important?

You wrote multiple times that this is serious but it is still not
clear what's the worse that can happen...

>
> If the answer to all is "yes" then I'd go with it.
>
>
> In this case, given that the fix is a one-liner, and obviously correct,

I have seen one-liners that introduced XSA in the past ;).

Cheers,
Stefano Stabellini Feb. 9, 2021, 1:57 a.m. UTC | #4
On Mon, 8 Feb 2021, Julien Grall wrote:
> On Mon, 8 Feb 2021 at 20:24, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > > @Ian, I think this wants to go in 4.15. Without it, Xen may receive an IOMMU
> > > fault for DMA transaction using granted page.
> > >
> > > > Backport: 4.12+
> > > >
> > > > ---
> > > >
> > > > Given the severity of the bug, I would like to request this patch to be
> > > > backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> > > > 2020.
> > >
> > > I would agree that the bug is bad, but it is not clear to me why this would be
> > > warrant for an exception for backporting. Can you outline what's the worse
> > > that can happen?
> > >
> > > Correct me if I am wrong, if one can hit this error, then it should be pretty
> > > reliable. Therefore, anyone wanted to use 4.12 in production should have seen
> > > if the error on there setup by now (4.12 has been out for nearly two years).
> > > If not, then they are most likely not affected.
> > >
> > > Any new users of Xen should use the latest stable rather than starting with an
> > > old version.
> >
> > Yes, the bug reproduces reliably but it takes more than a smoke test to
> > find it. That's why it wasn't found by OSSTest and also our internal
> > CI-loop at Xilinx.
> 
> Ok. So a user should be able to catch it during testing, is that correct?

Yes, probably. The failure is that PV drivers do not work (they trigger
the IOMMU fault), specifically PV network and block, maybe others too.

I think it is unlikely but possible that an hardware update would also
trigger the bug. For instance, a change of the network card might
trigger the bug, if the previous network card driver was always bouncing
requests on bounce buffers, while the new drivers uses the provided
memory pages directly. I don't know how realistic this scenario is.


> > Users can be very slow at upgrading, so I am worried that 4.12 might still
> > be picked by somebody, especially given that it is still security
> > supported for a while.
> 
> Don't tell me about upgrading Xen... ;) But I am a bit confused, are
> you worried about existing users or new users?

I am mostly worried about people that start using 4.12.

If a user was already on 4.12 and not seeing any errors, they are
unlikely to see this error. It would only happen if:
- they didn't use PV drivers before, and they want to start using PV
  drivers now
- they are upgrading hardware (not sure how likely to happen, see above)


> > > Other than the seriousness of the bug, I think there is also a fairness
> > > concern.
> > >
> > > So far our rules says there is only security support backport allowed. If we
> > > start granting exception, then we need a way to prevent abuse of it. To take
> > > an extreme example, why one couldn't ask backport for 4.2?
> > >
> > > That said, I vaguely remember this topic was brought up a few time on
> > > security@. So maybe it is time to have a new discussion about stable tree.
> >
> > I wouldn't consider a backport for a tree that is closed even for
> > security backports. So in your example, I'd say no to a backport to 4.2
> > or 4.10.
> >
> > I think there is a valid question for trees that are still open to
> > security fixes but not general backports.
> >
> > For these cases, I would just follow a simple rule of thumb:
> 
> Aren't those rules already used for stable trees?

No, I don't think so. Backports are done by Jan and me, not by the
submitter (in this case I am the submitter but it is a coincidence :-)
If a commit is fixing a genuine bug and the backport doesn't cause
issues, then it is typically done. Here the bar should be certainly
higher, both in terms of low-risk, and importance of the bug to fix.



> > - is the submitter willing to provide the backport?
> > - is the backport low-risk?
> > - is the underlying bug important?
> 
> You wrote multiple times that this is serious but it is still not
> clear what's the worse that can happen...

PV drivers don't work: each data transfer involving granted pages causes
an IOMMU fault.


> > If the answer to all is "yes" then I'd go with it.
> >
> >
> > In this case, given that the fix is a one-liner, and obviously correct,
> 
> I have seen one-liners that introduced XSA in the past ;).
 
Sure but this is a revert to the pre-4.12 implementation.
Julien Grall Feb. 9, 2021, 8:44 a.m. UTC | #5
Hi Stefano,

On 09/02/2021 01:57, Stefano Stabellini wrote:
> On Mon, 8 Feb 2021, Julien Grall wrote:
>> On Mon, 8 Feb 2021 at 20:24, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>> @Ian, I think this wants to go in 4.15. Without it, Xen may receive an IOMMU
>>>> fault for DMA transaction using granted page.
>>>>
>>>>> Backport: 4.12+
>>>>>
>>>>> ---
>>>>>
>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>> 2020.
>>>>
>>>> I would agree that the bug is bad, but it is not clear to me why this would be
>>>> warrant for an exception for backporting. Can you outline what's the worse
>>>> that can happen?
>>>>
>>>> Correct me if I am wrong, if one can hit this error, then it should be pretty
>>>> reliable. Therefore, anyone wanted to use 4.12 in production should have seen
>>>> if the error on there setup by now (4.12 has been out for nearly two years).
>>>> If not, then they are most likely not affected.
>>>>
>>>> Any new users of Xen should use the latest stable rather than starting with an
>>>> old version.
>>>
>>> Yes, the bug reproduces reliably but it takes more than a smoke test to
>>> find it. That's why it wasn't found by OSSTest and also our internal
>>> CI-loop at Xilinx.
>>
>> Ok. So a user should be able to catch it during testing, is that correct?
> 
> Yes, probably. The failure is that PV drivers do not work (they trigger
> the IOMMU fault), specifically PV network and block, maybe others too.
> 
> I think it is unlikely but possible that an hardware update would also
> trigger the bug. For instance, a change of the network card might
> trigger the bug, if the previous network card driver was always bouncing
> requests on bounce buffers, while the new drivers uses the provided
> memory pages directly. I don't know how realistic this scenario is.
> 
> 
>>> Users can be very slow at upgrading, so I am worried that 4.12 might still
>>> be picked by somebody, especially given that it is still security
>>> supported for a while.
>>
>> Don't tell me about upgrading Xen... ;) But I am a bit confused, are
>> you worried about existing users or new users?
> 
> I am mostly worried about people that start using 4.12.

I think it would be a big mistake for anyone to start using 4.12 now. I 
can already cite a few bugs (including in the SMMU driver) that haven't 
been backport to 4.12 . This is only going to be worse as it is not 
stable anymore.

It is also not clear why someone would decide to use 4.12 when 4.13/4.14 
are still supported and will also come with an extra 1 year and half 
security support.

> 
> If a user was already on 4.12 and not seeing any errors, they are
> unlikely to see this error. It would only happen if:
> - they didn't use PV drivers before, and they want to start using PV
>    drivers now
> - they are upgrading hardware (not sure how likely to happen, see above)

Right, if you decide to switch device or upgrade HW, then you may also 
face other issues either in Xen or Linux.

Once a tree is out of support, we make no promise that it will work on 
new setup (including dom0 software). We only promise that it will 
continue to work on existing setup and we will address security issue.

>>> - is the submitter willing to provide the backport?
>>> - is the backport low-risk?
>>> - is the underlying bug important?
>>
>> You wrote multiple times that this is serious but it is still not
>> clear what's the worse that can happen...
> 
> PV drivers don't work: each data transfer involving granted pages causes
> an IOMMU fault.
Based on all the information you provided, this is not a fix I would 
recommend to backport to 4.12 because it is only impacting new/upgraded 
system (software or HW).

Cheers,
Jan Beulich Feb. 9, 2021, 10:02 a.m. UTC | #6
On 08.02.2021 21:24, Stefano Stabellini wrote:
> On Mon, 8 Feb 2021, Julien Grall wrote:
>> On 08/02/2021 18:49, Stefano Stabellini wrote:
>>> Given the severity of the bug, I would like to request this patch to be
>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>> 2020.
>>
>> I would agree that the bug is bad, but it is not clear to me why this would be
>> warrant for an exception for backporting. Can you outline what's the worse
>> that can happen?
>>
>> Correct me if I am wrong, if one can hit this error, then it should be pretty
>> reliable. Therefore, anyone wanted to use 4.12 in production should have seen
>> if the error on there setup by now (4.12 has been out for nearly two years).
>> If not, then they are most likely not affected.
>>
>> Any new users of Xen should use the latest stable rather than starting with an
>> old version.
> 
> Yes, the bug reproduces reliably but it takes more than a smoke test to
> find it. That's why it wasn't found by OSSTest and also our internal
> CI-loop at Xilinx.
> 
> Users can be very slow at upgrading, so I am worried that 4.12 might still
> be picked by somebody, especially given that it is still security
> supported for a while.
> 
> 
>> Other than the seriousness of the bug, I think there is also a fairness
>> concern.
>>
>> So far our rules says there is only security support backport allowed. If we
>> start granting exception, then we need a way to prevent abuse of it. To take
>> an extreme example, why one couldn't ask backport for 4.2?
>>
>> That said, I vaguely remember this topic was brought up a few time on
>> security@. So maybe it is time to have a new discussion about stable tree.
> 
> I wouldn't consider a backport for a tree that is closed even for
> security backports. So in your example, I'd say no to a backport to 4.2
> or 4.10.
> 
> I think there is a valid question for trees that are still open to
> security fixes but not general backports.
> 
> For these cases, I would just follow a simple rule of thumb:
> - is the submitter willing to provide the backport?
> - is the backport low-risk?
> - is the underlying bug important?
> 
> If the answer to all is "yes" then I'd go with it.

Personally I disagree, for the very simple reason of the question
going to become "Where do we draw the line?" The only non-security
backports that I consider acceptable are low-risk changes to allow
building with newer tool chains. I know other backports have
occurred in the past, and I did voice my disagreement with this
having happened.

But this is a community decision, so my opinion counts as just a
single vote.

Jan
Rahul Singh Feb. 9, 2021, 1:13 p.m. UTC | #7
Hello Stefano,

> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> The offending chunk is:
> 
> #define gnttab_need_iommu_mapping(d)                    \
> -    (is_domain_direct_mapped(d) && need_iommu(d))
> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> 
> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
> directly mapped and IOMMU is enabled for the domain, like the old check
> did, but the new check is always false.
> 
> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
> need_sync is set as:
> 
>    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>        hd->need_sync = !iommu_use_hap_pt(d);
> 
> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
> page-table and it needs to be updated for every change. need_sync is set
> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
> which is wrong.
> 
> As a consequence, when using PV network from a domU on a system where
> IOMMU is on from Dom0, I get:
> 
> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> 
> The fix is to go back to something along the lines of the old
> implementation of gnttab_need_iommu_mapping.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> Fixes: 91d4eca7add
> Backport: 4.12+
> 
> ---
> 
> Given the severity of the bug, I would like to request this patch to be
> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> 2020.
> 
> For the 4.12 backport, we can use iommu_enabled() instead of
> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
> 
> Changes in v2:
> - improve commit message
> - add is_iommu_enabled(d) to the check
> ---
> xen/include/asm-arm/grant_table.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
> index 6f585b1538..0ce77f9a1c 100644
> --- a/xen/include/asm-arm/grant_table.h
> +++ b/xen/include/asm-arm/grant_table.h
> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
> 
> #define gnttab_need_iommu_mapping(d)                    \
> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
> 
> #endif /* __ASM_GRANT_TABLE_H__ */

I tested the patch and while creating the guest I observed the below warning from Linux for block device.
https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258

I did initial debugging and found out that there are many calls to iommu_legacy_{,un}map() while creating the guest but when iommu_legacy_unmap() function unmap the pages something is written  wrong in page tables because of that when next time same page is mapped via create_grant_host_mapping() we observed below warning. 
 

[  138.639934] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
(XEN) gnttab_mark_dirty not implemented yet
[  138.659702] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
[  138.669827] vbd vbd-0-51712: 9 mapping in shared page 8 from domain 0
[  138.676636] vbd vbd-0-51712: 9 mapping ring-ref port 5
[  138.682089] ------------[ cut here ]------------
[  138.686605] WARNING: CPU: 2 PID: 37 at drivers/block/xen-blkback/xenbus.c:296 xen_blkif_disconnect+0x20c/0x230
[  138.696668] Modules linked in: bridge stp llc ipv6 nf_defrag_ipv6
[  138.702833] CPU: 2 PID: 37 Comm: xenwatch Not tainted 5.4.0-yocto-standard #1
[  138.710037] Hardware name: Arm Neoverse N1 System Development Platform (DT)
[  138.717067] pstate: 80c00005 (Nzcv daif +PAN +UAO)
[  138.721927] pc : xen_blkif_disconnect+0x20c/0x230
[  138.726701] lr : xen_blkif_disconnect+0xbc/0x230
[  138.731388] sp : ffff800011cb3c80
[  138.734773] x29: ffff800011cb3c80 x28: ffff00005b6da940 
[  138.740156] x27: 0000000000000000 x26: 0000000000000000 
[  138.745536] x25: ffff000029755060 x24: 0000000000000170 
[  138.750919] x23: ffff000029755040 x22: ffff000059c72000 
[  138.756299] x21: 0000000000000000 x20: ffff000029755000 
[  138.761681] x19: 0000000000000001 x18: 0000000000000000 
[  138.767063] x17: 0000000000000000 x16: 0000000000000000 
[  138.772444] x15: 0000000000000000 x14: 0000000000000000 
[  138.777826] x13: 0000000000000000 x12: 0000000000000000 
[  138.783207] x11: 0000000000000001 x10: 0000000000000990 
[  138.788589] x9 : 0000000000000001 x8 : 0000000000210d00 
[  138.793971] x7 : 0000000000000018 x6 : ffff00005ddf72a0 
[  138.799352] x5 : ffff800011cb3c28 x4 : 0000000000000000 
[  138.804734] x3 : ffff000029755118 x2 : 0000000000000000 
[  138.810117] x1 : ffff000029755120 x0 : 0000000000000001 
[  138.815497] Call trace:
[  138.818015]  xen_blkif_disconnect+0x20c/0x230
[  138.822442]  frontend_changed+0x1b0/0x54c
[  138.826523]  xenbus_otherend_changed+0x80/0xb0
[  138.831035]  frontend_changed+0x10/0x20
[  138.834941]  xenwatch_thread+0x80/0x144
[  138.838849]  kthread+0x118/0x120
[  138.842147]  ret_from_fork+0x10/0x18
[  138.845791] ---[ end trace fb9f0a3b3b48a55f ]—

Regards,
Rahul

> /*
> -- 
> 2.17.1
>
Ian Jackson Feb. 9, 2021, 2:03 p.m. UTC | #8
Jan Beulich writes ("Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping"):
> On 08.02.2021 21:24, Stefano Stabellini wrote:
...
> > For these cases, I would just follow a simple rule of thumb:
> > - is the submitter willing to provide the backport?
> > - is the backport low-risk?
> > - is the underlying bug important?
> > 
> > If the answer to all is "yes" then I'd go with it.
> 
> Personally I disagree, for the very simple reason of the question
> going to become "Where do we draw the line?" The only non-security
> backports that I consider acceptable are low-risk changes to allow
> building with newer tool chains. I know other backports have
> occurred in the past, and I did voice my disagreement with this
> having happened.

I think I take a more relaxed view than Jan, but still a much more
firm line than Stefano.  My opinion is that we should make exceptions
for only bugs of exceptional severity.

I don't think I have seen an argument that this bug is exceptionally
severe.

For me the fact that you can only experience this bug if you upgrade
the hardware or significantly change the configuration, means that
this isn't so serious a bug.

The downside of accepting this backport is not only the slippery
slope.  It is also shipping the risk of a mistake in it to people who
are using 4.12 and expect it to remain almost unchanged.

The alternative, which I think is reasonable, is to ask people who are
substantially changing their hardware and/or configuration, to also
upgrade to a newer hypervisor.  I don't see that this would be a
significant imposition, but maybe I am missing some reason why
upgrading to a newer hypervisor would add a lot of difficulty to an
upgrade which is already significant in other ways.

(I definitely agree with Jan that backports to fix build problems with
newer tools are serious because they are so blocking, so will
generally need an exception to the "security fixes only" rule.)

> But this is a community decision, so my opinion counts as just a
> single vote.

I think on this particular patch, with the information I have so far,
I am with Jan and Julien.

Thanks,
Ian.
Stefano Stabellini Feb. 9, 2021, 5:31 p.m. UTC | #9
On Tue, 9 Feb 2021, Ian Jackson wrote:
> Jan Beulich writes ("Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping"):
> > On 08.02.2021 21:24, Stefano Stabellini wrote:
> ...
> > > For these cases, I would just follow a simple rule of thumb:
> > > - is the submitter willing to provide the backport?
> > > - is the backport low-risk?
> > > - is the underlying bug important?
> > > 
> > > If the answer to all is "yes" then I'd go with it.
> > 
> > Personally I disagree, for the very simple reason of the question
> > going to become "Where do we draw the line?" The only non-security
> > backports that I consider acceptable are low-risk changes to allow
> > building with newer tool chains. I know other backports have
> > occurred in the past, and I did voice my disagreement with this
> > having happened.
> 
> I think I take a more relaxed view than Jan, but still a much more
> firm line than Stefano.  My opinion is that we should make exceptions
> for only bugs of exceptional severity.
> 
> I don't think I have seen an argument that this bug is exceptionally
> severe.
> 
> For me the fact that you can only experience this bug if you upgrade
> the hardware or significantly change the configuration, means that
> this isn't so serious a bug.

Yeah, I think that's really the core of this issue. If somebody is
already using 4.12 happily, there is really no reason for them to take
the fix. If somebody is about to use 4.12, then it is a severe issue.

The view of the group is that nobody should be switching to 4.12 now
because there are newer releases out there. I don't know if that is
true.

I didn't realize we had a policy or even a recommendation of always
choosing the latest among the many releases available with
security-support. I went through the website and SUPPORT.md but couldn't
find it spelled out anywhere. See:

https://xenproject.org/downloads/
https://xenproject.org/downloads/xen-project-archives/xen-project-4-12-series/
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=SUPPORT.md;h=52f25fa85af41fa3b38288ab7e172408b77dc779;hb=97b7b5567fba6918a656ad349051b5343b5dea2e

At most we have:

    Supported-Until: 2020-10-02
    Security-Support-Until: 2022-04-02

Anecdotally, if I go to https://www.kernel.org/ to download a kernel
tarball, I expect all tarballs to have all the major functionalities. I
wouldn't imagine that I cannot get one entire Linux subsystem (e.g.
ethernet or SATA) to work if I don't pick the latest.

Maybe it would make sense to clarify which releases are discouraged from
being used on https://xenproject.org/downloads/ at least?
Stefano Stabellini Feb. 9, 2021, 8:36 p.m. UTC | #10
On Tue, 9 Feb 2021, Rahul Singh wrote:
> > On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > 
> > Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> > The offending chunk is:
> > 
> > #define gnttab_need_iommu_mapping(d)                    \
> > -    (is_domain_direct_mapped(d) && need_iommu(d))
> > +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > 
> > On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
> > directly mapped and IOMMU is enabled for the domain, like the old check
> > did, but the new check is always false.
> > 
> > In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
> > need_sync is set as:
> > 
> >    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> >        hd->need_sync = !iommu_use_hap_pt(d);
> > 
> > iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
> > P2M. It is true on ARM. need_sync means that you have a separate IOMMU
> > page-table and it needs to be updated for every change. need_sync is set
> > to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
> > which is wrong.
> > 
> > As a consequence, when using PV network from a domU on a system where
> > IOMMU is on from Dom0, I get:
> > 
> > (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> > [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> > 
> > The fix is to go back to something along the lines of the old
> > implementation of gnttab_need_iommu_mapping.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > Fixes: 91d4eca7add
> > Backport: 4.12+
> > 
> > ---
> > 
> > Given the severity of the bug, I would like to request this patch to be
> > backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> > 2020.
> > 
> > For the 4.12 backport, we can use iommu_enabled() instead of
> > is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
> > 
> > Changes in v2:
> > - improve commit message
> > - add is_iommu_enabled(d) to the check
> > ---
> > xen/include/asm-arm/grant_table.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
> > index 6f585b1538..0ce77f9a1c 100644
> > --- a/xen/include/asm-arm/grant_table.h
> > +++ b/xen/include/asm-arm/grant_table.h
> > @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
> >     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
> > 
> > #define gnttab_need_iommu_mapping(d)                    \
> > -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
> > 
> > #endif /* __ASM_GRANT_TABLE_H__ */
> 
> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258

So you are creating a guest with "xl create" in dom0 and you see the
warnings below printed by the Dom0 kernel? I imagine the domU has a
virtual "disk" of some sort.


> I did initial debugging and found out that there are many calls to iommu_legacy_{,un}map() while creating the guest but when iommu_legacy_unmap() function unmap the pages something is written  wrong in page tables because of that when next time same page is mapped via create_grant_host_mapping() we observed below warning. 

By "while creating a guest", do you mean before the domU is even
unpaused? Hence, the calls are a result of dom0 operations? Or after
domU is unpaused, hence, the calls are a result of domU operations
(probably the domU simply trying to access its virtual disk)?
Please note that you can start a guest paused with xl create -p.

Looking at the logs, it is probably the latter. The following line
should be printed when the domU PV block frontend connects to the
backend in dom0:

[  138.639934] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants

I'll see if I can repro the issue here.

 
> [  138.639934] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
> (XEN) gnttab_mark_dirty not implemented yet
> [  138.659702] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
> [  138.669827] vbd vbd-0-51712: 9 mapping in shared page 8 from domain 0
> [  138.676636] vbd vbd-0-51712: 9 mapping ring-ref port 5
> [  138.682089] ------------[ cut here ]------------
> [  138.686605] WARNING: CPU: 2 PID: 37 at drivers/block/xen-blkback/xenbus.c:296 xen_blkif_disconnect+0x20c/0x230
> [  138.696668] Modules linked in: bridge stp llc ipv6 nf_defrag_ipv6
> [  138.702833] CPU: 2 PID: 37 Comm: xenwatch Not tainted 5.4.0-yocto-standard #1
> [  138.710037] Hardware name: Arm Neoverse N1 System Development Platform (DT)
> [  138.717067] pstate: 80c00005 (Nzcv daif +PAN +UAO)
> [  138.721927] pc : xen_blkif_disconnect+0x20c/0x230
> [  138.726701] lr : xen_blkif_disconnect+0xbc/0x230
> [  138.731388] sp : ffff800011cb3c80
> [  138.734773] x29: ffff800011cb3c80 x28: ffff00005b6da940 
> [  138.740156] x27: 0000000000000000 x26: 0000000000000000 
> [  138.745536] x25: ffff000029755060 x24: 0000000000000170 
> [  138.750919] x23: ffff000029755040 x22: ffff000059c72000 
> [  138.756299] x21: 0000000000000000 x20: ffff000029755000 
> [  138.761681] x19: 0000000000000001 x18: 0000000000000000 
> [  138.767063] x17: 0000000000000000 x16: 0000000000000000 
> [  138.772444] x15: 0000000000000000 x14: 0000000000000000 
> [  138.777826] x13: 0000000000000000 x12: 0000000000000000 
> [  138.783207] x11: 0000000000000001 x10: 0000000000000990 
> [  138.788589] x9 : 0000000000000001 x8 : 0000000000210d00 
> [  138.793971] x7 : 0000000000000018 x6 : ffff00005ddf72a0 
> [  138.799352] x5 : ffff800011cb3c28 x4 : 0000000000000000 
> [  138.804734] x3 : ffff000029755118 x2 : 0000000000000000 
> [  138.810117] x1 : ffff000029755120 x0 : 0000000000000001 
> [  138.815497] Call trace:
> [  138.818015]  xen_blkif_disconnect+0x20c/0x230
> [  138.822442]  frontend_changed+0x1b0/0x54c
> [  138.826523]  xenbus_otherend_changed+0x80/0xb0
> [  138.831035]  frontend_changed+0x10/0x20
> [  138.834941]  xenwatch_thread+0x80/0x144
> [  138.838849]  kthread+0x118/0x120
> [  138.842147]  ret_from_fork+0x10/0x18
> [  138.845791] ---[ end trace fb9f0a3b3b48a55f ]—
Julien Grall Feb. 9, 2021, 9:04 p.m. UTC | #11
On Tue, 9 Feb 2021 at 17:31, Stefano Stabellini <sstabellini@kernel.org> wrote:
>
> On Tue, 9 Feb 2021, Ian Jackson wrote:
> > Jan Beulich writes ("Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping"):
> > > On 08.02.2021 21:24, Stefano Stabellini wrote:
> > ...
> > > > For these cases, I would just follow a simple rule of thumb:
> > > > - is the submitter willing to provide the backport?
> > > > - is the backport low-risk?
> > > > - is the underlying bug important?
> > > >
> > > > If the answer to all is "yes" then I'd go with it.
> > >
> > > Personally I disagree, for the very simple reason of the question
> > > going to become "Where do we draw the line?" The only non-security
> > > backports that I consider acceptable are low-risk changes to allow
> > > building with newer tool chains. I know other backports have
> > > occurred in the past, and I did voice my disagreement with this
> > > having happened.
> >
> > I think I take a more relaxed view than Jan, but still a much more
> > firm line than Stefano.  My opinion is that we should make exceptions
> > for only bugs of exceptional severity.
> >
> > I don't think I have seen an argument that this bug is exceptionally
> > severe.
> >
> > For me the fact that you can only experience this bug if you upgrade
> > the hardware or significantly change the configuration, means that
> > this isn't so serious a bug.
>
> Yeah, I think that's really the core of this issue. If somebody is
> already using 4.12 happily, there is really no reason for them to take
> the fix. If somebody is about to use 4.12, then it is a severe issue.

If somebody is about to use 4.12, then it is most likely going to
encounter a serious blocker as there is no support for generic SMMU
bindings. I would be surprised if there are a lot of DT still using
the old bindings, at which point such users would want to switch to
4.15 + your patches to add support.

>
> The view of the group is that nobody should be switching to 4.12 now
> because there are newer releases out there. I don't know if that is
> true.

This is mostly based on the definition of supported vs security
supported. When a tree is only security supported, then there is no
promise the code will run on any systems.

>
> I didn't realize we had a policy or even a recommendation of always
> choosing the latest among the many releases available with
> security-support. I went through the website and SUPPORT.md but couldn't
> find it spelled out anywhere. See:

May I ask, what sort of users would want to start a
development/product based on a soon to be abandoned version?

For any new development, I have always advised to switch to the latest
Xen (or at least stable) because it will contain the latest fixes,
features, and better support because the code is still in mind...

>
> https://xenproject.org/downloads/
> https://xenproject.org/downloads/xen-project-archives/xen-project-4-12-series/
> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=SUPPORT.md;h=52f25fa85af41fa3b38288ab7e172408b77dc779;hb=97b7b5567fba6918a656ad349051b5343b5dea2e
>
> At most we have:
>
>     Supported-Until: 2020-10-02
>     Security-Support-Until: 2022-04-02
>
> Anecdotally, if I go to https://www.kernel.org/ to download a kernel
> tarball, I expect all tarballs to have all the major functionalities. I
> wouldn't imagine that I cannot get one entire Linux subsystem (e.g.
> ethernet or SATA) to work if I don't pick the latest.

I think this is an unrealistic expectation... You can't pick any
version of stable Linux and expect it to work on your shiny HW. There
might be missing drivers, workaround (including in core subsystems)...

>
> Maybe it would make sense to clarify which releases are discouraged from
> being used on https://xenproject.org/downloads/ at least?

Feel free to suggest a wording that can be discussed here.

Cheers,
Stefano Stabellini Feb. 9, 2021, 9:15 p.m. UTC | #12
On Tue, 9 Feb 2021, Stefano Stabellini wrote:
> On Tue, 9 Feb 2021, Rahul Singh wrote:
> > > On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > > 
> > > Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> > > The offending chunk is:
> > > 
> > > #define gnttab_need_iommu_mapping(d)                    \
> > > -    (is_domain_direct_mapped(d) && need_iommu(d))
> > > +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > 
> > > On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
> > > directly mapped and IOMMU is enabled for the domain, like the old check
> > > did, but the new check is always false.
> > > 
> > > In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
> > > need_sync is set as:
> > > 
> > >    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> > >        hd->need_sync = !iommu_use_hap_pt(d);
> > > 
> > > iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
> > > P2M. It is true on ARM. need_sync means that you have a separate IOMMU
> > > page-table and it needs to be updated for every change. need_sync is set
> > > to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
> > > which is wrong.
> > > 
> > > As a consequence, when using PV network from a domU on a system where
> > > IOMMU is on from Dom0, I get:
> > > 
> > > (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> > > [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> > > 
> > > The fix is to go back to something along the lines of the old
> > > implementation of gnttab_need_iommu_mapping.
> > > 
> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > > Fixes: 91d4eca7add
> > > Backport: 4.12+
> > > 
> > > ---
> > > 
> > > Given the severity of the bug, I would like to request this patch to be
> > > backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> > > 2020.
> > > 
> > > For the 4.12 backport, we can use iommu_enabled() instead of
> > > is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
> > > 
> > > Changes in v2:
> > > - improve commit message
> > > - add is_iommu_enabled(d) to the check
> > > ---
> > > xen/include/asm-arm/grant_table.h | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
> > > index 6f585b1538..0ce77f9a1c 100644
> > > --- a/xen/include/asm-arm/grant_table.h
> > > +++ b/xen/include/asm-arm/grant_table.h
> > > @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
> > >     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
> > > 
> > > #define gnttab_need_iommu_mapping(d)                    \
> > > -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
> > > 
> > > #endif /* __ASM_GRANT_TABLE_H__ */
> > 
> > I tested the patch and while creating the guest I observed the below warning from Linux for block device.
> > https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
> 
> So you are creating a guest with "xl create" in dom0 and you see the
> warnings below printed by the Dom0 kernel? I imagine the domU has a
> virtual "disk" of some sort.
> 
> 
> > I did initial debugging and found out that there are many calls to iommu_legacy_{,un}map() while creating the guest but when iommu_legacy_unmap() function unmap the pages something is written  wrong in page tables because of that when next time same page is mapped via create_grant_host_mapping() we observed below warning. 
> 
> By "while creating a guest", do you mean before the domU is even
> unpaused? Hence, the calls are a result of dom0 operations? Or after
> domU is unpaused, hence, the calls are a result of domU operations
> (probably the domU simply trying to access its virtual disk)?
> Please note that you can start a guest paused with xl create -p.
> 
> Looking at the logs, it is probably the latter. The following line
> should be printed when the domU PV block frontend connects to the
> backend in dom0:
> 
> [  138.639934] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
> 
> I'll see if I can repro the issue here.

I cannot repro the issue at my end. If there is anything interesting in
your setup that is necessary to reproduce the problem please share the
details.

Looking at the code, xen_blkif_disconnect is called by frontend_changed
on XenbusStateConnected. I don't know for sure why we need to call
xen_blkif_disconnect() before connecting, but I imagine it is to make sure
we are starting from a clean slate.

In the case of domU creation, I'd imagine the xen_blkif_disconnect() is
unnecessary. And, in fact, in my setup blkif->nr_rings is 0 so the loop
at the beginning of xen_blkif_disconnect is not even entered.

I wonder why it is not the same at your end. I don't know how it could
be that blkif->nr_rings is not 0 if you are just starting the VM for the
first time. Also, are you sure that this is related to this patch?

  
> > [  138.639934] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
> > (XEN) gnttab_mark_dirty not implemented yet
> > [  138.659702] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
> > [  138.669827] vbd vbd-0-51712: 9 mapping in shared page 8 from domain 0
> > [  138.676636] vbd vbd-0-51712: 9 mapping ring-ref port 5
> > [  138.682089] ------------[ cut here ]------------
> > [  138.686605] WARNING: CPU: 2 PID: 37 at drivers/block/xen-blkback/xenbus.c:296 xen_blkif_disconnect+0x20c/0x230
> > [  138.696668] Modules linked in: bridge stp llc ipv6 nf_defrag_ipv6
> > [  138.702833] CPU: 2 PID: 37 Comm: xenwatch Not tainted 5.4.0-yocto-standard #1
> > [  138.710037] Hardware name: Arm Neoverse N1 System Development Platform (DT)
> > [  138.717067] pstate: 80c00005 (Nzcv daif +PAN +UAO)
> > [  138.721927] pc : xen_blkif_disconnect+0x20c/0x230
> > [  138.726701] lr : xen_blkif_disconnect+0xbc/0x230
> > [  138.731388] sp : ffff800011cb3c80
> > [  138.734773] x29: ffff800011cb3c80 x28: ffff00005b6da940 
> > [  138.740156] x27: 0000000000000000 x26: 0000000000000000 
> > [  138.745536] x25: ffff000029755060 x24: 0000000000000170 
> > [  138.750919] x23: ffff000029755040 x22: ffff000059c72000 
> > [  138.756299] x21: 0000000000000000 x20: ffff000029755000 
> > [  138.761681] x19: 0000000000000001 x18: 0000000000000000 
> > [  138.767063] x17: 0000000000000000 x16: 0000000000000000 
> > [  138.772444] x15: 0000000000000000 x14: 0000000000000000 
> > [  138.777826] x13: 0000000000000000 x12: 0000000000000000 
> > [  138.783207] x11: 0000000000000001 x10: 0000000000000990 
> > [  138.788589] x9 : 0000000000000001 x8 : 0000000000210d00 
> > [  138.793971] x7 : 0000000000000018 x6 : ffff00005ddf72a0 
> > [  138.799352] x5 : ffff800011cb3c28 x4 : 0000000000000000 
> > [  138.804734] x3 : ffff000029755118 x2 : 0000000000000000 
> > [  138.810117] x1 : ffff000029755120 x0 : 0000000000000001 
> > [  138.815497] Call trace:
> > [  138.818015]  xen_blkif_disconnect+0x20c/0x230
> > [  138.822442]  frontend_changed+0x1b0/0x54c
> > [  138.826523]  xenbus_otherend_changed+0x80/0xb0
> > [  138.831035]  frontend_changed+0x10/0x20
> > [  138.834941]  xenwatch_thread+0x80/0x144
> > [  138.838849]  kthread+0x118/0x120
> > [  138.842147]  ret_from_fork+0x10/0x18
> > [  138.845791] ---[ end trace fb9f0a3b3b48a55f ]—
>
Stefano Stabellini Feb. 10, 2021, 1:35 a.m. UTC | #13
On Tue, 9 Feb 2021, Julien Grall wrote:
> On Tue, 9 Feb 2021 at 17:31, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >
> > On Tue, 9 Feb 2021, Ian Jackson wrote:
> > > Jan Beulich writes ("Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping"):
> > > > On 08.02.2021 21:24, Stefano Stabellini wrote:
> > > ...
> > > > > For these cases, I would just follow a simple rule of thumb:
> > > > > - is the submitter willing to provide the backport?
> > > > > - is the backport low-risk?
> > > > > - is the underlying bug important?
> > > > >
> > > > > If the answer to all is "yes" then I'd go with it.
> > > >
> > > > Personally I disagree, for the very simple reason of the question
> > > > going to become "Where do we draw the line?" The only non-security
> > > > backports that I consider acceptable are low-risk changes to allow
> > > > building with newer tool chains. I know other backports have
> > > > occurred in the past, and I did voice my disagreement with this
> > > > having happened.
> > >
> > > I think I take a more relaxed view than Jan, but still a much more
> > > firm line than Stefano.  My opinion is that we should make exceptions
> > > for only bugs of exceptional severity.
> > >
> > > I don't think I have seen an argument that this bug is exceptionally
> > > severe.
> > >
> > > For me the fact that you can only experience this bug if you upgrade
> > > the hardware or significantly change the configuration, means that
> > > this isn't so serious a bug.
> >
> > Yeah, I think that's really the core of this issue. If somebody is
> > already using 4.12 happily, there is really no reason for them to take
> > the fix. If somebody is about to use 4.12, then it is a severe issue.
> 
> If somebody is about to use 4.12, then it is most likely going to
> encounter a serious blocker as there is no support for generic SMMU
> bindings. I would be surprised if there are a lot of DT still using
> the old bindings, at which point such users would want to switch to
> 4.15 + your patches to add support.
> 
> >
> > The view of the group is that nobody should be switching to 4.12 now
> > because there are newer releases out there. I don't know if that is
> > true.
> 
> This is mostly based on the definition of supported vs security
> supported. When a tree is only security supported, then there is no
> promise the code will run on any systems.
> 
> >
> > I didn't realize we had a policy or even a recommendation of always
> > choosing the latest among the many releases available with
> > security-support. I went through the website and SUPPORT.md but couldn't
> > find it spelled out anywhere. See:
> 
> May I ask, what sort of users would want to start a
> development/product based on a soon to be abandoned version?
> 
> For any new development, I have always advised to switch to the latest
> Xen (or at least stable) because it will contain the latest fixes,
> features, and better support because the code is still in mind...

I don't have an answer -- I hope nobody.


> > https://xenproject.org/downloads/
> > https://xenproject.org/downloads/xen-project-archives/xen-project-4-12-series/
> > https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=SUPPORT.md;h=52f25fa85af41fa3b38288ab7e172408b77dc779;hb=97b7b5567fba6918a656ad349051b5343b5dea2e
> >
> > At most we have:
> >
> >     Supported-Until: 2020-10-02
> >     Security-Support-Until: 2022-04-02
> >
> > Anecdotally, if I go to https://www.kernel.org/ to download a kernel
> > tarball, I expect all tarballs to have all the major functionalities. I
> > wouldn't imagine that I cannot get one entire Linux subsystem (e.g.
> > ethernet or SATA) to work if I don't pick the latest.
> 
> I think this is an unrealistic expectation... You can't pick any
> version of stable Linux and expect it to work on your shiny HW. There
> might be missing drivers, workaround (including in core subsystems)...

I don't mean to insist as of course I accept the group decision and I
don't care about 4.12 that much (we don't use it as Xilinx). But this is
not about new hardware. This regression affects old hardware too. So yes
if an old Linux version worked on my HW I expect any of the slightly
newer (but still old) tarballs on kernel.org to work on the same HW.
That said I noticed that kernel.org seems to have only supported
releases on https://www.kernel.org/, while we have a mix.


> > Maybe it would make sense to clarify which releases are discouraged from
> > being used on https://xenproject.org/downloads/ at least?
> 
> Feel free to suggest a wording that can be discussed here.

Distinguishing between supported and not supported releases would be a
start. Maybe with a one line statement to recommend people to always use
supported (not just security-supported, fully supported) releases.
George Dunlap Feb. 10, 2021, 11:04 a.m. UTC | #14
> On Feb 9, 2021, at 5:31 PM, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Tue, 9 Feb 2021, Ian Jackson wrote:
>> Jan Beulich writes ("Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping"):
>>> On 08.02.2021 21:24, Stefano Stabellini wrote:
>> ...
>>>> For these cases, I would just follow a simple rule of thumb:
>>>> - is the submitter willing to provide the backport?
>>>> - is the backport low-risk?
>>>> - is the underlying bug important?
>>>> 
>>>> If the answer to all is "yes" then I'd go with it.
>>> 
>>> Personally I disagree, for the very simple reason of the question
>>> going to become "Where do we draw the line?" The only non-security
>>> backports that I consider acceptable are low-risk changes to allow
>>> building with newer tool chains. I know other backports have
>>> occurred in the past, and I did voice my disagreement with this
>>> having happened.
>> 
>> I think I take a more relaxed view than Jan, but still a much more
>> firm line than Stefano.  My opinion is that we should make exceptions
>> for only bugs of exceptional severity.
>> 
>> I don't think I have seen an argument that this bug is exceptionally
>> severe.
>> 
>> For me the fact that you can only experience this bug if you upgrade
>> the hardware or significantly change the configuration, means that
>> this isn't so serious a bug.
> 
> Yeah, I think that's really the core of this issue. If somebody is
> already using 4.12 happily, there is really no reason for them to take
> the fix. If somebody is about to use 4.12, then it is a severe issue.
> 
> The view of the group is that nobody should be switching to 4.12 now
> because there are newer releases out there. I don't know if that is
> true.

I don’t think we have to tell people what to do; rather, we need to tell people what *we* are going to do.  We’ve told people that we’ll provide normal bug-fixes until 2020-10-02, and security bugfixes until 2022-04-02. Anyone is welcome to use 4.12 whenever they want to; but if they do, they take the risk that there will be a “normal” bug that doesn’t get fixed by upstream.  The patch is available, anyone who wants to can apply it themselves (at their own risk, of course).

We don’t seem to have anything explicitly written down anywhere, but whenever someone has come to us using some obsolete version of Xen, we’ve recommended they use the most recent stable release.

The one reason someone might use 4.12 over a newer version is if that’s still the one packaged by their distro.  In that case, the distro should either update to a newer version, or take on the effort of supporting the old version by backporting bugfixes.  If such a distro chooses not to do so, then a user should consider switching to a different distro.

The purpose of having such a policy is to avoid needing to have this discussion about every single patch that comes up; and the number was chosen to balance effort for testing and maintainers against value to downstreams.  Certainly backposting Just This One patch won’t be a huge amount of work; but I completely sympathize with Jan’s point of view that if we start to make exceptions, people will begin to expect them.  We’ve said what we’re going to do, we should just stick with it.

Now maybe the 18-month “full-support" window is too small.  We seem to have lots of downstreams (at least SuSE and XenServer) who support much older versions; coordinating on versions to designate LTS releases to avoid duplication of effort and share the benefits of that effort with the wider community might be something worth considering.  But that’s a completely different discussion.

 -George
Rahul Singh Feb. 10, 2021, 3:06 p.m. UTC | #15
Hello Stefano,

> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> 
>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>> The offending chunk is:
>>> 
>>> #define gnttab_need_iommu_mapping(d)                    \
>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>> 
>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>> did, but the new check is always false.
>>> 
>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>> need_sync is set as:
>>> 
>>>   if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>       hd->need_sync = !iommu_use_hap_pt(d);
>>> 
>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>> page-table and it needs to be updated for every change. need_sync is set
>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>> which is wrong.
>>> 
>>> As a consequence, when using PV network from a domU on a system where
>>> IOMMU is on from Dom0, I get:
>>> 
>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>> 
>>> The fix is to go back to something along the lines of the old
>>> implementation of gnttab_need_iommu_mapping.
>>> 
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>> Fixes: 91d4eca7add
>>> Backport: 4.12+
>>> 
>>> ---
>>> 
>>> Given the severity of the bug, I would like to request this patch to be
>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>> 2020.
>>> 
>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>> 
>>> Changes in v2:
>>> - improve commit message
>>> - add is_iommu_enabled(d) to the check
>>> ---
>>> xen/include/asm-arm/grant_table.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>> index 6f585b1538..0ce77f9a1c 100644
>>> --- a/xen/include/asm-arm/grant_table.h
>>> +++ b/xen/include/asm-arm/grant_table.h
>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>    (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>> 
>>> #define gnttab_need_iommu_mapping(d)                    \
>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>> 
>>> #endif /* __ASM_GRANT_TABLE_H__ */
>> 
>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
> 
> So you are creating a guest with "xl create" in dom0 and you see the
> warnings below printed by the Dom0 kernel? I imagine the domU has a
> virtual "disk" of some sort.

Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.

This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
because of that issue is observed.

You can reproduce the issue by just creating the dummy image file and try to attach it with “xl block-attach”

dd if=/dev/zero of=test.img bs=1024 count=0 seek=1024
xl block-attach 0 phy:test.img xvda w

Sequence of command that I follow is as follows to reproduce the issue:  

lvs vg-xen/myguest
lvcreate -y -L 4G -n myguest vg-xen
parted -s /dev/vg-xen/myguest mklabel msdos
parted -s /dev/vg-xen/myguest unit MB mkpart primary 1 4096
sync
xl block-attach 0 phy:/dev/vg-xen/myguest xvda w

libxl: error: libxl_xshelp.c:201:libxl__xs_read_mandatory: xenstore read failed: `/libxl/0/type': No such file or directory
libxl: warning: libxl_dom.c:51:libxl__domain_type: unable to get domain type for domid=0, assuming HVM
[  162.632232] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
[  162.764847] vbd vbd-0-51712: 9 mapping in shared page 8 from domain 0
[  162.771740] vbd vbd-0-51712: 9 mapping ring-ref port 5
[  162.777650] ------------[ cut here ]------------
[  162.782167] WARNING: CPU: 2 PID: 37 at drivers/block/xen-blkback/xenbus.c:296 xen_blkif_disconnect+0x20c/0x230
[  162.792230] Modules linked in: bridge stp llc ipv6 nf_defrag_ipv6
[  162.798394] CPU: 2 PID: 37 Comm: xenwatch Not tainted 5.4.0-yocto-standard #1
[  162.805597] Hardware name: Arm Neoverse N1 System Development Platform (DT)
[  162.812630] pstate: 80c00005 (Nzcv daif +PAN +UAO)
[  162.817489] pc : xen_blkif_disconnect+0x20c/0x230
[  162.822262] lr : xen_blkif_disconnect+0xbc/0x230
[  162.826949] sp : ffff800011cb3c80
…….

Regards,
Rahul

> 
> 
>> I did initial debugging and found out that there are many calls to iommu_legacy_{,un}map() while creating the guest but when iommu_legacy_unmap() function unmap the pages something is written  wrong in page tables because of that when next time same page is mapped via create_grant_host_mapping() we observed below warning. 
> 
> By "while creating a guest", do you mean before the domU is even
> unpaused? Hence, the calls are a result of dom0 operations?
> Or after
> domU is unpaused, hence, the calls are a result of domU operations
> (probably the domU simply trying to access its virtual disk)?
> Please note that you can start a guest paused with xl create -p.
> 
> Looking at the logs, it is probably the latter. The following line
> should be printed when the domU PV block frontend connects to the
> backend in dom0:
> 
> [  138.639934] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
> 
> I'll see if I can repro the issue here.
> 
> 
>> [  138.639934] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
>> (XEN) gnttab_mark_dirty not implemented yet
>> [  138.659702] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
>> [  138.669827] vbd vbd-0-51712: 9 mapping in shared page 8 from domain 0
>> [  138.676636] vbd vbd-0-51712: 9 mapping ring-ref port 5
>> [  138.682089] ------------[ cut here ]------------
>> [  138.686605] WARNING: CPU: 2 PID: 37 at drivers/block/xen-blkback/xenbus.c:296 xen_blkif_disconnect+0x20c/0x230
>> [  138.696668] Modules linked in: bridge stp llc ipv6 nf_defrag_ipv6
>> [  138.702833] CPU: 2 PID: 37 Comm: xenwatch Not tainted 5.4.0-yocto-standard #1
>> [  138.710037] Hardware name: Arm Neoverse N1 System Development Platform (DT)
>> [  138.717067] pstate: 80c00005 (Nzcv daif +PAN +UAO)
>> [  138.721927] pc : xen_blkif_disconnect+0x20c/0x230
>> [  138.726701] lr : xen_blkif_disconnect+0xbc/0x230
>> [  138.731388] sp : ffff800011cb3c80
>> [  138.734773] x29: ffff800011cb3c80 x28: ffff00005b6da940 
>> [  138.740156] x27: 0000000000000000 x26: 0000000000000000 
>> [  138.745536] x25: ffff000029755060 x24: 0000000000000170 
>> [  138.750919] x23: ffff000029755040 x22: ffff000059c72000 
>> [  138.756299] x21: 0000000000000000 x20: ffff000029755000 
>> [  138.761681] x19: 0000000000000001 x18: 0000000000000000 
>> [  138.767063] x17: 0000000000000000 x16: 0000000000000000 
>> [  138.772444] x15: 0000000000000000 x14: 0000000000000000 
>> [  138.777826] x13: 0000000000000000 x12: 0000000000000000 
>> [  138.783207] x11: 0000000000000001 x10: 0000000000000990 
>> [  138.788589] x9 : 0000000000000001 x8 : 0000000000210d00 
>> [  138.793971] x7 : 0000000000000018 x6 : ffff00005ddf72a0 
>> [  138.799352] x5 : ffff800011cb3c28 x4 : 0000000000000000 
>> [  138.804734] x3 : ffff000029755118 x2 : 0000000000000000 
>> [  138.810117] x1 : ffff000029755120 x0 : 0000000000000001 
>> [  138.815497] Call trace:
>> [  138.818015]  xen_blkif_disconnect+0x20c/0x230
>> [  138.822442]  frontend_changed+0x1b0/0x54c
>> [  138.826523]  xenbus_otherend_changed+0x80/0xb0
>> [  138.831035]  frontend_changed+0x10/0x20
>> [  138.834941]  xenwatch_thread+0x80/0x144
>> [  138.838849]  kthread+0x118/0x120
>> [  138.842147]  ret_from_fork+0x10/0x18
>> [  138.845791] ---[ end trace fb9f0a3b3b48a55f ]—
Julien Grall Feb. 10, 2021, 5:34 p.m. UTC | #16
Hi,

On 10/02/2021 15:06, Rahul Singh wrote:
>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>
>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>
>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>> The offending chunk is:
>>>>
>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>
>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>> did, but the new check is always false.
>>>>
>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>> need_sync is set as:
>>>>
>>>>    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>        hd->need_sync = !iommu_use_hap_pt(d);
>>>>
>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>> page-table and it needs to be updated for every change. need_sync is set
>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>> which is wrong.
>>>>
>>>> As a consequence, when using PV network from a domU on a system where
>>>> IOMMU is on from Dom0, I get:
>>>>
>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>
>>>> The fix is to go back to something along the lines of the old
>>>> implementation of gnttab_need_iommu_mapping.
>>>>
>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>> Fixes: 91d4eca7add
>>>> Backport: 4.12+
>>>>
>>>> ---
>>>>
>>>> Given the severity of the bug, I would like to request this patch to be
>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>> 2020.
>>>>
>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>
>>>> Changes in v2:
>>>> - improve commit message
>>>> - add is_iommu_enabled(d) to the check
>>>> ---
>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>> index 6f585b1538..0ce77f9a1c 100644
>>>> --- a/xen/include/asm-arm/grant_table.h
>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>
>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>
>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>
>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>
>> So you are creating a guest with "xl create" in dom0 and you see the
>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>> virtual "disk" of some sort.
> 
> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
> 
> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
> because of that issue is observed.

Can you clarify what you mean by "written wrong"? What sort of error do 
you see in the iommu_unmap()?

> 
> You can reproduce the issue by just creating the dummy image file and try to attach it with “xl block-attach”
> 
> dd if=/dev/zero of=test.img bs=1024 count=0 seek=1024
> xl block-attach 0 phy:test.img xvda w
> 
> Sequence of command that I follow is as follows to reproduce the issue:
> 
> lvs vg-xen/myguest
> lvcreate -y -L 4G -n myguest vg-xen
> parted -s /dev/vg-xen/myguest mklabel msdos
> parted -s /dev/vg-xen/myguest unit MB mkpart primary 1 4096
> sync
> xl block-attach 0 phy:/dev/vg-xen/myguest xvda w
> 
> libxl: error: libxl_xshelp.c:201:libxl__xs_read_mandatory: xenstore read failed: `/libxl/0/type': No such file or directory
> libxl: warning: libxl_dom.c:51:libxl__domain_type: unable to get domain type for domid=0, assuming HVM
> [  162.632232] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
> [  162.764847] vbd vbd-0-51712: 9 mapping in shared page 8 from domain 0
> [  162.771740] vbd vbd-0-51712: 9 mapping ring-ref port 5
> [  162.777650] ------------[ cut here ]------------
> [  162.782167] WARNING: CPU: 2 PID: 37 at drivers/block/xen-blkback/xenbus.c:296 xen_blkif_disconnect+0x20c/0x230

Just to confirm, this splat comes from:

WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));

If so, what are the values for i and blkif->nr_ring_pages?

Cheers,
Rahul Singh Feb. 10, 2021, 6:08 p.m. UTC | #17
Hello Julien,

> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
> 
> Hi,
> 
> On 10/02/2021 15:06, Rahul Singh wrote:
>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> 
>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>> 
>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>> The offending chunk is:
>>>>> 
>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>> 
>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>> did, but the new check is always false.
>>>>> 
>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>> need_sync is set as:
>>>>> 
>>>>>   if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>       hd->need_sync = !iommu_use_hap_pt(d);
>>>>> 
>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>> which is wrong.
>>>>> 
>>>>> As a consequence, when using PV network from a domU on a system where
>>>>> IOMMU is on from Dom0, I get:
>>>>> 
>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>> 
>>>>> The fix is to go back to something along the lines of the old
>>>>> implementation of gnttab_need_iommu_mapping.
>>>>> 
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>> Fixes: 91d4eca7add
>>>>> Backport: 4.12+
>>>>> 
>>>>> ---
>>>>> 
>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>> 2020.
>>>>> 
>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>> 
>>>>> Changes in v2:
>>>>> - improve commit message
>>>>> - add is_iommu_enabled(d) to the check
>>>>> ---
>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>> 
>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>    (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>> 
>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>> 
>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>> 
>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>> 
>>> So you are creating a guest with "xl create" in dom0 and you see the
>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>> virtual "disk" of some sort.
>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>> because of that issue is observed.
> 
> Can you clarify what you mean by "written wrong"? What sort of error do you see in the iommu_unmap()?


I might be wrong as per my understanding for ARM we are sharing the P2M between CPU and IOMMU always and the map_grant_ref() function is written in such a way that we have to call iommu_legacy_{, un}map() only if P2M is not shared. 

As we are sharing the P2M when we call the iommu_map() function it will overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry from the page-table. Next time when map_grant_ref() tries to map the same frame it will try to get the corresponding GFN but there will no entry in P2M as iommu_unmap() already unmapped the GFN because of that this error might be observed.

Sequence of events that results the issue :

gnttab_map_grant_ref (GFN=a99fb MFN=a99fb)
arm_iommu_map_page() DOMID:0 dfn = a99fb mfn = a99fb 

gnttab_map_grant_ref ( GFN=d9913 MFN=d9913)
arm_iommu_map_page() DOMID:0 dfn = d9913 mfn = d9913

gnttab_map_grant_ref ( GFN=d9846 MFN=d9846)
arm_iommu_map_page() DOMID:0 dfn = d9846 mfn = d9846 

gnttab_map_grant_ref (GFN=a8474 MFN=a8474)
arm_iommu_map_page() DOMID:0 dfn = a8474 mfn = a8474

arm_iommu_unmap_page() DOMID:0 dfn = a99fb
arm_iommu_unmap_page() DOMID:0 dfn = d9913
arm_iommu_unmap_page() DOMID:0 dfn = d9846
arm_iommu_unmap_page() DOMID:0 dfn = a8474

Try to map the same frame that is unmapped earlier by iommu_unmap call()
gnttab_map_grant_ref (GFN=a99fb MFN=0xffffffff).-> Not able to find the GFN in p2m error is observed after that. 

> 
>> You can reproduce the issue by just creating the dummy image file and try to attach it with “xl block-attach”
>> dd if=/dev/zero of=test.img bs=1024 count=0 seek=1024
>> xl block-attach 0 phy:test.img xvda w
>> Sequence of command that I follow is as follows to reproduce the issue:
>> lvs vg-xen/myguest
>> lvcreate -y -L 4G -n myguest vg-xen
>> parted -s /dev/vg-xen/myguest mklabel msdos
>> parted -s /dev/vg-xen/myguest unit MB mkpart primary 1 4096
>> sync
>> xl block-attach 0 phy:/dev/vg-xen/myguest xvda w
>> libxl: error: libxl_xshelp.c:201:libxl__xs_read_mandatory: xenstore read failed: `/libxl/0/type': No such file or directory
>> libxl: warning: libxl_dom.c:51:libxl__domain_type: unable to get domain type for domid=0, assuming HVM
>> [  162.632232] xen-blkback: backend/vbd/0/51712: using 4 queues, protocol 1 (arm-abi) persistent grants
>> [  162.764847] vbd vbd-0-51712: 9 mapping in shared page 8 from domain 0
>> [  162.771740] vbd vbd-0-51712: 9 mapping ring-ref port 5
>> [  162.777650] ------------[ cut here ]------------
>> [  162.782167] WARNING: CPU: 2 PID: 37 at drivers/block/xen-blkback/xenbus.c:296 xen_blkif_disconnect+0x20c/0x230
> 
> Just to confirm, this splat comes from:
> 
> WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));
> 
> If so, what are the values for i and blkif->nr_ring_pages?

Let me find out and will share with you.

Regards,
Rahul
> 
> Cheers,
> 
> -- 
> Julien Grall
Julien Grall Feb. 10, 2021, 7:52 p.m. UTC | #18
On 10/02/2021 18:08, Rahul Singh wrote:
> Hello Julien,
> 
>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
>>
>> Hi,
>>
>> On 10/02/2021 15:06, Rahul Singh wrote:
>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>
>>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>
>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>>> The offending chunk is:
>>>>>>
>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>
>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>>> did, but the new check is always false.
>>>>>>
>>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>>> need_sync is set as:
>>>>>>
>>>>>>    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>>        hd->need_sync = !iommu_use_hap_pt(d);
>>>>>>
>>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>>> which is wrong.
>>>>>>
>>>>>> As a consequence, when using PV network from a domU on a system where
>>>>>> IOMMU is on from Dom0, I get:
>>>>>>
>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>>>
>>>>>> The fix is to go back to something along the lines of the old
>>>>>> implementation of gnttab_need_iommu_mapping.
>>>>>>
>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>>> Fixes: 91d4eca7add
>>>>>> Backport: 4.12+
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>>> 2020.
>>>>>>
>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>>>
>>>>>> Changes in v2:
>>>>>> - improve commit message
>>>>>> - add is_iommu_enabled(d) to the check
>>>>>> ---
>>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>>     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>>>
>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>>>
>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>>>
>>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>>>
>>>> So you are creating a guest with "xl create" in dom0 and you see the
>>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>>> virtual "disk" of some sort.
>>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>>> because of that issue is observed.
>>
>> Can you clarify what you mean by "written wrong"? What sort of error do you see in the iommu_unmap()?
> 
> 
> I might be wrong as per my understanding for ARM we are sharing the P2M between CPU and IOMMU always and the map_grant_ref() function is written in such a way that we have to call iommu_legacy_{, un}map() only if P2M is not shared.

map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() 
returns true. I don't really see where this is assuming the P2M is not 
shared.

In fact, on x86, this will always be false for HVM domain (they support 
both shared and separate page-tables).

> 
> As we are sharing the P2M when we call the iommu_map() function it will overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry from the page-table.
AFAIK, there should be nothing mapped at that GFN because the page 
belongs to the guest. At worse, we would overwrite a mapping that is the 
same.

Although, I agree that we may end up to remove the entry early and 
therefore we could get an IOMMU fault (this is not your case here). But 
that's not an Arm-only problem.

> Next time when map_grant_ref() tries to map the same frame it will try to get the corresponding GFN but there will no entry in P2M as iommu_unmap() already unmapped the GFN because of that this error might be observed.
I am afraid, I don't understand what you mean by "try to get the 
corresponding GFN".  Can you give a pointer to the code?

> Sequence of events that results the issue :
> 
> gnttab_map_grant_ref (GFN=a99fb MFN=a99fb)

Can you clarify whether the GFN is from the local domain (e.g. 
dom0/backend) or the remote domain (e.g. the frontend)?

> arm_iommu_map_page() DOMID:0 dfn = a99fb mfn = a99fb
> 
> gnttab_map_grant_ref ( GFN=d9913 MFN=d9913)
> arm_iommu_map_page() DOMID:0 dfn = d9913 mfn = d9913
> 
> gnttab_map_grant_ref ( GFN=d9846 MFN=d9846)
> arm_iommu_map_page() DOMID:0 dfn = d9846 mfn = d9846
> 
> gnttab_map_grant_ref (GFN=a8474 MFN=a8474)
> arm_iommu_map_page() DOMID:0 dfn = a8474 mfn = a8474
> 
> arm_iommu_unmap_page() DOMID:0 dfn = a99fb
> arm_iommu_unmap_page() DOMID:0 dfn = d9913
> arm_iommu_unmap_page() DOMID:0 dfn = d9846
> arm_iommu_unmap_page() DOMID:0 dfn = a8474
> 
> Try to map the same frame that is unmapped earlier by iommu_unmap call()
> gnttab_map_grant_ref (GFN=a99fb MFN=0xffffffff).-> Not able to find the GFN in p2m error is observed after that.

The iommu_map()/iommu_unmap() should only modify the dom0 P2M. It should 
not modify the guest P2M.

When Dom0 issue a request to map a grant we will:
   1) Look-up the guest P2M to find the corresponding MFN
   2) Do all the sanity check
   3) Map the page in dom0's P2M at the address requested (host_addr)
   4) Issue iommu_map() to get a 1:1 mapping in the P2M

So are you saying that the guest P2M walk has failed?

Cheers,
Stefano Stabellini Feb. 10, 2021, 9:13 p.m. UTC | #19
On Wed, 10 Feb 2021, Rahul Singh wrote:
> > On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > On Tue, 9 Feb 2021, Rahul Singh wrote:
> >>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >>> 
> >>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> >>> The offending chunk is:
> >>> 
> >>> #define gnttab_need_iommu_mapping(d)                    \
> >>> -    (is_domain_direct_mapped(d) && need_iommu(d))
> >>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> >>> 
> >>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
> >>> directly mapped and IOMMU is enabled for the domain, like the old check
> >>> did, but the new check is always false.
> >>> 
> >>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
> >>> need_sync is set as:
> >>> 
> >>>   if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> >>>       hd->need_sync = !iommu_use_hap_pt(d);
> >>> 
> >>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
> >>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
> >>> page-table and it needs to be updated for every change. need_sync is set
> >>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
> >>> which is wrong.
> >>> 
> >>> As a consequence, when using PV network from a domU on a system where
> >>> IOMMU is on from Dom0, I get:
> >>> 
> >>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> >>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> >>> 
> >>> The fix is to go back to something along the lines of the old
> >>> implementation of gnttab_need_iommu_mapping.
> >>> 
> >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> >>> Fixes: 91d4eca7add
> >>> Backport: 4.12+
> >>> 
> >>> ---
> >>> 
> >>> Given the severity of the bug, I would like to request this patch to be
> >>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> >>> 2020.
> >>> 
> >>> For the 4.12 backport, we can use iommu_enabled() instead of
> >>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
> >>> 
> >>> Changes in v2:
> >>> - improve commit message
> >>> - add is_iommu_enabled(d) to the check
> >>> ---
> >>> xen/include/asm-arm/grant_table.h | 2 +-
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>> 
> >>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
> >>> index 6f585b1538..0ce77f9a1c 100644
> >>> --- a/xen/include/asm-arm/grant_table.h
> >>> +++ b/xen/include/asm-arm/grant_table.h
> >>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
> >>>    (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
> >>> 
> >>> #define gnttab_need_iommu_mapping(d)                    \
> >>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> >>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
> >>> 
> >>> #endif /* __ASM_GRANT_TABLE_H__ */
> >> 
> >> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
> >> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
> > 
> > So you are creating a guest with "xl create" in dom0 and you see the
> > warnings below printed by the Dom0 kernel? I imagine the domU has a
> > virtual "disk" of some sort.
> 
> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
> 
> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
> because of that issue is observed.
> 
> You can reproduce the issue by just creating the dummy image file and try to attach it with “xl block-attach”
> 
> dd if=/dev/zero of=test.img bs=1024 count=0 seek=1024
> xl block-attach 0 phy:test.img xvda w
> 
> Sequence of command that I follow is as follows to reproduce the issue:  
> 
> lvs vg-xen/myguest
> lvcreate -y -L 4G -n myguest vg-xen
> parted -s /dev/vg-xen/myguest mklabel msdos
> parted -s /dev/vg-xen/myguest unit MB mkpart primary 1 4096
> sync
> xl block-attach 0 phy:/dev/vg-xen/myguest xvda w

Ah! You are block-attaching a device in dom0 to dom0! So frontend and
backend are both in the same domain, dom0. Yeah that is not supposed to
work, and if it did before it was just by coincidence :-)

There are a number of checks in Linux that wouldn't work as intended if
the page is coming from itself. This is not an ARM-only issue either.

I tried it with dom0, like you did, and I am seeing the same warning. I
did try to do block-attach to a domU and it works fine.

I don't think this is a concern.
Rahul Singh Feb. 11, 2021, 12:03 p.m. UTC | #20
Hello Stefano,

> On 10 Feb 2021, at 9:13 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Wed, 10 Feb 2021, Rahul Singh wrote:
>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>> 
>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>> The offending chunk is:
>>>>> 
>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>> 
>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>> did, but the new check is always false.
>>>>> 
>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>> need_sync is set as:
>>>>> 
>>>>>  if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>      hd->need_sync = !iommu_use_hap_pt(d);
>>>>> 
>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>> which is wrong.
>>>>> 
>>>>> As a consequence, when using PV network from a domU on a system where
>>>>> IOMMU is on from Dom0, I get:
>>>>> 
>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>> 
>>>>> The fix is to go back to something along the lines of the old
>>>>> implementation of gnttab_need_iommu_mapping.
>>>>> 
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>> Fixes: 91d4eca7add
>>>>> Backport: 4.12+
>>>>> 
>>>>> ---
>>>>> 
>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>> 2020.
>>>>> 
>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>> 
>>>>> Changes in v2:
>>>>> - improve commit message
>>>>> - add is_iommu_enabled(d) to the check
>>>>> ---
>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>> 
>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>   (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>> 
>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>> 
>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>> 
>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>> 
>>> So you are creating a guest with "xl create" in dom0 and you see the
>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>> virtual "disk" of some sort.
>> 
>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>> 
>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>> because of that issue is observed.
>> 
>> You can reproduce the issue by just creating the dummy image file and try to attach it with “xl block-attach”
>> 
>> dd if=/dev/zero of=test.img bs=1024 count=0 seek=1024
>> xl block-attach 0 phy:test.img xvda w
>> 
>> Sequence of command that I follow is as follows to reproduce the issue:  
>> 
>> lvs vg-xen/myguest
>> lvcreate -y -L 4G -n myguest vg-xen
>> parted -s /dev/vg-xen/myguest mklabel msdos
>> parted -s /dev/vg-xen/myguest unit MB mkpart primary 1 4096
>> sync
>> xl block-attach 0 phy:/dev/vg-xen/myguest xvda w
> 
> Ah! You are block-attaching a device in dom0 to dom0! So frontend and
> backend are both in the same domain, dom0. Yeah that is not supposed to
> work, and if it did before it was just by coincidence :-)

I think this is very common to attach a block device to DOM0 where frontend/backend is in same domain. 
I found two reference that use the same concept

https://wiki.xenproject.org/wiki/Access_a_LVM-based_DomU_disk_outside_of_the_domU
https://www.suse.com/support/kb/doc/?id=000016154

Regards,
Rahul

> 
> There are a number of checks in Linux that wouldn't work as intended if
> the page is coming from itself. This is not an ARM-only issue either.
> 
> I tried it with dom0, like you did, and I am seeing the same warning.
> I
> did try to do block-attach to a domU and it works fine.
> 
> I don't think this is a concern.
Rahul Singh Feb. 11, 2021, 1:20 p.m. UTC | #21
Hello Julien,

> On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 10/02/2021 18:08, Rahul Singh wrote:
>> Hello Julien,
>>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
>>> 
>>> Hi,
>>> 
>>> On 10/02/2021 15:06, Rahul Singh wrote:
>>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>> 
>>>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>> 
>>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>>>> The offending chunk is:
>>>>>>> 
>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>> 
>>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>>>> did, but the new check is always false.
>>>>>>> 
>>>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>>>> need_sync is set as:
>>>>>>> 
>>>>>>>   if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>>>       hd->need_sync = !iommu_use_hap_pt(d);
>>>>>>> 
>>>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>>>> which is wrong.
>>>>>>> 
>>>>>>> As a consequence, when using PV network from a domU on a system where
>>>>>>> IOMMU is on from Dom0, I get:
>>>>>>> 
>>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>>>> 
>>>>>>> The fix is to go back to something along the lines of the old
>>>>>>> implementation of gnttab_need_iommu_mapping.
>>>>>>> 
>>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>>>> Fixes: 91d4eca7add
>>>>>>> Backport: 4.12+
>>>>>>> 
>>>>>>> ---
>>>>>>> 
>>>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>>>> 2020.
>>>>>>> 
>>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>>>> 
>>>>>>> Changes in v2:
>>>>>>> - improve commit message
>>>>>>> - add is_iommu_enabled(d) to the check
>>>>>>> ---
>>>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>> 
>>>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>>>    (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>>>> 
>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>>>> 
>>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>>>> 
>>>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>>>> 
>>>>> So you are creating a guest with "xl create" in dom0 and you see the
>>>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>>>> virtual "disk" of some sort.
>>>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>>>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>>>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>>>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>>>> because of that issue is observed.
>>> 
>>> Can you clarify what you mean by "written wrong"? What sort of error do you see in the iommu_unmap()?
>> I might be wrong as per my understanding for ARM we are sharing the P2M between CPU and IOMMU always and the map_grant_ref() function is written in such a way that we have to call iommu_legacy_{, un}map() only if P2M is not shared.
> 
> map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() returns true. I don't really see where this is assuming the P2M is not shared.
> 
> In fact, on x86, this will always be false for HVM domain (they support both shared and separate page-tables).
> 
>> As we are sharing the P2M when we call the iommu_map() function it will overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry from the page-table.
> AFAIK, there should be nothing mapped at that GFN because the page belongs to the guest. At worse, we would overwrite a mapping that is the same.

Sorry I should have mention before backend/frontend is dom0 in this case and GFN is mapped. I am trying to attach the block device to DOM0

dd if=/dev/zero of=test.img bs=1024 count=0 seek=1024
xl block-attach 0 phy:test.img xvda w

> 
> Although, I agree that we may end up to remove the entry early and therefore we could get an IOMMU fault (this is not your case here). But that's not an Arm-only problem.
> 
>> Next time when map_grant_ref() tries to map the same frame it will try to get the corresponding GFN but there will no entry in P2M as iommu_unmap() already unmapped the GFN because of that this error might be observed.
> I am afraid, I don't understand what you mean by "try to get the corresponding GFN".  Can you give a pointer to the code?

I mean to say to get the GFN from frame "struct grant_entry_v1 { …. .frame } 
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/common/grant_table.c;h=280b7969b6d5821010675182eaf0ea29fc332dd4;hb=HEAD#l1124

> 
>> Sequence of events that results the issue :
>> gnttab_map_grant_ref (GFN=a99fb MFN=a99fb)
> 
> Can you clarify whether the GFN is from the local domain (e.g. dom0/backend) or the remote domain (e.g. the frontend)?

Both backend/frontend in the same domain dom0 .
> 
>> arm_iommu_map_page() DOMID:0 dfn = a99fb mfn = a99fb
>> gnttab_map_grant_ref ( GFN=d9913 MFN=d9913)
>> arm_iommu_map_page() DOMID:0 dfn = d9913 mfn = d9913
>> gnttab_map_grant_ref ( GFN=d9846 MFN=d9846)
>> arm_iommu_map_page() DOMID:0 dfn = d9846 mfn = d9846
>> gnttab_map_grant_ref (GFN=a8474 MFN=a8474)
>> arm_iommu_map_page() DOMID:0 dfn = a8474 mfn = a8474
>> arm_iommu_unmap_page() DOMID:0 dfn = a99fb
>> arm_iommu_unmap_page() DOMID:0 dfn = d9913
>> arm_iommu_unmap_page() DOMID:0 dfn = d9846
>> arm_iommu_unmap_page() DOMID:0 dfn = a8474
>> Try to map the same frame that is unmapped earlier by iommu_unmap call()
>> gnttab_map_grant_ref (GFN=a99fb MFN=0xffffffff).-> Not able to find the GFN in p2m error is observed after that.
> 
> The iommu_map()/iommu_unmap() should only modify the dom0 P2M. It should not modify the guest P2M.

Yes as in this case backend/frontend in same dom0 that why P2M is modified.
> 
> When Dom0 issue a request to map a grant we will:
>  1) Look-up the guest P2M to find the corresponding MFN
>  2) Do all the sanity check
>  3) Map the page in dom0's P2M at the address requested (host_addr)
>  4) Issue iommu_map() to get a 1:1 mapping in the P2M
> 
> So are you saying that the guest P2M walk has failed?

No. I mean to say GFN is unmapped by iommu_unmap()  earlier and later point of time when we get the request to map the same frame that is failed as we will not be able to get GFN from frame get_paged_frame() in function map_grant_ref().

Regards,
Rahul
> Cheers,
> 
> -- 
> Julien Grall
Julien Grall Feb. 11, 2021, 1:52 p.m. UTC | #22
On 11/02/2021 13:20, Rahul Singh wrote:
> Hello Julien,

Hi Rahul,

>> On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
>>
>>
>>
>> On 10/02/2021 18:08, Rahul Singh wrote:
>>> Hello Julien,
>>>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 10/02/2021 15:06, Rahul Singh wrote:
>>>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>
>>>>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>>>
>>>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>>>>> The offending chunk is:
>>>>>>>>
>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>
>>>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>>>>> did, but the new check is always false.
>>>>>>>>
>>>>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>>>>> need_sync is set as:
>>>>>>>>
>>>>>>>>    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>>>>        hd->need_sync = !iommu_use_hap_pt(d);
>>>>>>>>
>>>>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>>>>> which is wrong.
>>>>>>>>
>>>>>>>> As a consequence, when using PV network from a domU on a system where
>>>>>>>> IOMMU is on from Dom0, I get:
>>>>>>>>
>>>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>>>>>
>>>>>>>> The fix is to go back to something along the lines of the old
>>>>>>>> implementation of gnttab_need_iommu_mapping.
>>>>>>>>
>>>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>>>>> Fixes: 91d4eca7add
>>>>>>>> Backport: 4.12+
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>>>>> 2020.
>>>>>>>>
>>>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>>>>>
>>>>>>>> Changes in v2:
>>>>>>>> - improve commit message
>>>>>>>> - add is_iommu_enabled(d) to the check
>>>>>>>> ---
>>>>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>>>>     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>>>>>
>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>>>>>
>>>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>>>>>
>>>>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>>>>>
>>>>>> So you are creating a guest with "xl create" in dom0 and you see the
>>>>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>>>>> virtual "disk" of some sort.
>>>>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>>>>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>>>>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>>>>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>>>>> because of that issue is observed.
>>>>
>>>> Can you clarify what you mean by "written wrong"? What sort of error do you see in the iommu_unmap()?
>>> I might be wrong as per my understanding for ARM we are sharing the P2M between CPU and IOMMU always and the map_grant_ref() function is written in such a way that we have to call iommu_legacy_{, un}map() only if P2M is not shared.
>>
>> map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() returns true. I don't really see where this is assuming the P2M is not shared.
>>
>> In fact, on x86, this will always be false for HVM domain (they support both shared and separate page-tables).
>>
>>> As we are sharing the P2M when we call the iommu_map() function it will overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry from the page-table.
>> AFAIK, there should be nothing mapped at that GFN because the page belongs to the guest. At worse, we would overwrite a mapping that is the same.
>  > Sorry I should have mention before backend/frontend is dom0 in this 
case and GFN is mapped. I am trying to attach the block device to DOM0

Ah, your log makes a lot more sense now. Thank you for the clarification!

So yes, I agree that iommu_{,un}map() will do the wrong thing if the 
frontend and backend in the same domain.

I don't know what the state in Linux, but from Xen PoV it should be 
possible to have the backend/frontend in the same domain.

I think we want to ignore the IOMMU mapping request when the domain is 
the same. Can you try this small untested patch:

diff --git a/xen/drivers/passthrough/arm/iommu_helpers.c 
b/xen/drivers/passthrough/arm/iommu_helpers.c
index a36e2b8e6c42..7bad13593146 100644
--- a/xen/drivers/passthrough/arm/iommu_helpers.c
+++ b/xen/drivers/passthrough/arm/iommu_helpers.c
@@ -53,6 +53,9 @@ int __must_check arm_iommu_map_page(struct domain *d, 
dfn_t dfn, mfn_t mfn,

      t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;

+    if ( page_get_owner(mfn_to_page(mfn)) == d )
+        return 0;
+
      /*
       * The function guest_physmap_add_entry replaces the current mapping
       * if there is already one...
@@ -71,6 +74,9 @@ int __must_check arm_iommu_unmap_page(struct domain 
*d, dfn_t dfn,
      if ( !is_domain_direct_mapped(d) )
          return -EINVAL;

+    if ( page_get_owner(mfn_to_page(mfn)) == d )
+        return 0;
+
      return guest_physmap_remove_page(d, _gfn(dfn_x(dfn)), 
_mfn(dfn_x(dfn)), 0);
  }

Cheers,
Rahul Singh Feb. 11, 2021, 4:05 p.m. UTC | #23
Hello Julien,

> On 11 Feb 2021, at 1:52 pm, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 11/02/2021 13:20, Rahul Singh wrote:
>> Hello Julien,
> 
> Hi Rahul,
> 
>>> On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
>>> 
>>> 
>>> 
>>> On 10/02/2021 18:08, Rahul Singh wrote:
>>>> Hello Julien,
>>>>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> On 10/02/2021 15:06, Rahul Singh wrote:
>>>>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>> 
>>>>>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>>>> 
>>>>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>>>>>> The offending chunk is:
>>>>>>>>> 
>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>> 
>>>>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>>>>>> did, but the new check is always false.
>>>>>>>>> 
>>>>>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>>>>>> need_sync is set as:
>>>>>>>>> 
>>>>>>>>>   if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>>>>>       hd->need_sync = !iommu_use_hap_pt(d);
>>>>>>>>> 
>>>>>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>>>>>> which is wrong.
>>>>>>>>> 
>>>>>>>>> As a consequence, when using PV network from a domU on a system where
>>>>>>>>> IOMMU is on from Dom0, I get:
>>>>>>>>> 
>>>>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>>>>>> 
>>>>>>>>> The fix is to go back to something along the lines of the old
>>>>>>>>> implementation of gnttab_need_iommu_mapping.
>>>>>>>>> 
>>>>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>>>>>> Fixes: 91d4eca7add
>>>>>>>>> Backport: 4.12+
>>>>>>>>> 
>>>>>>>>> ---
>>>>>>>>> 
>>>>>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>>>>>> 2020.
>>>>>>>>> 
>>>>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>>>>>> 
>>>>>>>>> Changes in v2:
>>>>>>>>> - improve commit message
>>>>>>>>> - add is_iommu_enabled(d) to the check
>>>>>>>>> ---
>>>>>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>> 
>>>>>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>>>>>    (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>>>>>> 
>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>>>>>> 
>>>>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>>>>>> 
>>>>>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>>>>>> 
>>>>>>> So you are creating a guest with "xl create" in dom0 and you see the
>>>>>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>>>>>> virtual "disk" of some sort.
>>>>>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>>>>>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>>>>>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>>>>>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>>>>>> because of that issue is observed.
>>>>> 
>>>>> Can you clarify what you mean by "written wrong"? What sort of error do you see in the iommu_unmap()?
>>>> I might be wrong as per my understanding for ARM we are sharing the P2M between CPU and IOMMU always and the map_grant_ref() function is written in such a way that we have to call iommu_legacy_{, un}map() only if P2M is not shared.
>>> 
>>> map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() returns true. I don't really see where this is assuming the P2M is not shared.
>>> 
>>> In fact, on x86, this will always be false for HVM domain (they support both shared and separate page-tables).
>>> 
>>>> As we are sharing the P2M when we call the iommu_map() function it will overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry from the page-table.
>>> AFAIK, there should be nothing mapped at that GFN because the page belongs to the guest. At worse, we would overwrite a mapping that is the same.
>> > Sorry I should have mention before backend/frontend is dom0 in this 
> case and GFN is mapped. I am trying to attach the block device to DOM0
> 
> Ah, your log makes a lot more sense now. Thank you for the clarification!
> 
> So yes, I agree that iommu_{,un}map() will do the wrong thing if the frontend and backend in the same domain.
> 
> I don't know what the state in Linux, but from Xen PoV it should be possible to have the backend/frontend in the same domain.
> 
> I think we want to ignore the IOMMU mapping request when the domain is the same. Can you try this small untested patch:

I tested the patch and it is working fine for both dom0/domU. I am able to attach the block device to dom0/domu.
Also I didn’t observe the IOMMU fault also for block device that we have behind IOMMU on our system and attached to domU.

Regards,
Rahul 
> 
> diff --git a/xen/drivers/passthrough/arm/iommu_helpers.c b/xen/drivers/passthrough/arm/iommu_helpers.c
> index a36e2b8e6c42..7bad13593146 100644
> --- a/xen/drivers/passthrough/arm/iommu_helpers.c
> +++ b/xen/drivers/passthrough/arm/iommu_helpers.c
> @@ -53,6 +53,9 @@ int __must_check arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
> 
>     t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;
> 
> +    if ( page_get_owner(mfn_to_page(mfn)) == d )
> +        return 0;
> +
>     /*
>      * The function guest_physmap_add_entry replaces the current mapping
>      * if there is already one...
> @@ -71,6 +74,9 @@ int __must_check arm_iommu_unmap_page(struct domain *d, dfn_t dfn,
>     if ( !is_domain_direct_mapped(d) )
>         return -EINVAL;
> 
> +    if ( page_get_owner(mfn_to_page(mfn)) == d )
> +        return 0;
> +
>     return guest_physmap_remove_page(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0);
> }
> 
> Cheers,
> 
> -- 
> Julien Grall
Stefano Stabellini Feb. 11, 2021, 8:55 p.m. UTC | #24
On Thu, 11 Feb 2021, Julien Grall wrote:
> On 11/02/2021 13:20, Rahul Singh wrote:
> > > On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
> > > On 10/02/2021 18:08, Rahul Singh wrote:
> > > > Hello Julien,
> > > > > On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
> > > > > On 10/02/2021 15:06, Rahul Singh wrote:
> > > > > > > On 9 Feb 2021, at 8:36 pm, Stefano Stabellini
> > > > > > > <sstabellini@kernel.org> wrote:
> > > > > > > 
> > > > > > > On Tue, 9 Feb 2021, Rahul Singh wrote:
> > > > > > > > > On 8 Feb 2021, at 6:49 pm, Stefano Stabellini
> > > > > > > > > <sstabellini@kernel.org> wrote:
> > > > > > > > > 
> > > > > > > > > Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> > > > > > > > > The offending chunk is:
> > > > > > > > > 
> > > > > > > > > #define gnttab_need_iommu_mapping(d)                    \
> > > > > > > > > -    (is_domain_direct_mapped(d) && need_iommu(d))
> > > > > > > > > +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > > > > > > > 
> > > > > > > > > On ARM we need gnttab_need_iommu_mapping to be true for dom0
> > > > > > > > > when it is
> > > > > > > > > directly mapped and IOMMU is enabled for the domain, like the
> > > > > > > > > old check
> > > > > > > > > did, but the new check is always false.
> > > > > > > > > 
> > > > > > > > > In fact, need_iommu_pt_sync is defined as
> > > > > > > > > dom_iommu(d)->need_sync and
> > > > > > > > > need_sync is set as:
> > > > > > > > > 
> > > > > > > > >    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> > > > > > > > >        hd->need_sync = !iommu_use_hap_pt(d);
> > > > > > > > > 
> > > > > > > > > iommu_use_hap_pt(d) means that the page-table used by the
> > > > > > > > > IOMMU is the
> > > > > > > > > P2M. It is true on ARM. need_sync means that you have a
> > > > > > > > > separate IOMMU
> > > > > > > > > page-table and it needs to be updated for every change.
> > > > > > > > > need_sync is set
> > > > > > > > > to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false
> > > > > > > > > too,
> > > > > > > > > which is wrong.
> > > > > > > > > 
> > > > > > > > > As a consequence, when using PV network from a domU on a
> > > > > > > > > system where
> > > > > > > > > IOMMU is on from Dom0, I get:
> > > > > > > > > 
> > > > > > > > > (XEN) smmu: /smmu@fd800000: Unhandled context fault:
> > > > > > > > > fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> > > > > > > > > [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error:
> > > > > > > > > HRESP not OK
> > > > > > > > > 
> > > > > > > > > The fix is to go back to something along the lines of the old
> > > > > > > > > implementation of gnttab_need_iommu_mapping.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Stefano Stabellini
> > > > > > > > > <stefano.stabellini@xilinx.com>
> > > > > > > > > Fixes: 91d4eca7add
> > > > > > > > > Backport: 4.12+
> > > > > > > > > 
> > > > > > > > > ---
> > > > > > > > > 
> > > > > > > > > Given the severity of the bug, I would like to request this
> > > > > > > > > patch to be
> > > > > > > > > backported to 4.12 too, even if 4.12 is security-fixes only
> > > > > > > > > since Oct
> > > > > > > > > 2020.
> > > > > > > > > 
> > > > > > > > > For the 4.12 backport, we can use iommu_enabled() instead of
> > > > > > > > > is_iommu_enabled() in the implementation of
> > > > > > > > > gnttab_need_iommu_mapping.
> > > > > > > > > 
> > > > > > > > > Changes in v2:
> > > > > > > > > - improve commit message
> > > > > > > > > - add is_iommu_enabled(d) to the check
> > > > > > > > > ---
> > > > > > > > > xen/include/asm-arm/grant_table.h | 2 +-
> > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/xen/include/asm-arm/grant_table.h
> > > > > > > > > b/xen/include/asm-arm/grant_table.h
> > > > > > > > > index 6f585b1538..0ce77f9a1c 100644
> > > > > > > > > --- a/xen/include/asm-arm/grant_table.h
> > > > > > > > > +++ b/xen/include/asm-arm/grant_table.h
> > > > > > > > > @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long
> > > > > > > > > gpaddr, mfn_t mfn,
> > > > > > > > >     (((i) >= nr_status_frames(t)) ? INVALID_GFN :
> > > > > > > > > (t)->arch.status_gfn[i])
> > > > > > > > > 
> > > > > > > > > #define gnttab_need_iommu_mapping(d)                    \
> > > > > > > > > -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > > > > > > > +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
> > > > > > > > > 
> > > > > > > > > #endif /* __ASM_GRANT_TABLE_H__ */
> > > > > > > > 
> > > > > > > > I tested the patch and while creating the guest I observed the
> > > > > > > > below warning from Linux for block device.
> > > > > > > > https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
> > > > > > > 
> > > > > > > So you are creating a guest with "xl create" in dom0 and you see
> > > > > > > the
> > > > > > > warnings below printed by the Dom0 kernel? I imagine the domU has
> > > > > > > a
> > > > > > > virtual "disk" of some sort.
> > > > > > Yes you are right I am trying to create the guest with "xl create”
> > > > > > and before that, I created the logical volume and trying to attach
> > > > > > the logical volume
> > > > > > block to the domain with “xl block-attach”. I observed this error
> > > > > > with the "xl block-attach” command.
> > > > > > This issue occurs after applying this patch as what I observed this
> > > > > > patch introduce the calls to iommu_legacy_{, un}map() to map the
> > > > > > grant pages for
> > > > > > IOMMU that touches the page-tables. I am not sure but what I
> > > > > > observed is that something is written wrong when iomm_unmap calls
> > > > > > unmap the pages
> > > > > > because of that issue is observed.
> > > > > 
> > > > > Can you clarify what you mean by "written wrong"? What sort of error
> > > > > do you see in the iommu_unmap()?
> > > > I might be wrong as per my understanding for ARM we are sharing the P2M
> > > > between CPU and IOMMU always and the map_grant_ref() function is written
> > > > in such a way that we have to call iommu_legacy_{, un}map() only if P2M
> > > > is not shared.
> > > 
> > > map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() returns
> > > true. I don't really see where this is assuming the P2M is not shared.
> > > 
> > > In fact, on x86, this will always be false for HVM domain (they support
> > > both shared and separate page-tables).
> > > 
> > > > As we are sharing the P2M when we call the iommu_map() function it will
> > > > overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry
> > > > and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry
> > > > from the page-table.
> > > AFAIK, there should be nothing mapped at that GFN because the page belongs
> > > to the guest. At worse, we would overwrite a mapping that is the same.
> >  > Sorry I should have mention before backend/frontend is dom0 in this 
> case and GFN is mapped. I am trying to attach the block device to DOM0
> 
> Ah, your log makes a lot more sense now. Thank you for the clarification!
> 
> So yes, I agree that iommu_{,un}map() will do the wrong thing if the frontend
> and backend in the same domain.
> 
> I don't know what the state in Linux, but from Xen PoV it should be possible
> to have the backend/frontend in the same domain.
> 
> I think we want to ignore the IOMMU mapping request when the domain is the
> same. Can you try this small untested patch:

Given that all the pages already owned by the domain should already be
in the shared pagetable between MMU and IOMMU, there is no need to
create a second mapping. In fact it is guaranteed to overlap with an
existing mapping.

In theory, if guest_physmap_add_entry returned -EEXIST if a mapping
identical to the one we want to add is already in the pagetable, in this
instance we would see -EEXIST being returned.

Based on that, I cannot think of unwanted side-effects for this patch.
It looks OK to me.

Given that it solves a different issue, I think it should be a separate
patch from [1]. Julien, are you OK with that or would you rather merge
the two?

[1] https://marc.info/?l=xen-devel&m=161281017406202


> diff --git a/xen/drivers/passthrough/arm/iommu_helpers.c
> b/xen/drivers/passthrough/arm/iommu_helpers.c
> index a36e2b8e6c42..7bad13593146 100644
> --- a/xen/drivers/passthrough/arm/iommu_helpers.c
> +++ b/xen/drivers/passthrough/arm/iommu_helpers.c
> @@ -53,6 +53,9 @@ int __must_check arm_iommu_map_page(struct domain *d, dfn_t
> dfn, mfn_t mfn,
> 
>      t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;
> 
> +    if ( page_get_owner(mfn_to_page(mfn)) == d )
> +        return 0;
> +
>      /*
>       * The function guest_physmap_add_entry replaces the current mapping
>       * if there is already one...
> @@ -71,6 +74,9 @@ int __must_check arm_iommu_unmap_page(struct domain *d,
> dfn_t dfn,
>      if ( !is_domain_direct_mapped(d) )
>          return -EINVAL;
> 
> +    if ( page_get_owner(mfn_to_page(mfn)) == d )
> +        return 0;
> +
>      return guest_physmap_remove_page(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)),
> 0);
>  }
Julien Grall Feb. 12, 2021, 1:30 p.m. UTC | #25
Hi Stefano,

On 11/02/2021 20:55, Stefano Stabellini wrote:
> On Thu, 11 Feb 2021, Julien Grall wrote:
>> On 11/02/2021 13:20, Rahul Singh wrote:
>>>> On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
>>>> On 10/02/2021 18:08, Rahul Singh wrote:
>>>>> Hello Julien,
>>>>>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
>>>>>> On 10/02/2021 15:06, Rahul Singh wrote:
>>>>>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini
>>>>>>>> <sstabellini@kernel.org> wrote:
>>>>>>>>
>>>>>>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini
>>>>>>>>>> <sstabellini@kernel.org> wrote:
>>>>>>>>>>
>>>>>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>>>>>>> The offending chunk is:
>>>>>>>>>>
>>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>>>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>>>
>>>>>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0
>>>>>>>>>> when it is
>>>>>>>>>> directly mapped and IOMMU is enabled for the domain, like the
>>>>>>>>>> old check
>>>>>>>>>> did, but the new check is always false.
>>>>>>>>>>
>>>>>>>>>> In fact, need_iommu_pt_sync is defined as
>>>>>>>>>> dom_iommu(d)->need_sync and
>>>>>>>>>> need_sync is set as:
>>>>>>>>>>
>>>>>>>>>>     if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>>>>>>         hd->need_sync = !iommu_use_hap_pt(d);
>>>>>>>>>>
>>>>>>>>>> iommu_use_hap_pt(d) means that the page-table used by the
>>>>>>>>>> IOMMU is the
>>>>>>>>>> P2M. It is true on ARM. need_sync means that you have a
>>>>>>>>>> separate IOMMU
>>>>>>>>>> page-table and it needs to be updated for every change.
>>>>>>>>>> need_sync is set
>>>>>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false
>>>>>>>>>> too,
>>>>>>>>>> which is wrong.
>>>>>>>>>>
>>>>>>>>>> As a consequence, when using PV network from a domU on a
>>>>>>>>>> system where
>>>>>>>>>> IOMMU is on from Dom0, I get:
>>>>>>>>>>
>>>>>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault:
>>>>>>>>>> fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>>>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error:
>>>>>>>>>> HRESP not OK
>>>>>>>>>>
>>>>>>>>>> The fix is to go back to something along the lines of the old
>>>>>>>>>> implementation of gnttab_need_iommu_mapping.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Stefano Stabellini
>>>>>>>>>> <stefano.stabellini@xilinx.com>
>>>>>>>>>> Fixes: 91d4eca7add
>>>>>>>>>> Backport: 4.12+
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> Given the severity of the bug, I would like to request this
>>>>>>>>>> patch to be
>>>>>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only
>>>>>>>>>> since Oct
>>>>>>>>>> 2020.
>>>>>>>>>>
>>>>>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>>>>>>> is_iommu_enabled() in the implementation of
>>>>>>>>>> gnttab_need_iommu_mapping.
>>>>>>>>>>
>>>>>>>>>> Changes in v2:
>>>>>>>>>> - improve commit message
>>>>>>>>>> - add is_iommu_enabled(d) to the check
>>>>>>>>>> ---
>>>>>>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/xen/include/asm-arm/grant_table.h
>>>>>>>>>> b/xen/include/asm-arm/grant_table.h
>>>>>>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long
>>>>>>>>>> gpaddr, mfn_t mfn,
>>>>>>>>>>      (((i) >= nr_status_frames(t)) ? INVALID_GFN :
>>>>>>>>>> (t)->arch.status_gfn[i])
>>>>>>>>>>
>>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>>>>>>>
>>>>>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>>>>>>>
>>>>>>>>> I tested the patch and while creating the guest I observed the
>>>>>>>>> below warning from Linux for block device.
>>>>>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>>>>>>>
>>>>>>>> So you are creating a guest with "xl create" in dom0 and you see
>>>>>>>> the
>>>>>>>> warnings below printed by the Dom0 kernel? I imagine the domU has
>>>>>>>> a
>>>>>>>> virtual "disk" of some sort.
>>>>>>> Yes you are right I am trying to create the guest with "xl create”
>>>>>>> and before that, I created the logical volume and trying to attach
>>>>>>> the logical volume
>>>>>>> block to the domain with “xl block-attach”. I observed this error
>>>>>>> with the "xl block-attach” command.
>>>>>>> This issue occurs after applying this patch as what I observed this
>>>>>>> patch introduce the calls to iommu_legacy_{, un}map() to map the
>>>>>>> grant pages for
>>>>>>> IOMMU that touches the page-tables. I am not sure but what I
>>>>>>> observed is that something is written wrong when iomm_unmap calls
>>>>>>> unmap the pages
>>>>>>> because of that issue is observed.
>>>>>>
>>>>>> Can you clarify what you mean by "written wrong"? What sort of error
>>>>>> do you see in the iommu_unmap()?
>>>>> I might be wrong as per my understanding for ARM we are sharing the P2M
>>>>> between CPU and IOMMU always and the map_grant_ref() function is written
>>>>> in such a way that we have to call iommu_legacy_{, un}map() only if P2M
>>>>> is not shared.
>>>>
>>>> map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() returns
>>>> true. I don't really see where this is assuming the P2M is not shared.
>>>>
>>>> In fact, on x86, this will always be false for HVM domain (they support
>>>> both shared and separate page-tables).
>>>>
>>>>> As we are sharing the P2M when we call the iommu_map() function it will
>>>>> overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry
>>>>> and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry
>>>>> from the page-table.
>>>> AFAIK, there should be nothing mapped at that GFN because the page belongs
>>>> to the guest. At worse, we would overwrite a mapping that is the same.
>>>   > Sorry I should have mention before backend/frontend is dom0 in this
>> case and GFN is mapped. I am trying to attach the block device to DOM0
>>
>> Ah, your log makes a lot more sense now. Thank you for the clarification!
>>
>> So yes, I agree that iommu_{,un}map() will do the wrong thing if the frontend
>> and backend in the same domain.
>>
>> I don't know what the state in Linux, but from Xen PoV it should be possible
>> to have the backend/frontend in the same domain.
>>
>> I think we want to ignore the IOMMU mapping request when the domain is the
>> same. Can you try this small untested patch:
> 
> Given that all the pages already owned by the domain should already be
> in the shared pagetable between MMU and IOMMU, there is no need to
> create a second mapping. In fact it is guaranteed to overlap with an
> existing mapping.

It is **almost** guaranteed :). I can see a few reasons for this to not 
be valid:
    - Using the domain shared info in a grant
    - With a good timing, it would be possible that a differente vCPU 
remove the mapping after the P2M walk

That said, I feel it is not an expected behavior for a domain guest. So 
it is not something we should care at least for now.

> In theory, if guest_physmap_add_entry returned -EEXIST if a mapping
> identical to the one we want to add is already in the pagetable, in this
> instance we would see -EEXIST being returned.

While I agree that the GFN and MFN would be the same, there mapping 
still not be identical because the P2M type (and potentially the 
permission) will differ.

However, guest_physmap_add_entry() doesn't do such check today. It will 
just happily replace any mapping. It would be good to harden the code 
P2M as this is not the first time we see report of mapping overwritten.

I actually have a task in my todo list but I never got the chance to 
spend time on it.

> 
> Based on that, I cannot think of unwanted side-effects for this patch.
> It looks OK to me.
> 
> Given that it solves a different issue, I think it should be a separate
> patch from [1]. Julien, are you OK with that or would you rather merge
> the two?

They are two distinct issues. In fact, the bug has always been present 
on Arm. I will send a separate patch.

Cheers,
Stefano Stabellini Feb. 12, 2021, 6:56 p.m. UTC | #26
On Fri, 12 Feb 2021, Julien Grall wrote:
> On 11/02/2021 20:55, Stefano Stabellini wrote:
> > On Thu, 11 Feb 2021, Julien Grall wrote:
> > > On 11/02/2021 13:20, Rahul Singh wrote:
> > > > > On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
> > > > > On 10/02/2021 18:08, Rahul Singh wrote:
> > > > > > Hello Julien,
> > > > > > > On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
> > > > > > > On 10/02/2021 15:06, Rahul Singh wrote:
> > > > > > > > > On 9 Feb 2021, at 8:36 pm, Stefano Stabellini
> > > > > > > > > <sstabellini@kernel.org> wrote:
> > > > > > > > > 
> > > > > > > > > On Tue, 9 Feb 2021, Rahul Singh wrote:
> > > > > > > > > > > On 8 Feb 2021, at 6:49 pm, Stefano Stabellini
> > > > > > > > > > > <sstabellini@kernel.org> wrote:
> > > > > > > > > > > 
> > > > > > > > > > > Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> > > > > > > > > > > The offending chunk is:
> > > > > > > > > > > 
> > > > > > > > > > > #define gnttab_need_iommu_mapping(d)                    \
> > > > > > > > > > > -    (is_domain_direct_mapped(d) && need_iommu(d))
> > > > > > > > > > > +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > > > > > > > > > 
> > > > > > > > > > > On ARM we need gnttab_need_iommu_mapping to be true for
> > > > > > > > > > > dom0
> > > > > > > > > > > when it is
> > > > > > > > > > > directly mapped and IOMMU is enabled for the domain, like
> > > > > > > > > > > the
> > > > > > > > > > > old check
> > > > > > > > > > > did, but the new check is always false.
> > > > > > > > > > > 
> > > > > > > > > > > In fact, need_iommu_pt_sync is defined as
> > > > > > > > > > > dom_iommu(d)->need_sync and
> > > > > > > > > > > need_sync is set as:
> > > > > > > > > > > 
> > > > > > > > > > >     if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> > > > > > > > > > >         hd->need_sync = !iommu_use_hap_pt(d);
> > > > > > > > > > > 
> > > > > > > > > > > iommu_use_hap_pt(d) means that the page-table used by the
> > > > > > > > > > > IOMMU is the
> > > > > > > > > > > P2M. It is true on ARM. need_sync means that you have a
> > > > > > > > > > > separate IOMMU
> > > > > > > > > > > page-table and it needs to be updated for every change.
> > > > > > > > > > > need_sync is set
> > > > > > > > > > > to false on ARM. Hence, gnttab_need_iommu_mapping(d) is
> > > > > > > > > > > false
> > > > > > > > > > > too,
> > > > > > > > > > > which is wrong.
> > > > > > > > > > > 
> > > > > > > > > > > As a consequence, when using PV network from a domU on a
> > > > > > > > > > > system where
> > > > > > > > > > > IOMMU is on from Dom0, I get:
> > > > > > > > > > > 
> > > > > > > > > > > (XEN) smmu: /smmu@fd800000: Unhandled context fault:
> > > > > > > > > > > fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> > > > > > > > > > > [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error:
> > > > > > > > > > > HRESP not OK
> > > > > > > > > > > 
> > > > > > > > > > > The fix is to go back to something along the lines of the
> > > > > > > > > > > old
> > > > > > > > > > > implementation of gnttab_need_iommu_mapping.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Stefano Stabellini
> > > > > > > > > > > <stefano.stabellini@xilinx.com>
> > > > > > > > > > > Fixes: 91d4eca7add
> > > > > > > > > > > Backport: 4.12+
> > > > > > > > > > > 
> > > > > > > > > > > ---
> > > > > > > > > > > 
> > > > > > > > > > > Given the severity of the bug, I would like to request
> > > > > > > > > > > this
> > > > > > > > > > > patch to be
> > > > > > > > > > > backported to 4.12 too, even if 4.12 is security-fixes
> > > > > > > > > > > only
> > > > > > > > > > > since Oct
> > > > > > > > > > > 2020.
> > > > > > > > > > > 
> > > > > > > > > > > For the 4.12 backport, we can use iommu_enabled() instead
> > > > > > > > > > > of
> > > > > > > > > > > is_iommu_enabled() in the implementation of
> > > > > > > > > > > gnttab_need_iommu_mapping.
> > > > > > > > > > > 
> > > > > > > > > > > Changes in v2:
> > > > > > > > > > > - improve commit message
> > > > > > > > > > > - add is_iommu_enabled(d) to the check
> > > > > > > > > > > ---
> > > > > > > > > > > xen/include/asm-arm/grant_table.h | 2 +-
> > > > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > b/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > index 6f585b1538..0ce77f9a1c 100644
> > > > > > > > > > > --- a/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > +++ b/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned
> > > > > > > > > > > long
> > > > > > > > > > > gpaddr, mfn_t mfn,
> > > > > > > > > > >      (((i) >= nr_status_frames(t)) ? INVALID_GFN :
> > > > > > > > > > > (t)->arch.status_gfn[i])
> > > > > > > > > > > 
> > > > > > > > > > > #define gnttab_need_iommu_mapping(d)                    \
> > > > > > > > > > > -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > > > > > > > > > +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
> > > > > > > > > > > 
> > > > > > > > > > > #endif /* __ASM_GRANT_TABLE_H__ */
> > > > > > > > > > 
> > > > > > > > > > I tested the patch and while creating the guest I observed
> > > > > > > > > > the
> > > > > > > > > > below warning from Linux for block device.
> > > > > > > > > > https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
> > > > > > > > > 
> > > > > > > > > So you are creating a guest with "xl create" in dom0 and you
> > > > > > > > > see
> > > > > > > > > the
> > > > > > > > > warnings below printed by the Dom0 kernel? I imagine the domU
> > > > > > > > > has
> > > > > > > > > a
> > > > > > > > > virtual "disk" of some sort.
> > > > > > > > Yes you are right I am trying to create the guest with "xl
> > > > > > > > create”
> > > > > > > > and before that, I created the logical volume and trying to
> > > > > > > > attach
> > > > > > > > the logical volume
> > > > > > > > block to the domain with “xl block-attach”. I observed this
> > > > > > > > error
> > > > > > > > with the "xl block-attach” command.
> > > > > > > > This issue occurs after applying this patch as what I observed
> > > > > > > > this
> > > > > > > > patch introduce the calls to iommu_legacy_{, un}map() to map the
> > > > > > > > grant pages for
> > > > > > > > IOMMU that touches the page-tables. I am not sure but what I
> > > > > > > > observed is that something is written wrong when iomm_unmap
> > > > > > > > calls
> > > > > > > > unmap the pages
> > > > > > > > because of that issue is observed.
> > > > > > > 
> > > > > > > Can you clarify what you mean by "written wrong"? What sort of
> > > > > > > error
> > > > > > > do you see in the iommu_unmap()?
> > > > > > I might be wrong as per my understanding for ARM we are sharing the
> > > > > > P2M
> > > > > > between CPU and IOMMU always and the map_grant_ref() function is
> > > > > > written
> > > > > > in such a way that we have to call iommu_legacy_{, un}map() only if
> > > > > > P2M
> > > > > > is not shared.
> > > > > 
> > > > > map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping()
> > > > > returns
> > > > > true. I don't really see where this is assuming the P2M is not shared.
> > > > > 
> > > > > In fact, on x86, this will always be false for HVM domain (they
> > > > > support
> > > > > both shared and separate page-tables).
> > > > > 
> > > > > > As we are sharing the P2M when we call the iommu_map() function it
> > > > > > will
> > > > > > overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN)
> > > > > > entry
> > > > > > and when we call iommu_unmap() it will unmap the  (GFN -> MFN )
> > > > > > entry
> > > > > > from the page-table.
> > > > > AFAIK, there should be nothing mapped at that GFN because the page
> > > > > belongs
> > > > > to the guest. At worse, we would overwrite a mapping that is the same.
> > > >   > Sorry I should have mention before backend/frontend is dom0 in this
> > > case and GFN is mapped. I am trying to attach the block device to DOM0
> > > 
> > > Ah, your log makes a lot more sense now. Thank you for the clarification!
> > > 
> > > So yes, I agree that iommu_{,un}map() will do the wrong thing if the
> > > frontend
> > > and backend in the same domain.
> > > 
> > > I don't know what the state in Linux, but from Xen PoV it should be
> > > possible
> > > to have the backend/frontend in the same domain.
> > > 
> > > I think we want to ignore the IOMMU mapping request when the domain is the
> > > same. Can you try this small untested patch:
> > 
> > Given that all the pages already owned by the domain should already be
> > in the shared pagetable between MMU and IOMMU, there is no need to
> > create a second mapping. In fact it is guaranteed to overlap with an
> > existing mapping.
> 
> It is **almost** guaranteed :). I can see a few reasons for this to not be
> valid:
>    - Using the domain shared info in a grant
>    - With a good timing, it would be possible that a differente vCPU remove
> the mapping after the P2M walk
> 
> That said, I feel it is not an expected behavior for a domain guest. So it is
> not something we should care at least for now.
> 
> > In theory, if guest_physmap_add_entry returned -EEXIST if a mapping
> > identical to the one we want to add is already in the pagetable, in this
> > instance we would see -EEXIST being returned.
> 
> While I agree that the GFN and MFN would be the same, there mapping still not
> be identical because the P2M type (and potentially the permission) will
> differ.
> 
> However, guest_physmap_add_entry() doesn't do such check today. It will just
> happily replace any mapping. It would be good to harden the code P2M as this
> is not the first time we see report of mapping overwritten.
> 
> I actually have a task in my todo list but I never got the chance to spend
> time on it.
> 
> > 
> > Based on that, I cannot think of unwanted side-effects for this patch.
> > It looks OK to me.
> > 
> > Given that it solves a different issue, I think it should be a separate
> > patch from [1]. Julien, are you OK with that or would you rather merge
> > the two?
> 
> They are two distinct issues. In fact, the bug has always been present on Arm.
> I will send a separate patch.

Excellent, thank you!
Julien Grall Feb. 14, 2021, 2:32 p.m. UTC | #27
Hi Rahul,

On 11/02/2021 16:05, Rahul Singh wrote:
>> On 11 Feb 2021, at 1:52 pm, Julien Grall <julien@xen.org> wrote:
>>
>>
>>
>> On 11/02/2021 13:20, Rahul Singh wrote:
>>> Hello Julien,
>>
>> Hi Rahul,
>>
>>>> On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
>>>>
>>>>
>>>>
>>>> On 10/02/2021 18:08, Rahul Singh wrote:
>>>>> Hello Julien,
>>>>>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 10/02/2021 15:06, Rahul Singh wrote:
>>>>>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>>>
>>>>>>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>>>>>
>>>>>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>>>>>>> The offending chunk is:
>>>>>>>>>>
>>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>>>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>>>
>>>>>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>>>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>>>>>>> did, but the new check is always false.
>>>>>>>>>>
>>>>>>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>>>>>>> need_sync is set as:
>>>>>>>>>>
>>>>>>>>>>    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>>>>>>        hd->need_sync = !iommu_use_hap_pt(d);
>>>>>>>>>>
>>>>>>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>>>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>>>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>>>>>>> which is wrong.
>>>>>>>>>>
>>>>>>>>>> As a consequence, when using PV network from a domU on a system where
>>>>>>>>>> IOMMU is on from Dom0, I get:
>>>>>>>>>>
>>>>>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>>>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>>>>>>>
>>>>>>>>>> The fix is to go back to something along the lines of the old
>>>>>>>>>> implementation of gnttab_need_iommu_mapping.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>>>>>>> Fixes: 91d4eca7add
>>>>>>>>>> Backport: 4.12+
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>>>>>>> 2020.
>>>>>>>>>>
>>>>>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>>>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>>>>>>>
>>>>>>>>>> Changes in v2:
>>>>>>>>>> - improve commit message
>>>>>>>>>> - add is_iommu_enabled(d) to the check
>>>>>>>>>> ---
>>>>>>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>>>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>>>>>>     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>>>>>>>
>>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>>>>>>>
>>>>>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>>>>>>>
>>>>>>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>>>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>>>>>>>
>>>>>>>> So you are creating a guest with "xl create" in dom0 and you see the
>>>>>>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>>>>>>> virtual "disk" of some sort.
>>>>>>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>>>>>>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>>>>>>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>>>>>>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>>>>>>> because of that issue is observed.
>>>>>>
>>>>>> Can you clarify what you mean by "written wrong"? What sort of error do you see in the iommu_unmap()?
>>>>> I might be wrong as per my understanding for ARM we are sharing the P2M between CPU and IOMMU always and the map_grant_ref() function is written in such a way that we have to call iommu_legacy_{, un}map() only if P2M is not shared.
>>>>
>>>> map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() returns true. I don't really see where this is assuming the P2M is not shared.
>>>>
>>>> In fact, on x86, this will always be false for HVM domain (they support both shared and separate page-tables).
>>>>
>>>>> As we are sharing the P2M when we call the iommu_map() function it will overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry from the page-table.
>>>> AFAIK, there should be nothing mapped at that GFN because the page belongs to the guest. At worse, we would overwrite a mapping that is the same.
>>>> Sorry I should have mention before backend/frontend is dom0 in this
>> case and GFN is mapped. I am trying to attach the block device to DOM0
>>
>> Ah, your log makes a lot more sense now. Thank you for the clarification!
>>
>> So yes, I agree that iommu_{,un}map() will do the wrong thing if the frontend and backend in the same domain.
>>
>> I don't know what the state in Linux, but from Xen PoV it should be possible to have the backend/frontend in the same domain.
>>
>> I think we want to ignore the IOMMU mapping request when the domain is the same. Can you try this small untested patch:
> 
> I tested the patch and it is working fine for both dom0/domU. I am able to attach the block device to dom0/domu.
> Also I didn’t observe the IOMMU fault also for block device that we have behind IOMMU on our system and attached to domU.

Thanks for the testing. I noticed that my patch doesn't build because 
arm_iommu_unmap_page() doesn't have a parameter mfn. Can you confirm 
whether you had to replace mfn with _mfn(dfn_x(dfn))?

Cheers,
Rahul Singh Feb. 15, 2021, 9:12 a.m. UTC | #28
Hello Julien,

> On 14 Feb 2021, at 2:32 pm, Julien Grall <julien@xen.org> wrote:
> 
> Hi Rahul,
> 
> On 11/02/2021 16:05, Rahul Singh wrote:
>>> On 11 Feb 2021, at 1:52 pm, Julien Grall <julien@xen.org> wrote:
>>> 
>>> 
>>> 
>>> On 11/02/2021 13:20, Rahul Singh wrote:
>>>> Hello Julien,
>>> 
>>> Hi Rahul,
>>> 
>>>>> On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 10/02/2021 18:08, Rahul Singh wrote:
>>>>>> Hello Julien,
>>>>>>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> On 10/02/2021 15:06, Rahul Singh wrote:
>>>>>>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>>>> 
>>>>>>>>> On Tue, 9 Feb 2021, Rahul Singh wrote:
>>>>>>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
>>>>>>>>>>> The offending chunk is:
>>>>>>>>>>> 
>>>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu(d))
>>>>>>>>>>> +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>>>> 
>>>>>>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
>>>>>>>>>>> directly mapped and IOMMU is enabled for the domain, like the old check
>>>>>>>>>>> did, but the new check is always false.
>>>>>>>>>>> 
>>>>>>>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
>>>>>>>>>>> need_sync is set as:
>>>>>>>>>>> 
>>>>>>>>>>>   if ( !is_hardware_domain(d) || iommu_hwdom_strict )
>>>>>>>>>>>       hd->need_sync = !iommu_use_hap_pt(d);
>>>>>>>>>>> 
>>>>>>>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
>>>>>>>>>>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU
>>>>>>>>>>> page-table and it needs to be updated for every change. need_sync is set
>>>>>>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
>>>>>>>>>>> which is wrong.
>>>>>>>>>>> 
>>>>>>>>>>> As a consequence, when using PV network from a domU on a system where
>>>>>>>>>>> IOMMU is on from Dom0, I get:
>>>>>>>>>>> 
>>>>>>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
>>>>>>>>>>> [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>>>>>>>>> 
>>>>>>>>>>> The fix is to go back to something along the lines of the old
>>>>>>>>>>> implementation of gnttab_need_iommu_mapping.
>>>>>>>>>>> 
>>>>>>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>>>>>>>> Fixes: 91d4eca7add
>>>>>>>>>>> Backport: 4.12+
>>>>>>>>>>> 
>>>>>>>>>>> ---
>>>>>>>>>>> 
>>>>>>>>>>> Given the severity of the bug, I would like to request this patch to be
>>>>>>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct
>>>>>>>>>>> 2020.
>>>>>>>>>>> 
>>>>>>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of
>>>>>>>>>>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping.
>>>>>>>>>>> 
>>>>>>>>>>> Changes in v2:
>>>>>>>>>>> - improve commit message
>>>>>>>>>>> - add is_iommu_enabled(d) to the check
>>>>>>>>>>> ---
>>>>>>>>>>> xen/include/asm-arm/grant_table.h | 2 +-
>>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>>> 
>>>>>>>>>>> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
>>>>>>>>>>> index 6f585b1538..0ce77f9a1c 100644
>>>>>>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
>>>>>>>>>>>    (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
>>>>>>>>>>> 
>>>>>>>>>>> #define gnttab_need_iommu_mapping(d)                    \
>>>>>>>>>>> -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
>>>>>>>>>>> +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
>>>>>>>>>>> 
>>>>>>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */
>>>>>>>>>> 
>>>>>>>>>> I tested the patch and while creating the guest I observed the below warning from Linux for block device.
>>>>>>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
>>>>>>>>> 
>>>>>>>>> So you are creating a guest with "xl create" in dom0 and you see the
>>>>>>>>> warnings below printed by the Dom0 kernel? I imagine the domU has a
>>>>>>>>> virtual "disk" of some sort.
>>>>>>>> Yes you are right I am trying to create the guest with "xl create” and before that, I created the logical volume and trying to attach the logical volume
>>>>>>>> block to the domain with “xl block-attach”. I observed this error with the "xl block-attach” command.
>>>>>>>> This issue occurs after applying this patch as what I observed this patch introduce the calls to iommu_legacy_{, un}map() to map the grant pages for
>>>>>>>> IOMMU that touches the page-tables. I am not sure but what I observed is that something is written wrong when iomm_unmap calls unmap the pages
>>>>>>>> because of that issue is observed.
>>>>>>> 
>>>>>>> Can you clarify what you mean by "written wrong"? What sort of error do you see in the iommu_unmap()?
>>>>>> I might be wrong as per my understanding for ARM we are sharing the P2M between CPU and IOMMU always and the map_grant_ref() function is written in such a way that we have to call iommu_legacy_{, un}map() only if P2M is not shared.
>>>>> 
>>>>> map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() returns true. I don't really see where this is assuming the P2M is not shared.
>>>>> 
>>>>> In fact, on x86, this will always be false for HVM domain (they support both shared and separate page-tables).
>>>>> 
>>>>>> As we are sharing the P2M when we call the iommu_map() function it will overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry and when we call iommu_unmap() it will unmap the  (GFN -> MFN ) entry from the page-table.
>>>>> AFAIK, there should be nothing mapped at that GFN because the page belongs to the guest. At worse, we would overwrite a mapping that is the same.
>>>>> Sorry I should have mention before backend/frontend is dom0 in this
>>> case and GFN is mapped. I am trying to attach the block device to DOM0
>>> 
>>> Ah, your log makes a lot more sense now. Thank you for the clarification!
>>> 
>>> So yes, I agree that iommu_{,un}map() will do the wrong thing if the frontend and backend in the same domain.
>>> 
>>> I don't know what the state in Linux, but from Xen PoV it should be possible to have the backend/frontend in the same domain.
>>> 
>>> I think we want to ignore the IOMMU mapping request when the domain is the same. Can you try this small untested patch:
>> I tested the patch and it is working fine for both dom0/domU. I am able to attach the block device to dom0/domu.
>> Also I didn’t observe the IOMMU fault also for block device that we have behind IOMMU on our system and attached to domU.
> 
> Thanks for the testing. I noticed that my patch doesn't build because arm_iommu_unmap_page() doesn't have a parameter mfn. Can you confirm whether you had to replace mfn with _mfn(dfn_x(dfn))?

Yes I have to replace the mfn with _mfn(dfn_x(dfn)) to test the patch.

Regards,
Rahul
> 
> Cheers,
> 
> -- 
> Julien Grall
>
Julien Grall Feb. 15, 2021, 11:34 a.m. UTC | #29
Hi Rahul,

On 15/02/2021 09:12, Rahul Singh wrote:
>> Thanks for the testing. I noticed that my patch doesn't build because arm_iommu_unmap_page() doesn't have a parameter mfn. Can you confirm whether you had to replace mfn with _mfn(dfn_x(dfn))?
> 
> Yes I have to replace the mfn with _mfn(dfn_x(dfn)) to test the patch.

Thanks.

In the future, I would suggest to mention such things in your testing.

This will be easier to figure out how one managed to test it and whether 
the fix was the same.

Cheers,
diff mbox series

Patch

diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
index 6f585b1538..0ce77f9a1c 100644
--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -89,7 +89,7 @@  int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
     (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
 
 #define gnttab_need_iommu_mapping(d)                    \
-    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
+    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
 
 #endif /* __ASM_GRANT_TABLE_H__ */
 /*