[for-4.8] altp2m: don't attempt to unshare pages during change_altp2m_gfn op

Message ID 20161014000047.18762-1-tamas.lengyel@zentific.com (mailing list archive)
State New, archived

Commit Message

Tamas Lengyel Oct. 14, 2016, midnight UTC
Attempting to change gfn mappings with altp2m on a memory shared page results
in a lock-order violation (mm locking order violation: 282 > 254), which
crashes the hypervisor. Don't attempt to automatically unshare such pages and
just fall back to failing the op if the page type is not correct.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/mm/p2m.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

George Dunlap Oct. 20, 2016, 4:18 p.m. UTC | #1
On 14/10/16 01:00, Tamas K Lengyel wrote:
> Attempting to change gfn mappings with altp2m on a memory shared page results
> in a lock-order violation (mm locking order violation: 282 > 254), which
> crashes the hypervisor. Don't attempt to automatically unshare such pages and
> just fall back to failing the op if the page type is not correct.
> 
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>

It would be nice to try to untangle this such that you can reasonably
unshare a page in this circumstance; but given the point in the release
cycle, making it return an error instead of crashing is probably the
right thing to do.

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

This needs a release ack now I think as well.
Wei Liu Oct. 20, 2016, 4:19 p.m. UTC | #2
On Thu, Oct 20, 2016 at 05:18:36PM +0100, George Dunlap wrote:
> On 14/10/16 01:00, Tamas K Lengyel wrote:
> > Attempting to change gfn mappings with altp2m on a memory shared page results
> > in a lock-order violation (mm locking order violation: 282 > 254), which
> > crashes the hypervisor. Don't attempt to automatically unshare such pages and
> > just fall back to failing the op if the page type is not correct.
> > 
> > Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
> 
> It would be nice to try to untangle this such that you can reasonably
> unshare a page in this circumstance; but given the point in the release
> cycle, making it return an error instead of crashing is probably the
> right thing to do.
> 
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
> 
> This needs a release ack now I think as well.
> 
> 

Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Tamas Lengyel Oct. 20, 2016, 4:29 p.m. UTC | #3
On Oct 20, 2016 18:18, "George Dunlap" <george.dunlap@citrix.com> wrote:
>
> On 14/10/16 01:00, Tamas K Lengyel wrote:
> > Attempting to change gfn mappings with altp2m on a memory shared page results
> > in a lock-order violation (mm locking order violation: 282 > 254), which
> > crashes the hypervisor. Don't attempt to automatically unshare such pages and
> > just fall back to failing the op if the page type is not correct.
> >
> > Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>
> It would be nice to try to untangle this such that you can reasonably
> unshare a page in this circumstance; but given the point in the release
> cycle, making it return an error instead of crashing is probably the
> right thing to do.

You can unshare these pages; you just have to do it in a separate op so the
locks are taken in the right order (memshare before altp2m). Reversing the lock
order is not possible because otherwise the automatic unsharing and
propagation during runtime runs into the lock order problem without the
possibility of recovering. This way the user has the option to handle it
gracefully here.

Tamas
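
The ordering rule described above can be illustrated with a small
stand-alone sketch. It is not Xen code: the function names are hypothetical,
and the mapping of the two numbers from the crash message onto "memshare"
(254) and "altp2m" (282) is an assumption inferred from this thread. The
sketch only shows why taking the lower-level lock while the higher-level one
is held is rejected, while doing the unshare as a separate op first is fine.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock levels, borrowed from the numbers in the crash message.
 * Which Xen lock each number really belongs to is an assumption. */
enum { LEVEL_MEMSHARE = 254, LEVEL_ALTP2M = 282 };

/* Highest lock level currently held (single-threaded sketch). */
static int current_level = 0;

/* Take a lock of the given level. Returns false on an ordering violation.
 * Xen treats a violation as fatal; this sketch only reports it. */
static bool ordered_lock(int level, int *saved_level)
{
    if (current_level > level) {
        printf("mm locking order violation: %d > %d\n", current_level, level);
        return false;
    }
    *saved_level = current_level;
    current_level = level;
    return true;
}

/* Release a lock by restoring the level that was held before it. */
static void ordered_unlock(int saved_level)
{
    current_level = saved_level;
}

/* Unsharing from inside the altp2m op: altp2m lock first, then memshare. */
static bool unshare_inside_altp2m_op(void)
{
    int outer, inner;
    if (!ordered_lock(LEVEL_ALTP2M, &outer))
        return false;
    bool ok = ordered_lock(LEVEL_MEMSHARE, &inner);
    if (ok)
        ordered_unlock(inner);
    ordered_unlock(outer);
    return ok;
}

/* Unsharing as a separate op, followed by the altp2m op. */
static bool unshare_then_altp2m_op(void)
{
    int saved;
    if (!ordered_lock(LEVEL_MEMSHARE, &saved))
        return false;
    ordered_unlock(saved);
    if (!ordered_lock(LEVEL_ALTP2M, &saved))
        return false;
    ordered_unlock(saved);
    return true;
}
```

Calling unshare_inside_altp2m_op() reports the violation and returns false;
unshare_then_altp2m_op() returns true because no lower-level lock is ever
requested while a higher-level one is held.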
George Dunlap Oct. 20, 2016, 4:40 p.m. UTC | #4
On 20/10/16 17:29, Tamas K Lengyel wrote:
> On Oct 20, 2016 18:18, "George Dunlap" <george.dunlap@citrix.com> wrote:
>>
>> On 14/10/16 01:00, Tamas K Lengyel wrote:
>>> Attempting to change gfn mappings with altp2m on a memory shared page results
>>> in a lock-order violation (mm locking order violation: 282 > 254), which
>>> crashes the hypervisor. Don't attempt to automatically unshare such pages and
>>> just fall back to failing the op if the page type is not correct.
>>>
>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>>
>> It would be nice to try to untangle this such that you can reasonably
>> unshare a page in this circumstance; but given the point in the release
>> cycle, making it return an error instead of crashing is probably the
>> right thing to do.
> 
> You can unshare these pages; you just have to do it in a separate op so the
> locks are taken in the right order (memshare before altp2m). Reversing the lock
> order is not possible because otherwise the automatic unsharing and
> propagation during runtime runs into the lock order problem without the
> possibility of recovering. This way the user has the option to handle it
> gracefully here.

Yay locks. :-)

It would probably be helpful to have a comment there explaining the
situation, so that people in the future don't need to re-discover this
issue.

Do you want to toss together a patch adding such a comment, or shall I?

 -George
Tamas Lengyel Oct. 20, 2016, 4:42 p.m. UTC | #5
On Oct 20, 2016 18:40, "George Dunlap" <george.dunlap@citrix.com> wrote:
>
> On 20/10/16 17:29, Tamas K Lengyel wrote:
> > On Oct 20, 2016 18:18, "George Dunlap" <george.dunlap@citrix.com> wrote:
> >>
> >> On 14/10/16 01:00, Tamas K Lengyel wrote:
> >>> Attempting to change gfn mappings with altp2m on a memory shared page results
> >>> in a lock-order violation (mm locking order violation: 282 > 254), which
> >>> crashes the hypervisor. Don't attempt to automatically unshare such pages and
> >>> just fall back to failing the op if the page type is not correct.
> >>>
> >>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
> >>
> >> It would be nice to try to untangle this such that you can reasonably
> >> unshare a page in this circumstance; but given the point in the release
> >> cycle, making it return an error instead of crashing is probably the
> >> right thing to do.
> >
> > You can unshare these pages; you just have to do it in a separate op so the
> > locks are taken in the right order (memshare before altp2m). Reversing the lock
> > order is not possible because otherwise the automatic unsharing and
> > propagation during runtime runs into the lock order problem without the
> > possibility of recovering. This way the user has the option to handle it
> > gracefully here.
>
> Yay locks. :-)
>
> It would probably be helpful to have a comment there explaining the
> situation, so that people in the future don't need to re-discover this
> issue.
>
> Do you want to toss together a patch adding such a comment, or shall I?
>

Please do so if you can, I'm traveling at the moment so it would be a
couple days before I could send a patch for that.

Thanks,
Tamas

Patch

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 9526fff..6a45185 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2628,7 +2628,7 @@  int p2m_change_altp2m_gfn(struct domain *d, unsigned int idx,
     if ( !mfn_valid(mfn) )
     {
         mfn = __get_gfn_type_access(hp2m, gfn_x(old_gfn), &t, &a,
-                                    P2M_ALLOC | P2M_UNSHARE, &page_order, 0);
+                                    P2M_ALLOC, &page_order, 0);
 
         if ( !mfn_valid(mfn) || t != p2m_ram_rw )
             goto out;
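
The effect of dropping the unshare flag from the lookup can be sketched
outside of Xen as follows. Everything here is a hypothetical stand-in
(page_type_t, Q_ALLOC, Q_UNSHARE, lookup(), change_gfn() and the -1 error
value are not Xen identifiers): the lookup only breaks sharing when asked to,
so the op now fails on a shared page and the caller can unshare in a separate
step and retry.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for p2m page types and query flags. */
typedef enum { RAM_RW, RAM_SHARED } page_type_t;
enum { Q_ALLOC = 1 << 0, Q_UNSHARE = 1 << 1 };

struct page { page_type_t type; };

/* Look up a page; break sharing only if the caller requested it. */
static page_type_t lookup(struct page *pg, unsigned int flags)
{
    if (pg->type == RAM_SHARED && (flags & Q_UNSHARE))
        pg->type = RAM_RW; /* the real unshare path would take memshare locks */
    return pg->type;
}

/* The op after the patch: no unshare flag, so a shared page is an error. */
static int change_gfn(struct page *pg)
{
    page_type_t t = lookup(pg, Q_ALLOC);
    if (t != RAM_RW)
        return -1; /* generic failure; the actual error code is not shown here */
    return 0;
}

/* Caller-side handling: on failure, unshare as a separate op, then retry. */
static int change_gfn_with_retry(struct page *pg)
{
    int rc = change_gfn(pg);
    if (rc != 0) {
        lookup(pg, Q_ALLOC | Q_UNSHARE); /* separate unshare step */
        rc = change_gfn(pg);
    }
    return rc;
}
```

With a page that starts as RAM_SHARED, change_gfn() fails and leaves the page
shared, while change_gfn_with_retry() succeeds after the separate unshare.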