Message ID | 1490604050-5955-1-git-send-email-Wei.Chen@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Mar 27, 2017 at 04:40:50PM +0800, Wei Chen wrote: > While I was using the alternative patching in the SErrors patch series [1]. > I used a branch instruction as alternative instruction. > > ALTERNATIVE("nop", > "b skip_check", > SKIP_CHECK_PENDING_VSERROR) > > Unfortunately, I got a system panic message with this code: > > (XEN) build-id: f64081d86e7e88504b7d00e1486f25751c004e39 > (XEN) alternatives: Patching with alt table 100b9480 -> 100b9498 > (XEN) Xen BUG at alternative.c:61 > (XEN) ----[ Xen-4.9-unstable arm32 debug=y Tainted: C ]---- > (XEN) CPU: 0 > (XEN) PC: 00252b68 alternative.c#__apply_alternatives+0x128/0x1d4 > (XEN) CPSR: 800000da MODE:Hypervisor > (XEN) R0: 00000000 R1: 00000000 R2: 100b9490 R3: 100b949c > (XEN) R4: eafeff84 R5: 00000000 R6: 100b949c R7: 10079290 > (XEN) R8: 100792ac R9: 00000001 R10:100b948c R11:002cfe04 R12:002932c0 > (XEN) HYP: SP: 002cfdc4 LR: 00239128 > (XEN) > (XEN) VTCR_EL2: 80003558 > (XEN) VTTBR_EL2: 0000000000000000 > (XEN) > (XEN) SCTLR_EL2: 30cd187f > (XEN) HCR_EL2: 000000000038663f > (XEN) TTBR0_EL2: 00000000bff09000 > (XEN) > (XEN) ESR_EL2: 00000000 > (XEN) HPFAR_EL2: 0000000000000000 > (XEN) HDFAR: 00000000 > (XEN) HIFAR: 00000000 > (XEN) > (XEN) Xen stack trace from sp=002cfdc4: > (XEN) 00000000 00294328 002e0004 00000001 10079290 002cfe14 100b9490 00000000 > (XEN) 10010000 10122700 00200000 002cfe1c 00000080 00252c14 00000000 002cfe64 > (XEN) 00252dd8 00000007 00000000 000bfe00 100b9480 100b9498 002cfe1c 002cfe1c > (XEN) 10010000 10122700 00000000 00000000 00000000 00000000 00000000 00000000 > (XEN) 00000000 00000000 00000000 002ddf30 00000000 003113e8 0030f018 002cfe9c > (XEN) 00238914 00000002 00000000 00000000 00000000 0028b000 00000002 00293800 > (XEN) 00000002 0030f238 00000002 00290640 00000001 002cfea4 002a2840 002cff54 > (XEN) 002a65fc 11112131 10011142 00000000 0028d194 00000000 00000000 00000000 > (XEN) bdffb000 80000000 00000000 c0000000 00000000 00000002 00000000 c0000000 > (XEN) 002b8060 00002000 002b8040 00000000 c0000000 bc000000 00000000 c0000000 > (XEN) 00000000 be000000 00000000 00112701 00000000 bff12701 00000000 00000000 > (XEN) 00000000 00000000 00000000 00000000 00000018 00000000 00000001 00000000 > (XEN) 9fece000 80200000 80000000 00400000 00200550 00000000 00000000 00000000 > (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > (XEN) Xen call trace: > (XEN) [<00252b68>] alternative.c#__apply_alternatives+0x128/0x1d4 (PC) > (XEN) [<00239128>] is_active_kernel_text+0x10/0x28 (LR) > (XEN) [<00252dd8>] alternative.c#__apply_alternatives_multi_stop+0x1c4/0x204 > (XEN) [<00238914>] stop_machine_run+0x1e8/0x254 > (XEN) [<002a2840>] apply_alternatives_all+0x38/0x54 > (XEN) [<002a65fc>] start_xen+0xcf4/0xf88 > (XEN) [<00200550>] arm32/head.o#paging+0x94/0xd8 > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) Xen BUG at alternative.c:61 > (XEN) **************************************** > > This panic was triggered by the BUG(); in branch_insn_requires_update. > That's because in this case the alternative patching needs to update the > offset of the branch instruction. But the new target address of the branch > instruction could not pass the check of is_active_kernel_text(); > > The reason is that: When Xen is booting, it will call apply_alternatives_all > to do patching with alternative tables. In this progress, we should update > the offset of branch instructions if required. This means we should modify > the Xen text section. But Xen text section is marked as read-only and we > configure the hardware to not allow a region to be writable and executable at > the same time. So we re-map Xen in a temporary area for writing. In this case, > the calculation of the new target address of the branch instruction is based > on this re-mapped area. The new target address will point to a value in the > re-mapped area. But we haven't registered this area as an active kernel text. > So the check of is_active_kernel_text will always return false. > > We have to register the re-mapped Xen area as a virtual region temporarily to > solve this problem. > > 1. https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg01939.html > > Signed-off-by: Wei Chen <Wei.Chen@arm.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > Notes: > This bug will affect the staging, staging-4.8 and stable-4.8 source trees. > > --- > v2->v3 > 1. Fix typos. > 2. Explain this bug only happening when booting Xen in commit message. > > --- > xen/arch/arm/alternative.c | 18 +++++++++++++++++- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c > index 1d10f51..96859fc 100644 > --- a/xen/arch/arm/alternative.c > +++ b/xen/arch/arm/alternative.c > @@ -24,6 +24,7 @@ > #include <xen/vmap.h> > #include <xen/smp.h> > #include <xen/stop_machine.h> > +#include <xen/virtual_region.h> > #include <asm/alternative.h> > #include <asm/atomic.h> > #include <asm/byteorder.h> > @@ -154,8 +155,12 @@ static int __apply_alternatives_multi_stop(void *unused) > int ret; > struct alt_region region; > mfn_t xen_mfn = _mfn(virt_to_mfn(_start)); > - unsigned int xen_order = get_order_from_bytes(_end - _start); > + unsigned int xen_size = _end - _start; > + unsigned int xen_order = get_order_from_bytes(xen_size); > void *xenmap; > + struct virtual_region patch_region = { > + .list = LIST_HEAD_INIT(patch_region.list), > + }; > > BUG_ON(patched); > > @@ -169,6 +174,15 @@ static int __apply_alternatives_multi_stop(void *unused) > BUG_ON(!xenmap); > > /* > + * If we generate a new branch instruction, the target will be > + * calculated in this re-mapped Xen region. So we have to register > + * this re-mapped Xen region as a virtual region temporarily. > + */ > + patch_region.start = xenmap; > + patch_region.end = xenmap + xen_size; > + register_virtual_region(&patch_region); > + > + /* > * Find the virtual address of the alternative region in the new > * mapping. > * alt_instr contains relative offset, so the function > @@ -182,6 +196,8 @@ static int __apply_alternatives_multi_stop(void *unused) > /* The patching is not expected to fail during boot. */ > BUG_ON(ret != 0); > > + unregister_virtual_region(&patch_region); > + > vunmap(xenmap); > > /* Barriers provided by the cache flushing */ > -- > 2.7.4 >
Hi Wei, On 27/03/17 09:40, Wei Chen wrote: > Signed-off-by: Wei Chen <Wei.Chen@arm.com> Thank you for updating the commit message :). With the small change below: Reviewed-by: Julien Grall <julien.grall@arm.com> > --- > Notes: > This bug will affect the staging, staging-4.8 and stable-4.8 source trees. > > --- > v2->v3 > 1. Fix typos. > 2. Explain this bug only happening when booting Xen in commit message. > > --- > xen/arch/arm/alternative.c | 18 +++++++++++++++++- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c > index 1d10f51..96859fc 100644 > --- a/xen/arch/arm/alternative.c > +++ b/xen/arch/arm/alternative.c > @@ -24,6 +24,7 @@ > #include <xen/vmap.h> > #include <xen/smp.h> > #include <xen/stop_machine.h> > +#include <xen/virtual_region.h> > #include <asm/alternative.h> > #include <asm/atomic.h> > #include <asm/byteorder.h> > @@ -154,8 +155,12 @@ static int __apply_alternatives_multi_stop(void *unused) > int ret; > struct alt_region region; > mfn_t xen_mfn = _mfn(virt_to_mfn(_start)); > - unsigned int xen_order = get_order_from_bytes(_end - _start); > + unsigned int xen_size = _end - _start; I didn't notice it on the previous reviews. xen_size should technically be paddr_t. It is more for consistency than a real bug as the result _end - _start will unlikely ever be > 4GB. I think Stefano should be able to fix on commit. So no need to resend the patch. Cheers,
>>> On 29.03.17 at 16:23, <julien.grall@arm.com> wrote: > On 27/03/17 09:40, Wei Chen wrote: >> @@ -154,8 +155,12 @@ static int __apply_alternatives_multi_stop(void *unused) >> int ret; >> struct alt_region region; >> mfn_t xen_mfn = _mfn(virt_to_mfn(_start)); >> - unsigned int xen_order = get_order_from_bytes(_end - _start); >> + unsigned int xen_size = _end - _start; > > I didn't notice it on the previous reviews. xen_size should technically > be paddr_t. > > It is more for consistency than a real bug as the result _end - _start > will unlikely ever be > 4GB. I think Stefano should be able to fix on > commit. So no need to resend the patch. Hmm, this mostly addresses one of the two complaints I was going to make (pushing a patch which didn't go via xen-devel). The other one remains, though: As indicated before, only security patches should be pushed to stable branches at about the same time as to master's staging; everything else should please wait until the patch has passed the push gate on master. Note, for example, how I've avoided including the backport of 4edb1a42e3 in the batch I've just pushed to 4.8-staging, despite this being a very simple and obvious change. Jan
On 03/31/2017 07:47 AM, Jan Beulich wrote: >>>> On 29.03.17 at 16:23, <julien.grall@arm.com> wrote: >> On 27/03/17 09:40, Wei Chen wrote: >>> @@ -154,8 +155,12 @@ static int __apply_alternatives_multi_stop(void *unused) >>> int ret; >>> struct alt_region region; >>> mfn_t xen_mfn = _mfn(virt_to_mfn(_start)); >>> - unsigned int xen_order = get_order_from_bytes(_end - _start); >>> + unsigned int xen_size = _end - _start; >> >> I didn't notice it on the previous reviews. xen_size should technically >> be paddr_t. >> >> It is more for consistency than a real bug as the result _end - _start >> will unlikely ever be > 4GB. I think Stefano should be able to fix on >> commit. So no need to resend the patch. > > Hmm, this mostly addresses one of the two complaints I was going > to make (pushing a patch which didn't go via xen-devel). Actually, I was expected the patch to be fixed up on commit and not seen a separate commit. I am not sure why it has been done like that... and the commit title does not make that much sense. Cheers,
On Fri, 31 Mar 2017, Jan Beulich wrote: > >>> On 29.03.17 at 16:23, <julien.grall@arm.com> wrote: > > On 27/03/17 09:40, Wei Chen wrote: > >> @@ -154,8 +155,12 @@ static int __apply_alternatives_multi_stop(void *unused) > >> int ret; > >> struct alt_region region; > >> mfn_t xen_mfn = _mfn(virt_to_mfn(_start)); > >> - unsigned int xen_order = get_order_from_bytes(_end - _start); > >> + unsigned int xen_size = _end - _start; > > > > I didn't notice it on the previous reviews. xen_size should technically > > be paddr_t. > > > > It is more for consistency than a real bug as the result _end - _start > > will unlikely ever be > 4GB. I think Stefano should be able to fix on > > commit. So no need to resend the patch. > > Hmm, this mostly addresses one of the two complaints I was going > to make (pushing a patch which didn't go via xen-devel). I noticed the comment only after I pushed the commit. Given how trivial it is, I fixed it in a separate commit. > The other one remains, though: As indicated before, only security patches > should be pushed to stable branches at about the same time as to > master's staging; everything else should please wait until the patch > has passed the push gate on master. Note, for example, how I've > avoided including the backport of 4edb1a42e3 in the batch I've > just pushed to 4.8-staging, despite this being a very simple and > obvious change. Yes, you are right, this is important. I admit that it is the second time that I fall into this easy mistake. Of course I ran some tests on the backport, but I still should have waited for the push-gate. Given that the vast majority of the backports are security fixes, which are pushed straight way to several trees, it is easy to forget I shouldn't do that for non-security fixes. I'll get better. But I wonder, as stable trees maintainer, do you have a specific git configuration or a script that helps you avoid this kind of mistakes? Or is it all by hand?
>>> On 31.03.17 at 20:15, <sstabellini@kernel.org> wrote: > On Fri, 31 Mar 2017, Jan Beulich wrote: >> The other one remains, though: As indicated before, only security patches >> should be pushed to stable branches at about the same time as to >> master's staging; everything else should please wait until the patch >> has passed the push gate on master. Note, for example, how I've >> avoided including the backport of 4edb1a42e3 in the batch I've >> just pushed to 4.8-staging, despite this being a very simple and >> obvious change. > > Yes, you are right, this is important. I admit that it is the second > time that I fall into this easy mistake. Of course I ran some tests on > the backport, but I still should have waited for the push-gate. Given > that the vast majority of the backports are security fixes, which are > pushed straight way to several trees, it is easy to forget I shouldn't > do that for non-security fixes. I'll get better. But I wonder, as stable > trees maintainer, do you have a specific git configuration or a script > that helps you avoid this kind of mistakes? Or is it all by hand? All by hand indeed. I merely have security and non-security patches queued in different places. Jan
diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c index 1d10f51..96859fc 100644 --- a/xen/arch/arm/alternative.c +++ b/xen/arch/arm/alternative.c @@ -24,6 +24,7 @@ #include <xen/vmap.h> #include <xen/smp.h> #include <xen/stop_machine.h> +#include <xen/virtual_region.h> #include <asm/alternative.h> #include <asm/atomic.h> #include <asm/byteorder.h> @@ -154,8 +155,12 @@ static int __apply_alternatives_multi_stop(void *unused) int ret; struct alt_region region; mfn_t xen_mfn = _mfn(virt_to_mfn(_start)); - unsigned int xen_order = get_order_from_bytes(_end - _start); + unsigned int xen_size = _end - _start; + unsigned int xen_order = get_order_from_bytes(xen_size); void *xenmap; + struct virtual_region patch_region = { + .list = LIST_HEAD_INIT(patch_region.list), + }; BUG_ON(patched); @@ -169,6 +174,15 @@ static int __apply_alternatives_multi_stop(void *unused) BUG_ON(!xenmap); /* + * If we generate a new branch instruction, the target will be + * calculated in this re-mapped Xen region. So we have to register + * this re-mapped Xen region as a virtual region temporarily. + */ + patch_region.start = xenmap; + patch_region.end = xenmap + xen_size; + register_virtual_region(&patch_region); + + /* * Find the virtual address of the alternative region in the new * mapping. * alt_instr contains relative offset, so the function @@ -182,6 +196,8 @@ static int __apply_alternatives_multi_stop(void *unused) /* The patching is not expected to fail during boot. */ BUG_ON(ret != 0); + unregister_virtual_region(&patch_region); + vunmap(xenmap); /* Barriers provided by the cache flushing */
While I was using the alternative patching in the SErrors patch series [1]. I used a branch instruction as alternative instruction. ALTERNATIVE("nop", "b skip_check", SKIP_CHECK_PENDING_VSERROR) Unfortunately, I got a system panic message with this code: (XEN) build-id: f64081d86e7e88504b7d00e1486f25751c004e39 (XEN) alternatives: Patching with alt table 100b9480 -> 100b9498 (XEN) Xen BUG at alternative.c:61 (XEN) ----[ Xen-4.9-unstable arm32 debug=y Tainted: C ]---- (XEN) CPU: 0 (XEN) PC: 00252b68 alternative.c#__apply_alternatives+0x128/0x1d4 (XEN) CPSR: 800000da MODE:Hypervisor (XEN) R0: 00000000 R1: 00000000 R2: 100b9490 R3: 100b949c (XEN) R4: eafeff84 R5: 00000000 R6: 100b949c R7: 10079290 (XEN) R8: 100792ac R9: 00000001 R10:100b948c R11:002cfe04 R12:002932c0 (XEN) HYP: SP: 002cfdc4 LR: 00239128 (XEN) (XEN) VTCR_EL2: 80003558 (XEN) VTTBR_EL2: 0000000000000000 (XEN) (XEN) SCTLR_EL2: 30cd187f (XEN) HCR_EL2: 000000000038663f (XEN) TTBR0_EL2: 00000000bff09000 (XEN) (XEN) ESR_EL2: 00000000 (XEN) HPFAR_EL2: 0000000000000000 (XEN) HDFAR: 00000000 (XEN) HIFAR: 00000000 (XEN) (XEN) Xen stack trace from sp=002cfdc4: (XEN) 00000000 00294328 002e0004 00000001 10079290 002cfe14 100b9490 00000000 (XEN) 10010000 10122700 00200000 002cfe1c 00000080 00252c14 00000000 002cfe64 (XEN) 00252dd8 00000007 00000000 000bfe00 100b9480 100b9498 002cfe1c 002cfe1c (XEN) 10010000 10122700 00000000 00000000 00000000 00000000 00000000 00000000 (XEN) 00000000 00000000 00000000 002ddf30 00000000 003113e8 0030f018 002cfe9c (XEN) 00238914 00000002 00000000 00000000 00000000 0028b000 00000002 00293800 (XEN) 00000002 0030f238 00000002 00290640 00000001 002cfea4 002a2840 002cff54 (XEN) 002a65fc 11112131 10011142 00000000 0028d194 00000000 00000000 00000000 (XEN) bdffb000 80000000 00000000 c0000000 00000000 00000002 00000000 c0000000 (XEN) 002b8060 00002000 002b8040 00000000 c0000000 bc000000 00000000 c0000000 (XEN) 00000000 be000000 00000000 00112701 00000000 bff12701 00000000 00000000 (XEN) 00000000 00000000 00000000 00000000 00000018 00000000 00000001 00000000 (XEN) 9fece000 80200000 80000000 00400000 00200550 00000000 00000000 00000000 (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (XEN) Xen call trace: (XEN) [<00252b68>] alternative.c#__apply_alternatives+0x128/0x1d4 (PC) (XEN) [<00239128>] is_active_kernel_text+0x10/0x28 (LR) (XEN) [<00252dd8>] alternative.c#__apply_alternatives_multi_stop+0x1c4/0x204 (XEN) [<00238914>] stop_machine_run+0x1e8/0x254 (XEN) [<002a2840>] apply_alternatives_all+0x38/0x54 (XEN) [<002a65fc>] start_xen+0xcf4/0xf88 (XEN) [<00200550>] arm32/head.o#paging+0x94/0xd8 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Xen BUG at alternative.c:61 (XEN) **************************************** This panic was triggered by the BUG(); in branch_insn_requires_update. That's because in this case the alternative patching needs to update the offset of the branch instruction. But the new target address of the branch instruction could not pass the check of is_active_kernel_text(); The reason is that: When Xen is booting, it will call apply_alternatives_all to do patching with alternative tables. In this progress, we should update the offset of branch instructions if required. This means we should modify the Xen text section. But Xen text section is marked as read-only and we configure the hardware to not allow a region to be writable and executable at the same time. So we re-map Xen in a temporary area for writing. In this case, the calculation of the new target address of the branch instruction is based on this re-mapped area. The new target address will point to a value in the re-mapped area. But we haven't registered this area as an active kernel text. So the check of is_active_kernel_text will always return false. We have to register the re-mapped Xen area as a virtual region temporarily to solve this problem. 1. https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg01939.html Signed-off-by: Wei Chen <Wei.Chen@arm.com> --- Notes: This bug will affect the staging, staging-4.8 and stable-4.8 source trees. --- v2->v3 1. Fix typos. 2. Explain this bug only happening when booting Xen in commit message. --- xen/arch/arm/alternative.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)