Message ID | 20220528065056.1034168-8-tongtiangen@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | arm64: add machine check safe support | expand |
On Sat, May 28, 2022 at 06:50:55AM +0000, Tong Tiangen wrote: > If user access fail due to hardware memory error, only the relevant > processes are affected, so killing the user process and isolate the > error page with hardware memory errors is a more reasonable choice > than kernel panic. > > Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> > --- > arch/arm64/lib/copy_from_user.S | 8 ++++---- > arch/arm64/lib/copy_to_user.S | 8 ++++---- All of these changes are to the *kernel* accesses performed as part of copy to/from user, and have nothing to do with userspace, so it does not make sense to mark these as UACCESS. Do we *actually* need to recover from failues on these accesses? Looking at _copy_from_user(), the kernel will immediately follow this up with a memset() to the same address which will be fatal anyway, so this is only punting the failure for a few instructions. If we really need to recover from certain accesses to kernel memory we should add a new EX_TYPE_KACCESS_ERR_ZERO_MC or similar, but we need a strong rationale as to why that's useful. As things stand I do not beleive it makes sense for copy to/from user specifically. > arch/arm64/mm/extable.c | 8 ++++---- > 3 files changed, 12 insertions(+), 12 deletions(-) > > diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S > index 34e317907524..402dd48a4f93 100644 > --- a/arch/arm64/lib/copy_from_user.S > +++ b/arch/arm64/lib/copy_from_user.S > @@ -25,7 +25,7 @@ > .endm > > .macro strb1 reg, ptr, val > - strb \reg, [\ptr], \val > + USER(9998f, strb \reg, [\ptr], \val) > .endm > > .macro ldrh1 reg, ptr, val > @@ -33,7 +33,7 @@ > .endm > > .macro strh1 reg, ptr, val > - strh \reg, [\ptr], \val > + USER(9998f, strh \reg, [\ptr], \val) > .endm > > .macro ldr1 reg, ptr, val > @@ -41,7 +41,7 @@ > .endm > > .macro str1 reg, ptr, val > - str \reg, [\ptr], \val > + USER(9998f, str \reg, [\ptr], \val) > .endm > > .macro ldp1 reg1, reg2, ptr, val > @@ -49,7 +49,7 @@ > .endm > > .macro stp1 reg1, reg2, ptr, val > - stp \reg1, \reg2, [\ptr], \val > + USER(9998f, stp \reg1, \reg2, [\ptr], \val) > .endm > > end .req x5 > diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S > index 802231772608..4134bdb3a8b0 100644 > --- a/arch/arm64/lib/copy_to_user.S > +++ b/arch/arm64/lib/copy_to_user.S > @@ -20,7 +20,7 @@ > * x0 - bytes not copied > */ > .macro ldrb1 reg, ptr, val > - ldrb \reg, [\ptr], \val > + USER(9998f, ldrb \reg, [\ptr], \val) > .endm > > .macro strb1 reg, ptr, val > @@ -28,7 +28,7 @@ > .endm > > .macro ldrh1 reg, ptr, val > - ldrh \reg, [\ptr], \val > + USER(9998f, ldrh \reg, [\ptr], \val) > .endm > > .macro strh1 reg, ptr, val > @@ -36,7 +36,7 @@ > .endm > > .macro ldr1 reg, ptr, val > - ldr \reg, [\ptr], \val > + USER(9998f, ldr \reg, [\ptr], \val) > .endm > > .macro str1 reg, ptr, val > @@ -44,7 +44,7 @@ > .endm > > .macro ldp1 reg1, reg2, ptr, val > - ldp \reg1, \reg2, [\ptr], \val > + USER(9998f, ldp \reg1, \reg2, [\ptr], \val) > .endm > > .macro stp1 reg1, reg2, ptr, val > diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c > index c301dcf6335f..8ca8d9639f9f 100644 > --- a/arch/arm64/mm/extable.c > +++ b/arch/arm64/mm/extable.c > @@ -86,10 +86,10 @@ bool fixup_exception_mc(struct pt_regs *regs) > if (!ex) > return false; > > - /* > - * This is not complete, More Machine check safe extable type can > - * be processed here. > - */ > + switch (ex->type) { > + case EX_TYPE_UACCESS_ERR_ZERO: > + return ex_handler_uaccess_err_zero(ex, regs); > + } This addition specifically makes sense to me, so can you split this into a separate patch? Thanks, Mark.
在 2022/6/17 17:06, Mark Rutland 写道: > On Sat, May 28, 2022 at 06:50:55AM +0000, Tong Tiangen wrote: >> If user access fail due to hardware memory error, only the relevant >> processes are affected, so killing the user process and isolate the >> error page with hardware memory errors is a more reasonable choice >> than kernel panic. >> >> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> > >> --- >> arch/arm64/lib/copy_from_user.S | 8 ++++---- >> arch/arm64/lib/copy_to_user.S | 8 ++++---- > > All of these changes are to the *kernel* accesses performed as part of copy > to/from user, and have nothing to do with userspace, so it does not make sense > to mark these as UACCESS. You have a point. so there is no need to modify copy_from/to_user.S in this patch set. > > Do we *actually* need to recover from failues on these accesses? Looking at > _copy_from_user(), the kernel will immediately follow this up with a memset() > to the same address which will be fatal anyway, so this is only punting the > failure for a few instructions. If recovery success, The task will be killed and there will be no subsequent memset(). > > If we really need to recover from certain accesses to kernel memory we should > add a new EX_TYPE_KACCESS_ERR_ZERO_MC or similar, but we need a strong > rationale as to why that's useful. As things stand I do not beleive it makes > sense for copy to/from user specifically. > >> arch/arm64/mm/extable.c | 8 ++++---- >> 3 files changed, 12 insertions(+), 12 deletions(-) >> >> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S >> index 34e317907524..402dd48a4f93 100644 >> --- a/arch/arm64/lib/copy_from_user.S >> +++ b/arch/arm64/lib/copy_from_user.S >> @@ -25,7 +25,7 @@ >> .endm >> >> .macro strb1 reg, ptr, val >> - strb \reg, [\ptr], \val >> + USER(9998f, strb \reg, [\ptr], \val) >> .endm >> >> .macro ldrh1 reg, ptr, val >> @@ -33,7 +33,7 @@ >> .endm >> >> .macro strh1 reg, ptr, val >> - strh \reg, [\ptr], \val >> + USER(9998f, strh \reg, [\ptr], \val) >> .endm >> >> .macro ldr1 reg, ptr, val >> @@ -41,7 +41,7 @@ >> .endm >> >> .macro str1 reg, ptr, val >> - str \reg, [\ptr], \val >> + USER(9998f, str \reg, [\ptr], \val) >> .endm >> >> .macro ldp1 reg1, reg2, ptr, val >> @@ -49,7 +49,7 @@ >> .endm >> >> .macro stp1 reg1, reg2, ptr, val >> - stp \reg1, \reg2, [\ptr], \val >> + USER(9998f, stp \reg1, \reg2, [\ptr], \val) >> .endm >> >> end .req x5 >> diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S >> index 802231772608..4134bdb3a8b0 100644 >> --- a/arch/arm64/lib/copy_to_user.S >> +++ b/arch/arm64/lib/copy_to_user.S >> @@ -20,7 +20,7 @@ >> * x0 - bytes not copied >> */ >> .macro ldrb1 reg, ptr, val >> - ldrb \reg, [\ptr], \val >> + USER(9998f, ldrb \reg, [\ptr], \val) >> .endm >> >> .macro strb1 reg, ptr, val >> @@ -28,7 +28,7 @@ >> .endm >> >> .macro ldrh1 reg, ptr, val >> - ldrh \reg, [\ptr], \val >> + USER(9998f, ldrh \reg, [\ptr], \val) >> .endm >> >> .macro strh1 reg, ptr, val >> @@ -36,7 +36,7 @@ >> .endm >> >> .macro ldr1 reg, ptr, val >> - ldr \reg, [\ptr], \val >> + USER(9998f, ldr \reg, [\ptr], \val) >> .endm >> >> .macro str1 reg, ptr, val >> @@ -44,7 +44,7 @@ >> .endm >> >> .macro ldp1 reg1, reg2, ptr, val >> - ldp \reg1, \reg2, [\ptr], \val >> + USER(9998f, ldp \reg1, \reg2, [\ptr], \val) >> .endm >> >> .macro stp1 reg1, reg2, ptr, val >> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c >> index c301dcf6335f..8ca8d9639f9f 100644 >> --- a/arch/arm64/mm/extable.c >> +++ b/arch/arm64/mm/extable.c >> @@ -86,10 +86,10 @@ bool fixup_exception_mc(struct pt_regs *regs) >> if (!ex) >> return false; >> >> - /* >> - * This is not complete, More Machine check safe extable type can >> - * be processed here. >> - */ >> + switch (ex->type) { >> + case EX_TYPE_UACCESS_ERR_ZERO: >> + return ex_handler_uaccess_err_zero(ex, regs); >> + } > > This addition specifically makes sense to me, so can you split this into a separate patch? According to my understanding of the above, only the modification of extable.c is retained. So what do you mean which part is made into a separate patch? Thanks, Tong. > > Thanks, > Mark. > > .
On Sat, Jun 18, 2022 at 05:27:45PM +0800, Tong Tiangen wrote: > > > 在 2022/6/17 17:06, Mark Rutland 写道: > > On Sat, May 28, 2022 at 06:50:55AM +0000, Tong Tiangen wrote: > > > If user access fail due to hardware memory error, only the relevant > > > processes are affected, so killing the user process and isolate the > > > error page with hardware memory errors is a more reasonable choice > > > than kernel panic. > > > > > > Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> > > > > > --- > > > arch/arm64/lib/copy_from_user.S | 8 ++++---- > > > arch/arm64/lib/copy_to_user.S | 8 ++++---- > > > > All of these changes are to the *kernel* accesses performed as part of copy > > to/from user, and have nothing to do with userspace, so it does not make sense > > to mark these as UACCESS. > > You have a point. so there is no need to modify copy_from/to_user.S in this > patch set. Cool, thanks. If this patch just has the extable change, that's fine by me. > > Do we *actually* need to recover from failues on these accesses? Looking at > > _copy_from_user(), the kernel will immediately follow this up with a memset() > > to the same address which will be fatal anyway, so this is only punting the > > failure for a few instructions. > > If recovery success, The task will be killed and there will be no subsequent > memset(). I don't think that's true. IIUC per the last patch, in the exception handler we'll apply the fixup then force a signal. That doesn't kill the task immediately, and we'll return from the exception handler back into the original context (with the fixup applied). The structure of copy_from_user() is copy_from_user(to, from, n) { _copy_from_user(to, from, n) { res = n; res = raw_copy_from_user(to, from, n); if (res) memset(to + (n - res), 0, res); } } So when the fixup is applied and res indicates that the copy terminated early, there is an unconditinal memset() before the fatal signal is handled in the return to userspace path. > > If we really need to recover from certain accesses to kernel memory we should > > add a new EX_TYPE_KACCESS_ERR_ZERO_MC or similar, but we need a strong > > rationale as to why that's useful. As things stand I do not beleive it makes > > sense for copy to/from user specifically. [...] > > > diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c > > > index c301dcf6335f..8ca8d9639f9f 100644 > > > --- a/arch/arm64/mm/extable.c > > > +++ b/arch/arm64/mm/extable.c > > > @@ -86,10 +86,10 @@ bool fixup_exception_mc(struct pt_regs *regs) > > > if (!ex) > > > return false; > > > - /* > > > - * This is not complete, More Machine check safe extable type can > > > - * be processed here. > > > - */ > > > + switch (ex->type) { > > > + case EX_TYPE_UACCESS_ERR_ZERO: > > > + return ex_handler_uaccess_err_zero(ex, regs); > > > + } > > > > This addition specifically makes sense to me, so can you split this into a separate patch? > > According to my understanding of the above, only the modification of > extable.c is retained. > > So what do you mean which part is made into a separate patch? As above, if you just retain the extable.c changes, that's fine by me. Thanks, Mark.
在 2022/6/18 19:35, Mark Rutland 写道: > On Sat, Jun 18, 2022 at 05:27:45PM +0800, Tong Tiangen wrote: >> >> >> 在 2022/6/17 17:06, Mark Rutland 写道: >>> On Sat, May 28, 2022 at 06:50:55AM +0000, Tong Tiangen wrote: >>>> If user access fail due to hardware memory error, only the relevant >>>> processes are affected, so killing the user process and isolate the >>>> error page with hardware memory errors is a more reasonable choice >>>> than kernel panic. >>>> >>>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> >>> >>>> --- >>>> arch/arm64/lib/copy_from_user.S | 8 ++++---- >>>> arch/arm64/lib/copy_to_user.S | 8 ++++---- >>> >>> All of these changes are to the *kernel* accesses performed as part of copy >>> to/from user, and have nothing to do with userspace, so it does not make sense >>> to mark these as UACCESS. >> >> You have a point. so there is no need to modify copy_from/to_user.S in this >> patch set. > > Cool, thanks. If this patch just has the extable change, that's fine by me. > >>> Do we *actually* need to recover from failues on these accesses? Looking at >>> _copy_from_user(), the kernel will immediately follow this up with a memset() >>> to the same address which will be fatal anyway, so this is only punting the >>> failure for a few instructions. >> >> If recovery success, The task will be killed and there will be no subsequent >> memset(). > > I don't think that's true. > > IIUC per the last patch, in the exception handler we'll apply the fixup then > force a signal. That doesn't kill the task immediately, and we'll return from > the exception handler back into the original context (with the fixup applied). > correct. > The structure of copy_from_user() is > > copy_from_user(to, from, n) { > _copy_from_user(to, from, n) { > res = n; > res = raw_copy_from_user(to, from, n); > if (res) > memset(to + (n - res), 0, res); > } > } > > So when the fixup is applied and res indicates that the copy terminated early, > there is an unconditinal memset() before the fatal signal is handled in the > return to userspace path. correct in this scenario. My idea is also valuable in many other scenarios. > >>> If we really need to recover from certain accesses to kernel memory we should >>> add a new EX_TYPE_KACCESS_ERR_ZERO_MC or similar, but we need a strong >>> rationale as to why that's useful. As things stand I do not beleive it makes >>> sense for copy to/from user specifically. > > [...] > >>>> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c >>>> index c301dcf6335f..8ca8d9639f9f 100644 >>>> --- a/arch/arm64/mm/extable.c >>>> +++ b/arch/arm64/mm/extable.c >>>> @@ -86,10 +86,10 @@ bool fixup_exception_mc(struct pt_regs *regs) >>>> if (!ex) >>>> return false; >>>> - /* >>>> - * This is not complete, More Machine check safe extable type can >>>> - * be processed here. >>>> - */ >>>> + switch (ex->type) { >>>> + case EX_TYPE_UACCESS_ERR_ZERO: >>>> + return ex_handler_uaccess_err_zero(ex, regs); >>>> + } >>> >>> This addition specifically makes sense to me, so can you split this into a separate patch? >> >> According to my understanding of the above, only the modification of >> extable.c is retained. >> >> So what do you mean which part is made into a separate patch? > > As above, if you just retain the extable.c changes, that's fine by me. Thanks, Tong. > > Thanks, > Mark. > .
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S index 34e317907524..402dd48a4f93 100644 --- a/arch/arm64/lib/copy_from_user.S +++ b/arch/arm64/lib/copy_from_user.S @@ -25,7 +25,7 @@ .endm .macro strb1 reg, ptr, val - strb \reg, [\ptr], \val + USER(9998f, strb \reg, [\ptr], \val) .endm .macro ldrh1 reg, ptr, val @@ -33,7 +33,7 @@ .endm .macro strh1 reg, ptr, val - strh \reg, [\ptr], \val + USER(9998f, strh \reg, [\ptr], \val) .endm .macro ldr1 reg, ptr, val @@ -41,7 +41,7 @@ .endm .macro str1 reg, ptr, val - str \reg, [\ptr], \val + USER(9998f, str \reg, [\ptr], \val) .endm .macro ldp1 reg1, reg2, ptr, val @@ -49,7 +49,7 @@ .endm .macro stp1 reg1, reg2, ptr, val - stp \reg1, \reg2, [\ptr], \val + USER(9998f, stp \reg1, \reg2, [\ptr], \val) .endm end .req x5 diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S index 802231772608..4134bdb3a8b0 100644 --- a/arch/arm64/lib/copy_to_user.S +++ b/arch/arm64/lib/copy_to_user.S @@ -20,7 +20,7 @@ * x0 - bytes not copied */ .macro ldrb1 reg, ptr, val - ldrb \reg, [\ptr], \val + USER(9998f, ldrb \reg, [\ptr], \val) .endm .macro strb1 reg, ptr, val @@ -28,7 +28,7 @@ .endm .macro ldrh1 reg, ptr, val - ldrh \reg, [\ptr], \val + USER(9998f, ldrh \reg, [\ptr], \val) .endm .macro strh1 reg, ptr, val @@ -36,7 +36,7 @@ .endm .macro ldr1 reg, ptr, val - ldr \reg, [\ptr], \val + USER(9998f, ldr \reg, [\ptr], \val) .endm .macro str1 reg, ptr, val @@ -44,7 +44,7 @@ .endm .macro ldp1 reg1, reg2, ptr, val - ldp \reg1, \reg2, [\ptr], \val + USER(9998f, ldp \reg1, \reg2, [\ptr], \val) .endm .macro stp1 reg1, reg2, ptr, val diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c index c301dcf6335f..8ca8d9639f9f 100644 --- a/arch/arm64/mm/extable.c +++ b/arch/arm64/mm/extable.c @@ -86,10 +86,10 @@ bool fixup_exception_mc(struct pt_regs *regs) if (!ex) return false; - /* - * This is not complete, More Machine check safe extable type can - * be processed here. - */ + switch (ex->type) { + case EX_TYPE_UACCESS_ERR_ZERO: + return ex_handler_uaccess_err_zero(ex, regs); + } return false; }
If user access fail due to hardware memory error, only the relevant processes are affected, so killing the user process and isolate the error page with hardware memory errors is a more reasonable choice than kernel panic. Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> --- arch/arm64/lib/copy_from_user.S | 8 ++++---- arch/arm64/lib/copy_to_user.S | 8 ++++---- arch/arm64/mm/extable.c | 8 ++++---- 3 files changed, 12 insertions(+), 12 deletions(-)