Message ID | 20191007085501.23202-1-hdegoede@redhat.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Herbert Xu |
Headers | show |
Series | [5.4,regression,fix] x86/boot: Provide memzero_explicit | expand |
Am Montag, 7. Oktober 2019, 10:55:01 CEST schrieb Hans de Goede: Hi Hans, > The purgatory code now uses the shared lib/crypto/sha256.c sha256 > implementation. This needs memzero_explicit, implement this. > > Reported-by: Arvind Sankar <nivedita@alum.mit.edu> > Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get > input, memzero_explicit") Signed-off-by: Hans de Goede > <hdegoede@redhat.com> > --- > arch/x86/boot/compressed/string.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/arch/x86/boot/compressed/string.c > b/arch/x86/boot/compressed/string.c index 81fc1eaa3229..511332e279fe 100644 > --- a/arch/x86/boot/compressed/string.c > +++ b/arch/x86/boot/compressed/string.c > @@ -50,6 +50,11 @@ void *memset(void *s, int c, size_t n) > return s; > } > > +void memzero_explicit(void *s, size_t count) > +{ > + memset(s, 0, count); May I ask how it is guaranteed that this memset is not optimized out by the compiler, e.g. for stack variables? > +} > + > void *memmove(void *dest, const void *src, size_t n) > { > unsigned char *d = dest; Ciao Stephan
Hi Stephan, On 07-10-2019 10:59, Stephan Mueller wrote: > Am Montag, 7. Oktober 2019, 10:55:01 CEST schrieb Hans de Goede: > > Hi Hans, > >> The purgatory code now uses the shared lib/crypto/sha256.c sha256 >> implementation. This needs memzero_explicit, implement this. >> >> Reported-by: Arvind Sankar <nivedita@alum.mit.edu> >> Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get >> input, memzero_explicit") Signed-off-by: Hans de Goede >> <hdegoede@redhat.com> >> --- >> arch/x86/boot/compressed/string.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/arch/x86/boot/compressed/string.c >> b/arch/x86/boot/compressed/string.c index 81fc1eaa3229..511332e279fe 100644 >> --- a/arch/x86/boot/compressed/string.c >> +++ b/arch/x86/boot/compressed/string.c >> @@ -50,6 +50,11 @@ void *memset(void *s, int c, size_t n) >> return s; >> } >> >> +void memzero_explicit(void *s, size_t count) >> +{ >> + memset(s, 0, count); > > May I ask how it is guaranteed that this memset is not optimized out by the > compiler, e.g. for stack variables? The function and the caller live in different compile units, so unless LTO is used this cannot happen. Also note that the previous purgatory private (vs shared) sha256 implementation had: /* Zeroize sensitive information. */ memset(sctx, 0, sizeof(*sctx)); In the place where the new shared 256 code uses memzero_explicit() and the new shared sha256 code is the only user of the arch/x86/boot/compressed/string.c memzero_explicit() implementation. With that all said I'm open to suggestions for improving this. Regards, Hans
Am Montag, 7. Oktober 2019, 11:06:04 CEST schrieb Hans de Goede: Hi Hans, > Hi Stephan, > > On 07-10-2019 10:59, Stephan Mueller wrote: > > Am Montag, 7. Oktober 2019, 10:55:01 CEST schrieb Hans de Goede: > > > > Hi Hans, > > > >> The purgatory code now uses the shared lib/crypto/sha256.c sha256 > >> implementation. This needs memzero_explicit, implement this. > >> > >> Reported-by: Arvind Sankar <nivedita@alum.mit.edu> > >> Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get > >> input, memzero_explicit") Signed-off-by: Hans de Goede > >> <hdegoede@redhat.com> > >> --- > >> > >> arch/x86/boot/compressed/string.c | 5 +++++ > >> 1 file changed, 5 insertions(+) > >> > >> diff --git a/arch/x86/boot/compressed/string.c > >> b/arch/x86/boot/compressed/string.c index 81fc1eaa3229..511332e279fe > >> 100644 > >> --- a/arch/x86/boot/compressed/string.c > >> +++ b/arch/x86/boot/compressed/string.c > >> @@ -50,6 +50,11 @@ void *memset(void *s, int c, size_t n) > >> > >> return s; > >> > >> } > >> > >> +void memzero_explicit(void *s, size_t count) > >> +{ > >> + memset(s, 0, count); > > > > May I ask how it is guaranteed that this memset is not optimized out by > > the > > compiler, e.g. for stack variables? > > The function and the caller live in different compile units, so unless > LTO is used this cannot happen. Agreed in this case. I would just be worried that this memzero_explicit implementation is assumed to be protected against optimization when used elsewhere since other implementations of memzero_explicit are provided with the goal to be protected against optimizations. > > Also note that the previous purgatory private (vs shared) sha256 > implementation had: > > /* Zeroize sensitive information. */ > memset(sctx, 0, sizeof(*sctx)); > > In the place where the new shared 256 code uses memzero_explicit() and the > new shared sha256 code is the only user of the > arch/x86/boot/compressed/string.c memzero_explicit() implementation. > > With that all said I'm open to suggestions for improving this. What speaks against the common memzero_explicit implementation? If you cannot use it, what about adding a barrier in the memzero_explicit implementation? Or what about adding some compiler magic as attached to this email? > > Regards, > > Hans Ciao Stephan
Hi Stephan, On 07-10-2019 11:34, Stephan Mueller wrote: > Am Montag, 7. Oktober 2019, 11:06:04 CEST schrieb Hans de Goede: > > Hi Hans, > >> Hi Stephan, >> >> On 07-10-2019 10:59, Stephan Mueller wrote: >>> Am Montag, 7. Oktober 2019, 10:55:01 CEST schrieb Hans de Goede: >>> >>> Hi Hans, >>> >>>> The purgatory code now uses the shared lib/crypto/sha256.c sha256 >>>> implementation. This needs memzero_explicit, implement this. >>>> >>>> Reported-by: Arvind Sankar <nivedita@alum.mit.edu> >>>> Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get >>>> input, memzero_explicit") Signed-off-by: Hans de Goede >>>> <hdegoede@redhat.com> >>>> --- >>>> >>>> arch/x86/boot/compressed/string.c | 5 +++++ >>>> 1 file changed, 5 insertions(+) >>>> >>>> diff --git a/arch/x86/boot/compressed/string.c >>>> b/arch/x86/boot/compressed/string.c index 81fc1eaa3229..511332e279fe >>>> 100644 >>>> --- a/arch/x86/boot/compressed/string.c >>>> +++ b/arch/x86/boot/compressed/string.c >>>> @@ -50,6 +50,11 @@ void *memset(void *s, int c, size_t n) >>>> >>>> return s; >>>> >>>> } >>>> >>>> +void memzero_explicit(void *s, size_t count) >>>> +{ >>>> + memset(s, 0, count); >>> >>> May I ask how it is guaranteed that this memset is not optimized out by >>> the >>> compiler, e.g. for stack variables? >> >> The function and the caller live in different compile units, so unless >> LTO is used this cannot happen. > > Agreed in this case. > > I would just be worried that this memzero_explicit implementation is assumed > to be protected against optimization when used elsewhere since other > implementations of memzero_explicit are provided with the goal to be protected > against optimizations. >> >> Also note that the previous purgatory private (vs shared) sha256 >> implementation had: >> >> /* Zeroize sensitive information. */ >> memset(sctx, 0, sizeof(*sctx)); >> >> In the place where the new shared 256 code uses memzero_explicit() and the >> new shared sha256 code is the only user of the >> arch/x86/boot/compressed/string.c memzero_explicit() implementation. >> >> With that all said I'm open to suggestions for improving this. > > What speaks against the common memzero_explicit implementation? Nothing, but the purgatory is a standalone binary which runs between 2 kernels when doing kexec so it cannot use the function from lib/string.c since it is not linked against the lib/string.o object. > If you cannot > use it, what about adding a barrier in the memzero_explicit implementation? Or > what about adding some compiler magic as attached to this email? Since the purgatory code is running in a somewhat limited environment, with not all standard headers / macros available I was afraid that the barrier_data() from the lib/string.c implementation would not work, so I left it out. In hindsight I should have really given it a try first as it seems to compile fine and there are no missing symbols in arch/x86/purgatory/purgatory.ro when using it. So I will send out a new version with the barrier_data() added making the arch/x86/boot/compressed/string.c implementation identical to the lib/string.c one. Regards, Hans
On Mon, Oct 07, 2019 at 11:34:09AM +0200, Stephan Mueller wrote: > Am Montag, 7. Oktober 2019, 11:06:04 CEST schrieb Hans de Goede: > > Hi Hans, > > > Hi Stephan, > > > > On 07-10-2019 10:59, Stephan Mueller wrote: > > > Am Montag, 7. Oktober 2019, 10:55:01 CEST schrieb Hans de Goede: > > > > > > Hi Hans, > > > > > >> The purgatory code now uses the shared lib/crypto/sha256.c sha256 > > >> implementation. This needs memzero_explicit, implement this. > > >> > > >> Reported-by: Arvind Sankar <nivedita@alum.mit.edu> > > >> Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get > > >> input, memzero_explicit") Signed-off-by: Hans de Goede > > >> <hdegoede@redhat.com> > > >> --- > > >> > > >> arch/x86/boot/compressed/string.c | 5 +++++ > > >> 1 file changed, 5 insertions(+) > > >> > > >> diff --git a/arch/x86/boot/compressed/string.c > > >> b/arch/x86/boot/compressed/string.c index 81fc1eaa3229..511332e279fe > > >> 100644 > > >> --- a/arch/x86/boot/compressed/string.c > > >> +++ b/arch/x86/boot/compressed/string.c > > >> @@ -50,6 +50,11 @@ void *memset(void *s, int c, size_t n) > > >> > > >> return s; > > >> > > >> } > > >> > > >> +void memzero_explicit(void *s, size_t count) > > >> +{ > > >> + memset(s, 0, count); > > > > > > May I ask how it is guaranteed that this memset is not optimized out by > > > the > > > compiler, e.g. for stack variables? > > > > The function and the caller live in different compile units, so unless > > LTO is used this cannot happen. > > Agreed in this case. > > I would just be worried that this memzero_explicit implementation is assumed > to be protected against optimization when used elsewhere since other > implementations of memzero_explicit are provided with the goal to be protected > against optimizations. > > > > Also note that the previous purgatory private (vs shared) sha256 > > implementation had: > > > > /* Zeroize sensitive information. */ > > memset(sctx, 0, sizeof(*sctx)); > > > > In the place where the new shared 256 code uses memzero_explicit() and the > > new shared sha256 code is the only user of the > > arch/x86/boot/compressed/string.c memzero_explicit() implementation. > > > > With that all said I'm open to suggestions for improving this. > > What speaks against the common memzero_explicit implementation? If you cannot > use it, what about adding a barrier in the memzero_explicit implementation? Or > what about adding some compiler magic as attached to this email? > The common definition I think is the same as your attachment, i.e. memset followed by barrier_data(). I don't think there is any reason not to just copy that definition? Alternatively, could the common definition not be made an inline or macro? or is there a risk that could introduce unsafe optimizations to eliminate it?
On Mon, Oct 07, 2019 at 03:00:51PM +0200, Hans de Goede wrote: > Hi Stephan, > > On 07-10-2019 11:34, Stephan Mueller wrote: > > Am Montag, 7. Oktober 2019, 11:06:04 CEST schrieb Hans de Goede: > > > > Hi Hans, > > > >> Hi Stephan, > >> > >> On 07-10-2019 10:59, Stephan Mueller wrote: > >>> Am Montag, 7. Oktober 2019, 10:55:01 CEST schrieb Hans de Goede: > >>> > >>> Hi Hans, > >>> > >>>> The purgatory code now uses the shared lib/crypto/sha256.c sha256 > >>>> implementation. This needs memzero_explicit, implement this. > >>>> > >>>> Reported-by: Arvind Sankar <nivedita@alum.mit.edu> > >>>> Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get > >>>> input, memzero_explicit") Signed-off-by: Hans de Goede > >>>> <hdegoede@redhat.com> > >>>> --- > >>>> > >>>> arch/x86/boot/compressed/string.c | 5 +++++ > >>>> 1 file changed, 5 insertions(+) > >>>> > >>>> diff --git a/arch/x86/boot/compressed/string.c > >>>> b/arch/x86/boot/compressed/string.c index 81fc1eaa3229..511332e279fe > >>>> 100644 > >>>> --- a/arch/x86/boot/compressed/string.c > >>>> +++ b/arch/x86/boot/compressed/string.c > >>>> @@ -50,6 +50,11 @@ void *memset(void *s, int c, size_t n) > >>>> > >>>> return s; > >>>> > >>>> } > >>>> > >>>> +void memzero_explicit(void *s, size_t count) > >>>> +{ > >>>> + memset(s, 0, count); > >>> > >>> May I ask how it is guaranteed that this memset is not optimized out by > >>> the > >>> compiler, e.g. for stack variables? > >> > >> The function and the caller live in different compile units, so unless > >> LTO is used this cannot happen. > > > > Agreed in this case. > > > > I would just be worried that this memzero_explicit implementation is assumed > > to be protected against optimization when used elsewhere since other > > implementations of memzero_explicit are provided with the goal to be protected > > against optimizations. > >> > >> Also note that the previous purgatory private (vs shared) sha256 > >> implementation had: > >> > >> /* Zeroize sensitive information. */ > >> memset(sctx, 0, sizeof(*sctx)); > >> > >> In the place where the new shared 256 code uses memzero_explicit() and the > >> new shared sha256 code is the only user of the > >> arch/x86/boot/compressed/string.c memzero_explicit() implementation. > >> > >> With that all said I'm open to suggestions for improving this. > > > > What speaks against the common memzero_explicit implementation? > > Nothing, but the purgatory is a standalone binary which runs between > 2 kernels when doing kexec so it cannot use the function from lib/string.c > since it is not linked against the lib/string.o object. > > > If you cannot > > use it, what about adding a barrier in the memzero_explicit implementation? Or > > what about adding some compiler magic as attached to this email? > > Since the purgatory code is running in a somewhat limited environment, > with not all standard headers / macros available I was afraid that the > barrier_data() from the lib/string.c implementation would not work, so > I left it out. In hindsight I should have really given it a try first as > it seems to compile fine and there are no missing symbols in > arch/x86/purgatory/purgatory.ro when using it. > > So I will send out a new version with the barrier_data() added making > the arch/x86/boot/compressed/string.c implementation identical to the > lib/string.c one. > > Regards, > > Hans > I think we also need a fix for at least s390 right? That also has sha256 verification and would presumably have the same issue with undefined memzero_explicit? powerpc does not seem to do sha256 verification afaict.
diff --git a/arch/x86/boot/compressed/string.c b/arch/x86/boot/compressed/string.c index 81fc1eaa3229..511332e279fe 100644 --- a/arch/x86/boot/compressed/string.c +++ b/arch/x86/boot/compressed/string.c @@ -50,6 +50,11 @@ void *memset(void *s, int c, size_t n) return s; } +void memzero_explicit(void *s, size_t count) +{ + memset(s, 0, count); +} + void *memmove(void *dest, const void *src, size_t n) { unsigned char *d = dest;
The purgatory code now uses the shared lib/crypto/sha256.c sha256 implementation. This needs memzero_explicit, implement this. Reported-by: Arvind Sankar <nivedita@alum.mit.edu> Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get input, memzero_explicit") Signed-off-by: Hans de Goede <hdegoede@redhat.com> --- arch/x86/boot/compressed/string.c | 5 +++++ 1 file changed, 5 insertions(+)