Message ID | 20220715132651.1093-3-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Series | xen: Fixes to check-endbr.sh |
On 15.07.2022 15:26, Andrew Cooper wrote:
> While Xen's current VMA means it works, the mawk fix (i.e. using $((0xN)) in
> the shell) isn't portable in 32bit shells. See the code comment for the fix.
>
> The fix found a second latent bug. Recombining $vma_hi/lo should have used
> printf "%s%08x" and only worked previously because $vma_lo had bits set in
> its top nibble. Combining with the main fix, %08x becomes %07x.
>
> Fixes: $XXX patch 1
> Reported-by: Jan Beulich <JBeulich@suse.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with, I guess, ...

> --- a/xen/tools/check-endbr.sh
> +++ b/xen/tools/check-endbr.sh
> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID &
>  #    the lower bits, rounding integers to the nearest 4k.
>  #
>  #    Instead, use the fact that Xen's .text is within a 1G aligned region, and
> -#    split the VMA in half so AWK's numeric addition is only working on 32 bit
> -#    numbers, which don't lose precision.
> +#    split the VMA so AWK's numeric addition is only working on <32 bit
> +#    numbers, which don't lose precision. (See point 5)
>  #
>  # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX
>  #    spec), and GAWK and MAWK can't agree on how to work with hex constants in
>  #    a string. Use the shell to convert $vma_lo to decimal before passing to
>  #    AWK.
>  #
> +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be
> +#    evaluated as long, which in 32bit shells turns negative if bit 31 of the
> +#    VMA is set. AWK then interprets this negative number as a double before
> +#    adding the offsets from the binary grep.
> +#
> +#    Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split.
> +#
> +#    The consequence of this is that for all offsets, $vma_lo + offset needs
> +#    to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined
> +#    with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a
> +#    1G aligned region, and Xen is far far smaller than 256M, but leave safety
> +#    check nevertheless.
> +#
>  eval $(${OBJDUMP} -j .text $1 -h |
> -    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}')
> +    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}')
>
>  ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN
>
> +bin_sz=$(stat -c '%s' $TEXT_BIN)
> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] &&
> +    { echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; }

... s/can/cannot/ ?

Jan
On 18/07/2022 10:11, Jan Beulich wrote:
> On 15.07.2022 15:26, Andrew Cooper wrote:
>> While Xen's current VMA means it works, the mawk fix (i.e. using $((0xN)) in
>> the shell) isn't portable in 32bit shells. See the code comment for the fix.
>>
>> The fix found a second latent bug. Recombining $vma_hi/lo should have used
>> printf "%s%08x" and only worked previously because $vma_lo had bits set in
>> its top nibble. Combining with the main fix, %08x becomes %07x.
>>
>> Fixes: $XXX patch 1
>> Reported-by: Jan Beulich <JBeulich@suse.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks, but...

> with, I guess, ...
>
>> --- a/xen/tools/check-endbr.sh
>> +++ b/xen/tools/check-endbr.sh
>> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID &
>>  #    the lower bits, rounding integers to the nearest 4k.
>>  #
>>  #    Instead, use the fact that Xen's .text is within a 1G aligned region, and
>> -#    split the VMA in half so AWK's numeric addition is only working on 32 bit
>> -#    numbers, which don't lose precision.
>> +#    split the VMA so AWK's numeric addition is only working on <32 bit
>> +#    numbers, which don't lose precision. (See point 5)
>>  #
>>  # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX
>>  #    spec), and GAWK and MAWK can't agree on how to work with hex constants in
>>  #    a string. Use the shell to convert $vma_lo to decimal before passing to
>>  #    AWK.
>>  #
>> +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be
>> +#    evaluated as long, which in 32bit shells turns negative if bit 31 of the
>> +#    VMA is set. AWK then interprets this negative number as a double before
>> +#    adding the offsets from the binary grep.
>> +#
>> +#    Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split.
>> +#
>> +#    The consequence of this is that for all offsets, $vma_lo + offset needs
>> +#    to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined
>> +#    with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a
>> +#    1G aligned region, and Xen is far far smaller than 256M, but leave safety
>> +#    check nevertheless.
>> +#
>>  eval $(${OBJDUMP} -j .text $1 -h |
>> -    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}')
>> +    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}')
>>
>>  ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN
>>
>> +bin_sz=$(stat -c '%s' $TEXT_BIN)
>> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] &&
>> +    { echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; }
> ... s/can/cannot/ ?

Why? "Can" is correct here. If the offsets can't exceed 256M, then
everything is good.

~Andrew
On 18.07.2022 11:31, Andrew Cooper wrote:
> On 18/07/2022 10:11, Jan Beulich wrote:
>> On 15.07.2022 15:26, Andrew Cooper wrote:
>>> --- a/xen/tools/check-endbr.sh
>>> +++ b/xen/tools/check-endbr.sh
>>> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID &
>>>  #    the lower bits, rounding integers to the nearest 4k.
>>>  #
>>>  #    Instead, use the fact that Xen's .text is within a 1G aligned region, and
>>> -#    split the VMA in half so AWK's numeric addition is only working on 32 bit
>>> -#    numbers, which don't lose precision.
>>> +#    split the VMA so AWK's numeric addition is only working on <32 bit
>>> +#    numbers, which don't lose precision. (See point 5)
>>>  #
>>>  # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX
>>>  #    spec), and GAWK and MAWK can't agree on how to work with hex constants in
>>>  #    a string. Use the shell to convert $vma_lo to decimal before passing to
>>>  #    AWK.
>>>  #
>>> +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be
>>> +#    evaluated as long, which in 32bit shells turns negative if bit 31 of the
>>> +#    VMA is set. AWK then interprets this negative number as a double before
>>> +#    adding the offsets from the binary grep.
>>> +#
>>> +#    Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split.
>>> +#
>>> +#    The consequence of this is that for all offsets, $vma_lo + offset needs
>>> +#    to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined
>>> +#    with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a
>>> +#    1G aligned region, and Xen is far far smaller than 256M, but leave safety
>>> +#    check nevertheless.
>>> +#
>>>  eval $(${OBJDUMP} -j .text $1 -h |
>>> -    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}')
>>> +    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}')
>>>
>>>  ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN
>>>
>>> +bin_sz=$(stat -c '%s' $TEXT_BIN)
>>> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] &&
>>> +    { echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; }
>> ... s/can/cannot/ ?
>
> Why? "Can" is correct here. If the offsets can't exceed 256M, then
> everything is good.

Hmm, the wording then indeed is ambiguous. I read "can" as "are allowed
to", when we mean "aren't allowed to". Maybe ".text is 256M or more in
size"? If you mention "offsets", then I think the check should be based
on actually observing an offset which is too large (which .text size
alone doesn't guarantee will happen).

Jan
On 18/07/2022 10:49, Jan Beulich wrote:
> On 18.07.2022 11:31, Andrew Cooper wrote:
>> On 18/07/2022 10:11, Jan Beulich wrote:
>>> On 15.07.2022 15:26, Andrew Cooper wrote:
>>>> --- a/xen/tools/check-endbr.sh
>>>> +++ b/xen/tools/check-endbr.sh
>>>> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID &
>>>>  #    the lower bits, rounding integers to the nearest 4k.
>>>>  #
>>>>  #    Instead, use the fact that Xen's .text is within a 1G aligned region, and
>>>> -#    split the VMA in half so AWK's numeric addition is only working on 32 bit
>>>> -#    numbers, which don't lose precision.
>>>> +#    split the VMA so AWK's numeric addition is only working on <32 bit
>>>> +#    numbers, which don't lose precision. (See point 5)
>>>>  #
>>>>  # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX
>>>>  #    spec), and GAWK and MAWK can't agree on how to work with hex constants in
>>>>  #    a string. Use the shell to convert $vma_lo to decimal before passing to
>>>>  #    AWK.
>>>>  #
>>>> +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be
>>>> +#    evaluated as long, which in 32bit shells turns negative if bit 31 of the
>>>> +#    VMA is set. AWK then interprets this negative number as a double before
>>>> +#    adding the offsets from the binary grep.
>>>> +#
>>>> +#    Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split.
>>>> +#
>>>> +#    The consequence of this is that for all offsets, $vma_lo + offset needs
>>>> +#    to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined
>>>> +#    with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a
>>>> +#    1G aligned region, and Xen is far far smaller than 256M, but leave safety
>>>> +#    check nevertheless.
>>>> +#
>>>>  eval $(${OBJDUMP} -j .text $1 -h |
>>>> -    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}')
>>>> +    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}')
>>>>
>>>>  ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN
>>>>
>>>> +bin_sz=$(stat -c '%s' $TEXT_BIN)
>>>> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] &&
>>>> +    { echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; }
>>> ... s/can/cannot/ ?
>> Why? "Can" is correct here. If the offsets can't exceed 256M, then
>> everything is good.
> Hmm, the wording then indeed is ambiguous.

I see your point. In this case it's meant as "are able to", but this is
still clearer than using "can't" because at least the text matches the
check which triggered it.

> I read "can" as "are allowed
> to", when we mean "aren't allowed to". Maybe ".text is 256M or more in
> size"? If you mention "offsets", then I think the check should be based
> on actually observing an offset which is too large (which .text size
> alone doesn't guarantee will happen).

It's not just .text on its own because the VMA is offset by 2M, hence
the subtraction of $vma_lo in the main calculation.

There's no point searching for offsets. There will be one near the end,
so all searching for an offset would do is complicate the critical loop.

How about ".text offsets must not exceed 256M" ?

That should be unambiguous.

~Andrew
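The size check discussed in this message can be worked through with concrete numbers. The following is a minimal sketch rather than the script itself: the `vma_lo` value assumes the 2M offset mentioned above, `check_size` and the two sizes are hypothetical, and the `0x` prefix is added here so that the shell reads `vma_lo` as hexadecimal.

```shell
#!/bin/sh
# Assumption: low 7 nibbles of the .text VMA, i.e. .text starts 2M into
# its 256M window.
vma_lo=0200000

# Sizes at or above this limit would let $vma_lo + offset reach 256M.
limit=$(( (1 << 28) - 0x$vma_lo ))

# Hypothetical helper: $1 is the size in bytes of the extracted .text binary.
check_size() {
    if [ "$1" -ge "$limit" ]; then
        echo "too big"
    else
        echo "ok"
    fi
}

small=$(check_size $((16 * 1024 * 1024)))    # 16M binary
large=$(check_size $((255 * 1024 * 1024)))   # 255M binary

echo "limit=$limit small=$small large=$large"
```

With a 2M starting offset the limit is 254M, so a 255M binary trips the check even though it is itself smaller than 256M. This is why the comparison subtracts `vma_lo` instead of testing the binary size alone.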
On 18.07.2022 14:07, Andrew Cooper wrote:
> On 18/07/2022 10:49, Jan Beulich wrote:
>> On 18.07.2022 11:31, Andrew Cooper wrote:
>>> On 18/07/2022 10:11, Jan Beulich wrote:
>>>> On 15.07.2022 15:26, Andrew Cooper wrote:
>>>>> --- a/xen/tools/check-endbr.sh
>>>>> +++ b/xen/tools/check-endbr.sh
>>>>> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID &
>>>>>  #    the lower bits, rounding integers to the nearest 4k.
>>>>>  #
>>>>>  #    Instead, use the fact that Xen's .text is within a 1G aligned region, and
>>>>> -#    split the VMA in half so AWK's numeric addition is only working on 32 bit
>>>>> -#    numbers, which don't lose precision.
>>>>> +#    split the VMA so AWK's numeric addition is only working on <32 bit
>>>>> +#    numbers, which don't lose precision. (See point 5)
>>>>>  #
>>>>>  # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX
>>>>>  #    spec), and GAWK and MAWK can't agree on how to work with hex constants in
>>>>>  #    a string. Use the shell to convert $vma_lo to decimal before passing to
>>>>>  #    AWK.
>>>>>  #
>>>>> +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be
>>>>> +#    evaluated as long, which in 32bit shells turns negative if bit 31 of the
>>>>> +#    VMA is set. AWK then interprets this negative number as a double before
>>>>> +#    adding the offsets from the binary grep.
>>>>> +#
>>>>> +#    Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split.
>>>>> +#
>>>>> +#    The consequence of this is that for all offsets, $vma_lo + offset needs
>>>>> +#    to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined
>>>>> +#    with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a
>>>>> +#    1G aligned region, and Xen is far far smaller than 256M, but leave safety
>>>>> +#    check nevertheless.
>>>>> +#
>>>>>  eval $(${OBJDUMP} -j .text $1 -h |
>>>>> -    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}')
>>>>> +    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}')
>>>>>
>>>>>  ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN
>>>>>
>>>>> +bin_sz=$(stat -c '%s' $TEXT_BIN)
>>>>> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] &&
>>>>> +    { echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; }
>>>> ... s/can/cannot/ ?
>>> Why? "Can" is correct here. If the offsets can't exceed 256M, then
>>> everything is good.
>> Hmm, the wording then indeed is ambiguous.
>
> I see your point. In this case it's meant as "are able to", but this is
> still clearer than using "can't" because at least the text matches the
> check which triggered it.
>
>> I read "can" as "are allowed
>> to", when we mean "aren't allowed to". Maybe ".text is 256M or more in
>> size"? If you mention "offsets", then I think the check should be based
>> on actually observing an offset which is too large (which .text size
>> alone doesn't guarantee will happen).
>
> It's not just .text on its own because the VMA is offset by 2M, hence
> the subtraction of $vma_lo in the main calculation.
>
> There's no point searching for offsets. There will be one near the end,
> so all searching for an offset would do is complicate the critical loop.
>
> How about ".text offsets must not exceed 256M" ?
>
> That should be unambiguous.

Yes, that reads fine. Thanks.

Jan
diff --git a/xen/tools/check-endbr.sh b/xen/tools/check-endbr.sh
index b3febd6a4ccc..d6aa117de13b 100755
--- a/xen/tools/check-endbr.sh
+++ b/xen/tools/check-endbr.sh
@@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID &
 #    the lower bits, rounding integers to the nearest 4k.
 #
 #    Instead, use the fact that Xen's .text is within a 1G aligned region, and
-#    split the VMA in half so AWK's numeric addition is only working on 32 bit
-#    numbers, which don't lose precision.
+#    split the VMA so AWK's numeric addition is only working on <32 bit
+#    numbers, which don't lose precision. (See point 5)
 #
 # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX
 #    spec), and GAWK and MAWK can't agree on how to work with hex constants in
 #    a string. Use the shell to convert $vma_lo to decimal before passing to
 #    AWK.
 #
+# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be
+#    evaluated as long, which in 32bit shells turns negative if bit 31 of the
+#    VMA is set. AWK then interprets this negative number as a double before
+#    adding the offsets from the binary grep.
+#
+#    Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split.
+#
+#    The consequence of this is that for all offsets, $vma_lo + offset needs
+#    to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined
+#    with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a
+#    1G aligned region, and Xen is far far smaller than 256M, but leave safety
+#    check nevertheless.
+#
 eval $(${OBJDUMP} -j .text $1 -h |
-    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}')
+    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}')
 
 ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN
 
+bin_sz=$(stat -c '%s' $TEXT_BIN)
+[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] &&
+    { echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; }
+
 #  instruction:    hex:           oct:
 #  endbr64         f3 0f 1e fa    363 017 036 372
 #  endbr32         f3 0f 1e fb    363 017 036 373
@@ -84,7 +101,7 @@ then
 else
     grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \
          -e "$(printf '\146\17\37\1')" $TEXT_BIN
-fi | $AWK -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL
+fi | $AWK -F':' '{printf "%s%07x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL
 
 # Wait for $VALID to become complete
 wait
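The 9/7 split and recombination introduced by this patch can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the script's own pipeline: the `vma` and `offset` values are hypothetical stand-ins for what objdump and the binary grep would supply, and `cut` is used here in place of the AWK `substr` calls purely for brevity.

```shell
#!/bin/sh
# Hypothetical 16-nibble VMA, in the form objdump -h prints for a section.
vma=ffff82d040200000

# 9/7 split: the top 9 nibbles stay a string, the low 7 become a number.
vma_hi=$(printf '%s' "$vma" | cut -c 1-9)
vma_lo=$(printf '%s' "$vma" | cut -c 10-16)

# Hypothetical byte offset of a match within the extracted .text binary.
offset=4660   # 0x1234

# The low part is at most 28 bits wide, so it cannot set bit 31 and stays
# positive even where shell arithmetic is only 32 bits wide.
lo_dec=$((0x$vma_lo))

# Recombine, zero-padding the numeric part to exactly 7 hex digits.
addr=$(printf '%s%07x' "$vma_hi" $((lo_dec + offset)))

echo "hi=$vma_hi lo=$vma_lo addr=$addr"
```

The padding width has to match the number of nibbles taken for the low part, which is why the patch changes the format to `%07x` at the same time as it changes the `substr` boundaries.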
While Xen's current VMA means it works, the mawk fix (i.e. using $((0xN)) in
the shell) isn't portable in 32bit shells. See the code comment for the fix.

The fix found a second latent bug. Recombining $vma_hi/lo should have used
printf "%s%08x" and only worked previously because $vma_lo had bits set in
its top nibble. Combining with the main fix, %08x becomes %07x.

Fixes: $XXX patch 1
Reported-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Luca Fancellu <Luca.Fancellu@arm.com>
CC: Mathieu Tarral <mathieu.tarral@protonmail.com>
CC: Bertrand Marquis <Bertrand.Marquis@arm.com>

v2:
 * New
---
 xen/tools/check-endbr.sh | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)
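The second latent bug described in the commit message, recombining with an unpadded `%x`, can be reproduced with a small sketch. The values below are hypothetical: they use the old 8/8 split with a low half whose top nibble is zero, which is the case the commit message says would have gone wrong.

```shell
#!/bin/sh
# Hypothetical 8/8 split where the low half starts with zero nibbles.
vma_hi=ffff82d0
vma_lo=00200000

# Unpadded: leading zeros of the low half are dropped, shortening the address.
bad=$(printf '%s%x' "$vma_hi" $((0x$vma_lo)))

# Padded to 8 digits: the low half keeps its full width.
good=$(printf '%s%08x' "$vma_hi" $((0x$vma_lo)))

echo "unpadded: $bad (${#bad} digits)"
echo "padded:   $good (${#good} digits)"
```

An address with fewer than 16 digits no longer lines up with the addresses objdump reports, so any comparison between the two lists would be expected to mismatch. A low half with a non-zero top nibble happens to print at full width either way, which is how the problem stayed hidden.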