Message ID | alpine.DEB.2.21.2104200331110.44318@angie.orcam.me.uk (mailing list archive) |
---|---|
State | Accepted |
Commit | c1d337d45ec0a802299688e17d568c4e3a585895 |
Headers | show |
Series | Reinstate and improve MIPS `do_div' implementation | expand |
> Am 20.04.2021 um 04:50 schrieb Maciej W. Rozycki <macro@orcam.me.uk>: > > We already check the high part of the divident against zero to avoid the nit-picking: s/divident/dividend/ (seems to come from from Latin "dividendum" = the number that is to be divided). > costly DIVU instruction in that case, needed to reduce the high part of > the divident, so we may well check against the divisor instead and set > the high part of the quotient to zero right away. We need to treat the > high part the divident in that case though as the remainder that would > be calculated by the DIVU instruction we avoided. > > This has passed correctness verification with test_div64 and reduced the > module's average execution time down to 1.0445s and 0.2619s from 1.0668s > and 0.2629s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz. Impressive. > > Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> > --- > I have made an experimental change on top of this to put `__div64_32' out > of line, and that increases the averages respectively up to 1.0785s and > 0.2705s. Not a terrible loss, especially compared to generic times quoted > with 3/4, but still, so I think it would best be made where optimising for > size, as noted in the cover letter. > --- > arch/mips/include/asm/div64.h | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > Index: linux-3maxp-div64/arch/mips/include/asm/div64.h > =================================================================== > --- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h > +++ linux-3maxp-div64/arch/mips/include/asm/div64.h > @@ -68,9 +68,11 @@ > \ > __high = __div >> 32; \ > __low = __div; \ > - __upper = __high; \ > \ > - if (__high) { \ > + if (__high < __radix) { \ > + __upper = __high; \ > + __high = 0; \ > + } else { \ > __asm__("divu $0, %z1, %z2" \ > : "=x" (__modquot) \ > : "Jr" (__high), "Jr" (__radix)); \
On Wed, 21 Apr 2021, H. Nikolaus Schaller wrote: > > We already check the high part of the divident against zero to avoid the > > nit-picking: s/divident/dividend/ Umm, I find this embarassing (and I hesitated reaching for a dictionary to double-check the spelling!). Indeed why would this be different from subtrahend or minuend? Thomas, as this mistake has spread across three out of these patches, both in change descriptions and actual code, would you mind dropping the series from mips-next altogether and I'll respin the series with all these issues corrected? Maciej
On Wed, Apr 21, 2021 at 06:16:18PM +0200, Maciej W. Rozycki wrote: > On Wed, 21 Apr 2021, H. Nikolaus Schaller wrote: > > > > We already check the high part of the divident against zero to avoid the > > > > nit-picking: s/divident/dividend/ > > Umm, I find this embarassing (and I hesitated reaching for a dictionary > to double-check the spelling!). Indeed why would this be different from > subtrahend or minuend? > > Thomas, as this mistake has spread across three out of these patches, > both in change descriptions and actual code, would you mind dropping the > series from mips-next altogether and I'll respin the series with all > these issues corrected? I'm sorry, but I don't rebase mips-next and the patches have been pushed out before I received this mail. Thomas.
On Thu, 22 Apr 2021, Thomas Bogendoerfer wrote: > > Thomas, as this mistake has spread across three out of these patches, > > both in change descriptions and actual code, would you mind dropping the > > series from mips-next altogether and I'll respin the series with all > > these issues corrected? > > I'm sorry, but I don't rebase mips-next and the patches have been pushed > out before I received this mail. Hmm, what about changes known to actually break something then? Like with R6 here? Those will undoubtedly cause someone a headache with bisection sometime in the future. Are you sure your policy is the best possible? NB, I have benchmarked the update with my DECstation, however my Malta has not come up after a reboot last evening and neither it has after a few remote power cycles. I have planned a visit in my lab next week anyway, so I'll see what has happened there; hopefully I'm able to bring the board back to life as I find it valuable for my purposes. I had to replace the PSU it came with already a couple years back and the new one is supposedly high-quality, so I fear it's the board itself. Meanwhile I'll be able to give DECstation figures only. I guess such limited results will suffice if I post the fix as an update rather than replacement. Maciej
On Thu, Apr 22, 2021 at 11:12:40AM +0200, Maciej W. Rozycki wrote: > On Thu, 22 Apr 2021, Thomas Bogendoerfer wrote: > > > > Thomas, as this mistake has spread across three out of these patches, > > > both in change descriptions and actual code, would you mind dropping the > > > series from mips-next altogether and I'll respin the series with all > > > these issues corrected? > > > > I'm sorry, but I don't rebase mips-next and the patches have been pushed > > out before I received this mail. > > Hmm, what about changes known to actually break something then? Like > with R6 here? Those will undoubtedly cause someone a headache with > bisection sometime in the future. Are you sure your policy is the best > possible? This is my Linus pull tree, so I'm following Documentation/maintainer/rebasing-and-merging.rst > Meanwhile I'll be able to give DECstation figures only. I guess such > limited results will suffice if I post the fix as an update rather than > replacement. thank you. Thomas.
Subject typo: s/is/if/ Andreas.
On Thu, 22 Apr 2021, Thomas Bogendoerfer wrote: > > Hmm, what about changes known to actually break something then? Like > > with R6 here? Those will undoubtedly cause someone a headache with > > bisection sometime in the future. Are you sure your policy is the best > > possible? > > This is my Linus pull tree, so I'm following > > Documentation/maintainer/rebasing-and-merging.rst Fair enough. > > Meanwhile I'll be able to give DECstation figures only. I guess such > > limited results will suffice if I post the fix as an update rather than > > replacement. > > thank you. I have requested 3/4 to be backported however, so I think in this case 4/4 will have to follow as well as 2/2 of the fix series I have just posted. Will you be able to resolve this somehow? Maciej
On Thu, 22 Apr 2021, Maciej W. Rozycki wrote: > NB, I have benchmarked the update with my DECstation, however my Malta > has not come up after a reboot last evening and neither it has after a few > remote power cycles. I have planned a visit in my lab next week anyway, > so I'll see what has happened there; hopefully I'm able to bring the board > back to life as I find it valuable for my purposes. I had to replace the > PSU it came with already a couple years back and the new one is supposedly > high-quality, so I fear it's the board itself. For the record I have put the Malta back in service now. It was weird: the YAMON status shown on the ASCII display was "E:P_UNKN", i.e. "Unknown PCI device on board (should never happen)" (according to the manual). Indeed I never saw that message before. By elimination I have tracked down the DEFPA FDDI network interface to be the culprit and upon a close inspection found a bent pin with the PFI ASIC (the PCI interface chip), a 160-pin 20-mil PQFP part. It was IDSEL shorting to RST_L, the adjacent pin. No wonder the card wreaked havoc with the system. Fixed by carefully moving the pins apart with a razor blade (always good to have one at hand!). Now I wonder how this card happened to work with the pin bent for some 15 years and only finally developed a short now. Weird indeed, and weirder even to see a second failure related to IC pins this year only. Maciej
Index: linux-3maxp-div64/arch/mips/include/asm/div64.h =================================================================== --- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h +++ linux-3maxp-div64/arch/mips/include/asm/div64.h @@ -68,9 +68,11 @@ \ __high = __div >> 32; \ __low = __div; \ - __upper = __high; \ \ - if (__high) { \ + if (__high < __radix) { \ + __upper = __high; \ + __high = 0; \ + } else { \ __asm__("divu $0, %z1, %z2" \ : "=x" (__modquot) \ : "Jr" (__high), "Jr" (__radix)); \
We already check the high part of the divident against zero to avoid the costly DIVU instruction in that case, needed to reduce the high part of the divident, so we may well check against the divisor instead and set the high part of the quotient to zero right away. We need to treat the high part the divident in that case though as the remainder that would be calculated by the DIVU instruction we avoided. This has passed correctness verification with test_div64 and reduced the module's average execution time down to 1.0445s and 0.2619s from 1.0668s and 0.2629s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> --- I have made an experimental change on top of this to put `__div64_32' out of line, and that increases the averages respectively up to 1.0785s and 0.2705s. Not a terrible loss, especially compared to generic times quoted with 3/4, but still, so I think it would best be made where optimising for size, as noted in the cover letter. --- arch/mips/include/asm/div64.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)