diff mbox series

chainlint.pl: Extend regexp pattern for /proc/cpuinfo on Linux SPARC

Message ID 20240520111109.99882-1-glaubitz@physik.fu-berlin.de (mailing list archive)
State New
Headers show
Series chainlint.pl: Extend regexp pattern for /proc/cpuinfo on Linux SPARC | expand

Commit Message

John Paul Adrian Glaubitz May 20, 2024, 11:11 a.m. UTC
On SPARC systems running Linux, individual processors are denoted with
"CPUnn:" in /proc/cpuinfo instead of the usual "processor NN:" so that
the current regexp in ncores() returns 0. Extend the regexp to match
lines with "CPUnn:" as well to properly detect the number of available
cores on these systems.

Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
---
 t/chainlint.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Junio C Hamano May 20, 2024, 4:16 p.m. UTC | #1
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> writes:

> On SPARC systems running Linux, individual processors are denoted with
> "CPUnn:" in /proc/cpuinfo instead of the usual "processor NN:" so that
> the current regexp in ncores() returns 0. Extend the regexp to match
> lines with "CPUnn:" as well to properly detect the number of available
> cores on these systems.
>
> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> ---
>  t/chainlint.pl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/chainlint.pl b/t/chainlint.pl
> index 556ee91a15..63cac942ac 100755
> --- a/t/chainlint.pl
> +++ b/t/chainlint.pl
> @@ -718,7 +718,7 @@ sub ncores {
>  	# Windows
>  	return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
>  	# Linux / MSYS2 / Cygwin / WSL
> -	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
> +	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:||^CPU[\d]*:/, <>)); } if -r '/proc/cpuinfo';

Is the doubled || intended?  Doesn't it introduce an empty pattern
that slurps every single line of /proc/cpuinfo?

I was wondering if we want to first add the "reasonable fallback"
Eric mentioned ealier, and then build on top, whose result may look
like the attached.  You can enable the STDERR thing with your double
"||" added back and see what "cd t && perl chainlint.pl" produces.

Thanks.

diff --git i/t/chainlint.pl w/t/chainlint.pl
index 556ee91a15..775f06281b 100755
--- i/t/chainlint.pl
+++ w/t/chainlint.pl
@@ -718,7 +718,13 @@ sub ncores {
 	# Windows
 	return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
 	# Linux / MSYS2 / Cygwin / WSL
-	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
+	do {
+		local @ARGV='/proc/cpuinfo';
+		my @num = grep(/^processor[\s\d]*:|^CPU[\d]*:/, <>);
+# print STDERR "FOUND <@num>\n";
+		return 1 if (!@num);
+		return scalar(@num);
+	} if -r '/proc/cpuinfo';
 	# macOS & BSD
 	return qx/sysctl -n hw.ncpu/ if $^O =~ /(?:^darwin$|bsd)/;
 	return 1;
John Paul Adrian Glaubitz May 20, 2024, 4:48 p.m. UTC | #2
Hi Junio,

On Mon, 2024-05-20 at 09:16 -0700, Junio C Hamano wrote:
> John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> writes:
> 
> > On SPARC systems running Linux, individual processors are denoted with
> > "CPUnn:" in /proc/cpuinfo instead of the usual "processor NN:" so that
> > the current regexp in ncores() returns 0. Extend the regexp to match
> > lines with "CPUnn:" as well to properly detect the number of available
> > cores on these systems.
> > 
> > Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> > ---
> >  t/chainlint.pl | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/t/chainlint.pl b/t/chainlint.pl
> > index 556ee91a15..63cac942ac 100755
> > --- a/t/chainlint.pl
> > +++ b/t/chainlint.pl
> > @@ -718,7 +718,7 @@ sub ncores {
> >  	# Windows
> >  	return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
> >  	# Linux / MSYS2 / Cygwin / WSL
> > -	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
> > +	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:||^CPU[\d]*:/, <>)); } if -r '/proc/cpuinfo';
> 
> Is the doubled || intended?  Doesn't it introduce an empty pattern
> that slurps every single line of /proc/cpuinfo?

I'm not a Perl expert by any means, so I wasn't sure what the correct logical OR
operator would be. If it turns out to be wrong, let's fix that.

> I was wondering if we want to first add the "reasonable fallback"
> Eric mentioned ealier, and then build on top, whose result may look
> like the attached.  You can enable the STDERR thing with your double
> "||" added back and see what "cd t && perl chainlint.pl" produces.
> 
> Thanks.
> 
> diff --git i/t/chainlint.pl w/t/chainlint.pl
> index 556ee91a15..775f06281b 100755
> --- i/t/chainlint.pl
> +++ w/t/chainlint.pl
> @@ -718,7 +718,13 @@ sub ncores {
>  	# Windows
>  	return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
>  	# Linux / MSYS2 / Cygwin / WSL
> -	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
> +	do {
> +		local @ARGV='/proc/cpuinfo';
> +		my @num = grep(/^processor[\s\d]*:|^CPU[\d]*:/, <>);
> +# print STDERR "FOUND <@num>\n";
> +		return 1 if (!@num);
> +		return scalar(@num);
> +	} if -r '/proc/cpuinfo';
>  	# macOS & BSD
>  	return qx/sysctl -n hw.ncpu/ if $^O =~ /(?:^darwin$|bsd)/;
>  	return 1;

This seems to work fine for me as well. If you post it as a patch, I'm more than
happy to give it a Tested-By.

Btw, it would be great if this could be extended to support the output for the
Alpha architecture as well since the testsuite fails the same way [1]. The output
for /proc/cpuinfo looks like this [2]:

(alpha-chroot)root@p100:/# cat /proc/cpuinfo
cpu                     : Alpha
cpu model               : ev67
cpu variation           : 0
cpu revision            : 0
cpu serial number       : JA00000000
system type             : QEMU
system variation        : QEMU_v8.0.92
system revision         : 0
system serial number    : AY00000000
cycle frequency [Hz]    : 250000000
timer frequency [Hz]    : 250.00
page size [bytes]       : 8192
phys. address bits      : 44
max. addr. space #      : 255
BogoMIPS                : 2500.00
platform string         : AlphaServer QEMU user-mode VM
cpus detected           : 8
cpus active             : 4
cpu active mask         : 0000000000000095
L1 Icache               : n/a
L1 Dcache               : n/a
L2 cache                : n/a
L3 cache                : n/a

Thanks so much for helping with the fix!

Adrian

> [1] https://buildd.debian.org/status/fetch.php?pkg=git&arch=alpha&ver=1%3A2.45.1-1&stamp=1716194983&raw=0
> [2] https://lore.kernel.org/all/20230901204251.137307-4-richard.henderson@linaro.org/
Eric Sunshine May 20, 2024, 4:50 p.m. UTC | #3
On Mon, May 20, 2024 at 12:16 PM Junio C Hamano <gitster@pobox.com> wrote:
> John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> writes:
> > On SPARC systems running Linux, individual processors are denoted with
> > "CPUnn:" in /proc/cpuinfo instead of the usual "processor NN:" so that
> > the current regexp in ncores() returns 0. Extend the regexp to match
> > lines with "CPUnn:" as well to properly detect the number of available
> > cores on these systems.
> >
> > Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> > ---
> > diff --git a/t/chainlint.pl b/t/chainlint.pl
> > @@ -718,7 +718,7 @@ sub ncores {
> >       # Windows
> >       return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
> >       # Linux / MSYS2 / Cygwin / WSL
> > -     do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
> > +     do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:||^CPU[\d]*:/, <>)); } if -r '/proc/cpuinfo';
>
> Is the doubled || intended?  Doesn't it introduce an empty pattern
> that slurps every single line of /proc/cpuinfo?

I was also wondering about `||`; it looks like a typo.

More importantly, though, we can simplify the pattern to:

    /^(processor|CPU)[\s\d]*:/

which is much easier to comprehend and gives correct results from the
SPARC /proc/cpuinfo output.

> I was wondering if we want to first add the "reasonable fallback"
> Eric mentioned ealier, and then build on top, whose result may look
> like the attached.

I'm fine with a well-focused patch which just fixes the reported
problem; the "reasonable fallback" change can be layered atop at any
time. But, of course, if Adrian wants to tackle it as part of this
series, I would not object.

> diff --git i/t/chainlint.pl w/t/chainlint.pl
> @@ -718,7 +718,13 @@ sub ncores {
>         # Windows
>         return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
>         # Linux / MSYS2 / Cygwin / WSL
> -       do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
> +       do {
> +               local @ARGV='/proc/cpuinfo';
> +               my @num = grep(/^processor[\s\d]*:|^CPU[\d]*:/, <>);
> +# print STDERR "FOUND <@num>\n";
> +               return 1 if (!@num);
> +               return scalar(@num);
> +       } if -r '/proc/cpuinfo';
>         # macOS & BSD
>         return qx/sysctl -n hw.ncpu/ if $^O =~ /(?:^darwin$|bsd)/;
>         return 1;

I had a more all-inclusive change in mind. These number-of-cpu checks
are in order from least to most costly but they are not necessarily
mutually exclusive. As such, my thinking was that the logic would fall
through to the next check if the preceding check produced zero or
nonsense.
Eric Sunshine May 20, 2024, 4:52 p.m. UTC | #4
On Mon, May 20, 2024 at 12:48 PM John Paul Adrian Glaubitz
<glaubitz@physik.fu-berlin.de> wrote:
> Btw, it would be great if this could be extended to support the output for the
> Alpha architecture as well since the testsuite fails the same way [1]. The output
> for /proc/cpuinfo looks like this [2]:
>
> (alpha-chroot)root@p100:/# cat /proc/cpuinfo
> cpus detected           : 8
> cpus active             : 4

What would be the correct answer for this case? 4 or 8?
Eric Sunshine May 20, 2024, 5:07 p.m. UTC | #5
On Mon, May 20, 2024 at 12:50 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Mon, May 20, 2024 at 12:16 PM Junio C Hamano <gitster@pobox.com> wrote:
> >         # Windows
> >         return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
> >         # Linux / MSYS2 / Cygwin / WSL
> > -       do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
> > +       do {
> > +               local @ARGV='/proc/cpuinfo';
> > +               my @num = grep(/^processor[\s\d]*:|^CPU[\d]*:/, <>);
> > +# print STDERR "FOUND <@num>\n";
> > +               return 1 if (!@num);
> > +               return scalar(@num);
> > +       } if -r '/proc/cpuinfo';
> >         # macOS & BSD
> >         return qx/sysctl -n hw.ncpu/ if $^O =~ /(?:^darwin$|bsd)/;
> >         return 1;
>
> I had a more all-inclusive change in mind. These number-of-cpu checks
> are in order from least to most costly but they are not necessarily
> mutually exclusive. As such, my thinking was that the logic would fall
> through to the next check if the preceding check produced zero or
> nonsense.

Hmph. Looking at this more closely, I guess I did make these more
mutually-exclusive than I had thought, so falling through to the next
check probably doesn't buy us much. In any case, for robustness, I
still think that each check needs to have a sensible fallback. An
alternative would be for the caller of ncores() to fallback to 1 if
ncores() returns 0 (or nonsense).
Junio C Hamano May 20, 2024, 5:23 p.m. UTC | #6
Eric Sunshine <sunshine@sunshineco.com> writes:

>> I was wondering if we want to first add the "reasonable fallback"
>> Eric mentioned ealier, and then build on top, whose result may look
>> like the attached.
>
> I'm fine with a well-focused patch which just fixes the reported
> problem; the "reasonable fallback" change can be layered atop at any
> time.

Yeah, I never suggested to do these in a single patch.  Since I
would think that it is easier to do and review a patch that cleans
up the code and adds a reasonable fallback before adding new support
for sparc or alpha (after all, such a clean-up is also for longer
term maintainability---by definition, it must be easier to add new
support to the result of a clean-up than the original, or it is not
a clean-up), I suggested to first add such a change.  What you saw
was how the result of "then build on top" would have looked like.

> I had a more all-inclusive change in mind. These number-of-cpu checks
> are in order from least to most costly but they are not necessarily
> mutually exclusive. As such, my thinking was that the logic would fall
> through to the next check if the preceding check produced zero or
> nonsense.

OK.  All the more reason to clean-up first, then?  If we pile more
on top of the current structure, it would make the later clean-up
more cumbersome, wouldn't it?

Thanks.
diff mbox series

Patch

diff --git a/t/chainlint.pl b/t/chainlint.pl
index 556ee91a15..63cac942ac 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -718,7 +718,7 @@  sub ncores {
 	# Windows
 	return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
 	# Linux / MSYS2 / Cygwin / WSL
-	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:/, <>)); } if -r '/proc/cpuinfo';
+	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor[\s\d]*:||^CPU[\d]*:/, <>)); } if -r '/proc/cpuinfo';
 	# macOS & BSD
 	return qx/sysctl -n hw.ncpu/ if $^O =~ /(?:^darwin$|bsd)/;
 	return 1;