diff mbox series

[RFH] t9500 failing???

Message ID xmqqr0d2p1x3.fsf@gitster.g (mailing list archive)
State New
Headers show
Series [RFH] t9500 failing??? | expand

Commit Message

Junio C Hamano June 12, 2024, 4:21 p.m. UTC
Suddenly t9500.70 has started failing for me in my local environment
(but $Corp IT folks control pretty much which version of base software
is installed and updated at what time, once I choose "I want to do CGI
in Perl" by selecting the libcgi-pm-perl module, I am not sure from
which version the thing was updated from.  The Debian version claims
to be 4.62-1.

It fails with path-info test, starting at t9500.70 with extra
warning in the log.  This code

	if ($path_info) {
		# $path_info has already been URL-decoded by the web server, but
		# $my_url and $my_uri have not. URL-decode them so we can properly
		# strip $path_info.
		$my_url = unescape($my_url);
		$my_uri = unescape($my_uri);
		if ($my_url =~ s,\Q$path_info\E$,, &&
		    $my_uri =~ s,\Q$path_info\E$,, &&
		    defined $ENV{'SCRIPT_NAME'}) {
			$base_url = $cgi->url(-base => 1) . $ENV{'SCRIPT_NAME'};
		}
	}

before it calls unescape(), I know $my_url is a
http://localhost/gitweb.cgi and after it calls unescape, it becomes
undefined.  That will trigger a "Use of uninitialized value $my_url
in substitution (s///)" warning.

unescape comes from CGI::Util because we do

	use CGI::Util qw(unescape);

early in the program.

As a workaround I locally have the attached patch to disable calling
CGI::Util::unescape implicitly as a sub, and instead make an
explicit call to it as a class method, and it seems to make the
tests pass.  Please do not ask me why it works---the reason why I am
posting this message is to find somebody who can explain it to me ;-)

The "unescape" thing in CGI::Util.pm begins with the standard
boilerplate that lets you call it as a plain-vanilla sub as well as
a class method, like so:

    # unescape URL-encoded data
    sub unescape {
      shift() if @_ > 0 and (ref($[0]) || (defined $[1] && $_[0] eq $CGI::DefaultClass));

but it seems that it has been that way since 2009, so it does not
explain why it started breaking for me all of sudden, even though
it _is_ curious that its counterpart in the same file, escape,
starts slightly differently to (presumably) achieve the same thing.

    sub escape {
      # If we being called in an OO-context, discard the first argument.
      shift() if @_ > 1 and ( ref($[0]) || (defined $[1] && $_[0] eq $CGI::DefaultClass));

Notice that the former does "shift" as long as there is even a
single argument, while the latter does so only when there are at
least two arguments.  Both presumably would take a single argument,
the string to either escape or unescape, and the shift is presumably
to shift away the class object if they are called as class methods,
so the guard at the beginning of unscape looks suspect, but I am not
a Perl person, and as I said, it seems that the code has been that
way since 2009, so it is very likely that I am barking up a wrong
tree.

Anyway.  TIA for whoever explains the solution to this puzzle to me.


 gitweb/gitweb.perl | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Kyle Lippincott June 12, 2024, 6:38 p.m. UTC | #1
On Wed, Jun 12, 2024 at 9:23 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Suddenly t9500.70 has started failing for me in my local environment
> (but $Corp IT folks control pretty much which version of base software
> is installed and updated at what time, once I choose "I want to do CGI
> in Perl" by selecting the libcgi-pm-perl module, I am not sure from
> which version the thing was updated from.  The Debian version claims
> to be 4.62-1.

I have the same version (I'm on the same $Corp IT setup). I downgraded
to 4.61 and it succeeded. Upgraded to 4.62 again, and it started
failing. The only difference in 4.62 besides a version number change
in several files is:

```
diff -ur 4.61/usr/share/perl5/CGI.pm 4.62/usr/share/perl5/CGI.pm
--- 4.61/usr/share/perl5/CGI.pm 2024-01-08 07:13:22.000000000 -0800
+++ 4.62/usr/share/perl5/CGI.pm 2024-03-01 05:43:03.000000000 -0800
@@ -1,13 +1,14 @@
 package CGI;
 require 5.008001;
 use Carp 'croak';
+use URI;

 my $appease_cpants_kwalitee = q/
 use strict;
 use warnings;
 #/;

-$CGI::VERSION='4.61';
+$CGI::VERSION='4.62';

 use CGI::Util qw(rearrange rearrange_header make_attributes unescape
escape expires ebcdic2ascii ascii2ebcdic);

@@ -2747,8 +2748,10 @@
     $url .= $path         if $path_info and defined $path;
     $url .= "?$query_str" if $query     and $query_str ne '';
     $url ||= '';
-    $url =~ s/([^a-zA-Z0-9_.%;&?\/\\:+=~-])/sprintf("%%%02X",ord($1))/eg;
-    return $url;
+
+ $url = URI->new( $url )->canonical;
+ $url =~ s!/$!!;
+ return $url
 }

 #### Method: cookie
```

I can confirm that backing that out fixes the issue. I also don't know
enough perl to know what's wrong with that statement, but looking at
the current head version, my guess is that it's the wrong type. The
current head version
(https://github.com/leejo/CGI.pm/blob/89c51a088db2a45b1c759e02c8d4772f5b6a36a9/lib/CGI.pm#L2752)
has `$url = URI->new( $url )->canonical->as_string;`, applying that
patch locally makes this work again. Running blame points us at
https://github.com/leejo/CGI.pm/issues/263 ("->url being a object
breaks everything"). One person even references this gitweb.perl
breakage in the issue comments :)

So my conclusion is that 4.62 is broken, and that newer versions are unbroken.

>
> It fails with path-info test, starting at t9500.70 with extra
> warning in the log.  This code
>
>         if ($path_info) {
>                 # $path_info has already been URL-decoded by the web server, but
>                 # $my_url and $my_uri have not. URL-decode them so we can properly
>                 # strip $path_info.
>                 $my_url = unescape($my_url);
>                 $my_uri = unescape($my_uri);
>                 if ($my_url =~ s,\Q$path_info\E$,, &&
>                     $my_uri =~ s,\Q$path_info\E$,, &&
>                     defined $ENV{'SCRIPT_NAME'}) {
>                         $base_url = $cgi->url(-base => 1) . $ENV{'SCRIPT_NAME'};
>                 }
>         }
>
> before it calls unescape(), I know $my_url is a
> http://localhost/gitweb.cgi and after it calls unescape, it becomes
> undefined.  That will trigger a "Use of uninitialized value $my_url
> in substitution (s///)" warning.
>
> unescape comes from CGI::Util because we do
>
>         use CGI::Util qw(unescape);
>
> early in the program.
>
> As a workaround I locally have the attached patch to disable calling
> CGI::Util::unescape implicitly as a sub, and instead make an
> explicit call to it as a class method, and it seems to make the
> tests pass.  Please do not ask me why it works---the reason why I am
> posting this message is to find somebody who can explain it to me ;-)

I also don't know why this fixes the issue, unfortunately.

>
> The "unescape" thing in CGI::Util.pm begins with the standard
> boilerplate that lets you call it as a plain-vanilla sub as well as
> a class method, like so:
>
>     # unescape URL-encoded data
>     sub unescape {
>       shift() if @_ > 0 and (ref($[0]) || (defined $[1] && $_[0] eq $CGI::DefaultClass));
>
> but it seems that it has been that way since 2009, so it does not
> explain why it started breaking for me all of sudden, even though
> it _is_ curious that its counterpart in the same file, escape,
> starts slightly differently to (presumably) achieve the same thing.
>
>     sub escape {
>       # If we being called in an OO-context, discard the first argument.
>       shift() if @_ > 1 and ( ref($[0]) || (defined $[1] && $_[0] eq $CGI::DefaultClass));
>
> Notice that the former does "shift" as long as there is even a
> single argument, while the latter does so only when there are at
> least two arguments.  Both presumably would take a single argument,
> the string to either escape or unescape, and the shift is presumably
> to shift away the class object if they are called as class methods,
> so the guard at the beginning of unscape looks suspect, but I am not
> a Perl person, and as I said, it seems that the code has been that
> way since 2009, so it is very likely that I am barking up a wrong
> tree.
>
> Anyway.  TIA for whoever explains the solution to this puzzle to me.
>
>
>  gitweb/gitweb.perl | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git c/gitweb/gitweb.perl w/gitweb/gitweb.perl
> index ccd14e0e30..a0a8b79ef4 100755
> --- c/gitweb/gitweb.perl
> +++ w/gitweb/gitweb.perl
> @@ -13,7 +13,7 @@
>  # handle ACL in file access tests
>  use filetest 'access';
>  use CGI qw(:standard :escapeHTML -nosticky);
> -use CGI::Util qw(unescape);
> +use CGI::Util qw();
>  use CGI::Carp qw(fatalsToBrowser set_message);
>  use Encode;
>  use Fcntl ':mode';
> @@ -22,6 +22,11 @@
>  use Time::HiRes qw(gettimeofday tv_interval);
>  use Digest::MD5 qw(md5_hex);
>
> +sub unescape {
> +       my $url = shift;
> +       return CGI::Util->unescape($url);
> +}
> +
>  binmode STDOUT, ':utf8';
>
>  if (!defined($CGI::VERSION) || $CGI::VERSION < 4.08) {
>
Junio C Hamano June 12, 2024, 8:09 p.m. UTC | #2
Kyle Lippincott <spectral@google.com> writes:

> So my conclusion is that 4.62 is broken, and that newer versions are unbroken.

Thanks for digging.
diff mbox series

Patch

diff --git c/gitweb/gitweb.perl w/gitweb/gitweb.perl
index ccd14e0e30..a0a8b79ef4 100755
--- c/gitweb/gitweb.perl
+++ w/gitweb/gitweb.perl
@@ -13,7 +13,7 @@ 
 # handle ACL in file access tests
 use filetest 'access';
 use CGI qw(:standard :escapeHTML -nosticky);
-use CGI::Util qw(unescape);
+use CGI::Util qw();
 use CGI::Carp qw(fatalsToBrowser set_message);
 use Encode;
 use Fcntl ':mode';
@@ -22,6 +22,11 @@ 
 use Time::HiRes qw(gettimeofday tv_interval);
 use Digest::MD5 qw(md5_hex);
 
+sub unescape {
+	my $url = shift;
+	return CGI::Util->unescape($url);
+}
+
 binmode STDOUT, ':utf8';
 
 if (!defined($CGI::VERSION) || $CGI::VERSION < 4.08) {