
[v3] gitweb: Replace <base> tag with full URLs (when using PATH_INFO)

Message ID 20200712183329.3358-1-tobi@isticktoit.net (mailing list archive)
State New, archived

Commit Message

Tobias Girstmair July 12, 2020, 6:33 p.m. UTC
using a base tag has the side-effect of not just changing the few URLs
of gitweb's static resources, but all other relative links (e.g. those
in a README.html), too.

Signed-off-by: Tobias Girstmair <tobi@isticktoit.net>
---
Apologies; missed a typo.

	tobias

 gitweb/gitweb.perl | 36 ++++++++++++++++++++++--------------
 1 file changed, 22 insertions(+), 14 deletions(-)

Comments

Junio C Hamano July 12, 2020, 9 p.m. UTC | #1
Tobias Girstmair <tobi@isticktoit.net> writes:

> using a base tag has the side-effect of not just changing the few URLs
> of gitweb's static resources, but all other relative links (e.g. those
> in a README.html), too.

Sorry, but I am not sure the description is understandable to the
intended readers of this sentence.

Where does this README.html come from?  

Is it stored in the history of the repository as a blob, and sent to
the browser with a call to git_blob_plain() sub?  Wouldn't that
codepath send the untrusted end-user data as an attachment, in which
case relative links in the blob do not get resolved relative to the
base URL anyway, no?
Tobias Girstmair July 12, 2020, 11:05 p.m. UTC | #2
On Sun, Jul 12, 2020 at 02:00:01PM -0700, Junio C Hamano wrote:
>Sorry, but I am not sure the description is understandable to the
>intended readers of this sentence.
>
>Where does this README.html come from?

gitweb reads a README.html from each repository to display on the 
summary page. 'man 1 gitweb' has a paragraph on it under "Per-repository 
gitweb configuration".

>Is it stored in the history of the repository as a blob, and sent to
>the browser with a call to git_blob_plain() sub?  Wouldn't that

No, it's a plain file in a bare repository, placed there either manually 
or by a post-update hook.

>codepath send the untrusted end-user data as an attachment, in which
>case relative links in the blob do not get resolved relative to the
>base URL anyway, no?

I'm not exactly sure what you're saying. gitweb includes the README.html 
as-is (i.e. without escaping). If the user wanted to include an image, 
they'd write <img src="blob_plain/HEAD:/image.png"> (assuming this patch 
landed). In practice, these URLs will be rewritten by the
markdown-to-html converter.
Junio C Hamano July 13, 2020, 4:34 a.m. UTC | #3
Tobias Girstmair <tobi@isticktoit.net> writes:

> On Sun, Jul 12, 2020 at 02:00:01PM -0700, Junio C Hamano wrote:
>>Sorry, but I am not sure the description is understandable to the
>>intended readers of this sentence.
>>
>>Where does this README.html come from?
>
> gitweb reads a README.html from each repository to display on the
> summary page. 'man 1 gitweb' has a paragraph on it under
> "Per-repository gitweb configuration".
>
>>Is it stored in the history of the repository as a blob, and sent to
>>the browser with a call to git_blob_plain() sub?  Wouldn't that
>
> No, it's a plain file in a bare repository, placed there either
> manually or by a post-update hook.

OK.

> ... If the user wanted to
> include an image, they'd write <img src="blob_plain/HEAD:/image.png">
> (assuming this patch landed).

And without this patch, the src URL needs to know where this
repository appears in the site's URL namespace?

If that is the case, the change makes quite a lot of sense.

Thanks.
Tobias Girstmair July 13, 2020, 9:29 a.m. UTC | #4
On Sun, Jul 12, 2020 at 09:34:36PM -0700, Junio C Hamano wrote:
>And without this patch, the src URL needs to know where this
>repository appears in the site's URL namespace?

Exactly. Sorry if my lack of proper terminology made this confusing.

>If that is the case, the change makes quite a lot of sense.
>
>Thanks.
>

	tobias
Junio C Hamano July 13, 2020, 2:44 p.m. UTC | #5
Tobias Girstmair <tobi@isticktoit.net> writes:

> On Sun, Jul 12, 2020 at 09:34:36PM -0700, Junio C Hamano wrote:
>>And without this patch, the src URL needs to know where this
>>repository appears in the site's URL namespace?
>
> Exactly. Sorry if my lack of proper terminology made this confusing.

No, I was confused because I did not know Gitweb showed README.html
file in the $GIT_DIR of each repository.  Any other cases that are
affected by this, or is README the only one?
Tobias Girstmair July 13, 2020, 2:59 p.m. UTC | #6
On Mon, Jul 13, 2020 at 07:44:45AM -0700, Junio C Hamano wrote:
>Any other cases that are
>affected by this, or is README the only one?
>

README.html is the only per-repository file, but gitweb has some 
(unset/nonexistent by default) config variables to include more HTML:
$site_header, $site_footer, $home_text. These are described in 
'man 5 gitweb.conf' under "Changing gitweb's look". grepping for 
'insert_file' reveals them in the code.

	tobias
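
For readers unfamiliar with these variables, a gitweb.conf fragment along the following lines would enable the extra HTML includes mentioned above. This is only a sketch: the file paths are hypothetical, and 'man 5 gitweb.conf' remains the authority on how each path is resolved.

```perl
# gitweb.conf is itself Perl. The variable names come from the message
# above; every path below is a made-up example, not a default.
our $site_header = "/etc/gitweb/site-header.html";  # HTML included at the top of each page
our $site_footer = "/etc/gitweb/site-footer.html";  # HTML included at the bottom of each page
our $home_text   = "/etc/gitweb/indextext.html";    # HTML shown on the projects overview page
```

Relative links inside any of these files are what a <base> tag would silently redirect, just as with a per-repository README.html.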
Tobias Girstmair Nov. 20, 2020, 3:19 p.m. UTC | #7
On 13/07/2020 06:34, Junio C Hamano wrote:
[...]
> And without this patch, the src URL needs to know where this
> repository appears in the site's URL namespace?
> 
> If that is the case, the change makes quite a lot of sense.

Hi, this patch probably fell through the cracks. Can it please be 
considered for merging again?

As a reminder, here's a link to the patch:
   https://lore.kernel.org/git/20200712183329.3358-1-tobi@isticktoit.net/

Thanks,
	tobias

Patch

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 0959a78..f426060 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1616,6 +1616,19 @@  sub esc_url {
 	return $str;
 }
 
+# the stylesheet, favicon etc urls won't work correctly with path_info
+# unless we set the appropriate base URL. not using a <base> tag to not
+# also change relative URLs inserted by the user.
+sub esc_url_base {
+	my $url = shift;
+	my $prefix = $ENV{'PATH_INFO'}? esc_url($base_url)."/" : "";
+	if ($url !~ m{^(https?:)?//?}) {
+		return $prefix . esc_url($url);
+	} else {
+		return esc_url($url);
+	}
+}
+
 # quote unsafe characters in HTML attributes
 sub esc_attr {
 
@@ -2232,7 +2245,7 @@  sub git_get_avatar {
 		return $pre_white .
 		       "<img width=\"$size\" " .
 		            "class=\"avatar\" " .
-		            "src=\"".esc_url($url)."\" " .
+		            "src=\"".esc_url_base($url)."\" " .
 			    "alt=\"\" " .
 		       "/>" . $post_white;
 	} else {
@@ -4099,17 +4112,17 @@  sub print_header_links {
 	# print out each stylesheet that exist, providing backwards capability
 	# for those people who defined $stylesheet in a config file
 	if (defined $stylesheet) {
-		print '<link rel="stylesheet" type="text/css" href="'.esc_url($stylesheet).'"/>'."\n";
+		print '<link rel="stylesheet" type="text/css" href="'.esc_url_base($stylesheet).'"/>'."\n";
 	} else {
 		foreach my $stylesheet (@stylesheets) {
 			next unless $stylesheet;
-			print '<link rel="stylesheet" type="text/css" href="'.esc_url($stylesheet).'"/>'."\n";
+			print '<link rel="stylesheet" type="text/css" href="'.esc_url_base($stylesheet).'"/>'."\n";
 		}
 	}
 	print_feed_meta()
 		if ($status eq '200 OK');
 	if (defined $favicon) {
-		print qq(<link rel="shortcut icon" href=").esc_url($favicon).qq(" type="image/png" />\n);
+		print qq(<link rel="shortcut icon" href=").esc_url_base($favicon).qq(" type="image/png" />\n);
 	}
 }
 
@@ -4212,11 +4225,6 @@  sub git_header_html {
 <meta name="robots" content="index, nofollow"/>
 <title>$title</title>
 EOF
-	# the stylesheet, favicon etc urls won't work correctly with path_info
-	# unless we set the appropriate base URL
-	if ($ENV{'PATH_INFO'}) {
-		print "<base href=\"".esc_url($base_url)."\" />\n";
-	}
 	print_header_links($status);
 
 	if (defined $site_html_head_string) {
@@ -4234,7 +4242,7 @@  sub git_header_html {
 	if (defined $logo) {
 		print $cgi->a({-href => esc_url($logo_url),
 		               -title => $logo_label},
-		              $cgi->img({-src => esc_url($logo),
+		              $cgi->img({-src => esc_url_base($logo),
 		                         -width => 72, -height => 27,
 		                         -alt => "git",
 		                         -class => "logo"}));
@@ -4299,7 +4307,7 @@  sub git_footer_html {
 		insert_file($site_footer);
 	}
 
-	print qq!<script type="text/javascript" src="!.esc_url($javascript).qq!"></script>\n!;
+	print qq!<script type="text/javascript" src="!.esc_url_base($javascript).qq!"></script>\n!;
 	if (defined $action &&
 	    $action eq 'blame_incremental') {
 		print qq!<script type="text/javascript">\n!.
@@ -8273,7 +8281,7 @@  sub git_feed {
 		if (defined $logo || defined $favicon) {
 			# prefer the logo to the favicon, since RSS
 			# doesn't allow both
-			my $img = esc_url($logo || $favicon);
+			my $img = esc_url_base($logo || $favicon);
 			print "<image>\n" .
 			      "<url>$img</url>\n" .
 			      "<title>$title</title>\n" .
@@ -8299,11 +8307,11 @@  sub git_feed {
 		      # use project owner for feed author
 		      "<author><name>$owner</name></author>\n";
 		if (defined $favicon) {
-			print "<icon>" . esc_url($favicon) . "</icon>\n";
+			print "<icon>" . esc_url_base($favicon) . "</icon>\n";
 		}
 		if (defined $logo) {
 			# not twice as wide as tall: 72 x 27 pixels
-			print "<logo>" . esc_url($logo) . "</logo>\n";
+			print "<logo>" . esc_url_base($logo) . "</logo>\n";
 		}
 		if (! %latest_date) {
 			# dummy date to keep the feed valid until commits trickle in:
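
The behaviour of the new esc_url_base() helper can be tried outside gitweb with a small standalone script. The sketch below copies the helper from the patch, but everything around it is an assumption made for illustration: the $base_url value, the sample URLs, and in particular esc_url(), which is a simplified stand-in rather than gitweb's real implementation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical value; in gitweb this is derived from the CGI environment.
our $base_url = 'https://git.example.org/gitweb.cgi';

# Simplified stand-in for gitweb's esc_url(): percent-encode characters
# outside a small safe set, then turn spaces into '+'.
sub esc_url {
	my $str = shift;
	$str =~ s/([^A-Za-z0-9\-_.~()\/:@ ])/sprintf('%%%02X', ord($1))/eg;
	$str =~ s/ /+/g;
	return $str;
}

# Copied from the patch: prefix $base_url only when PATH_INFO is in use
# and the URL is neither absolute (http:, https:, //host) nor rooted (/path).
sub esc_url_base {
	my $url = shift;
	my $prefix = $ENV{'PATH_INFO'} ? esc_url($base_url) . "/" : "";
	if ($url !~ m{^(https?:)?//?}) {
		return $prefix . esc_url($url);
	} else {
		return esc_url($url);
	}
}

# Exercise the helper with and without PATH_INFO (sample URLs are made up).
for my $path_info ('', '/project.git') {
	local $ENV{'PATH_INFO'} = $path_info;
	for my $url ('static/gitweb.css',
	             '/static/gitweb.css',
	             'https://cdn.example.org/gitweb.css') {
		printf "PATH_INFO='%s'  %s => %s\n",
		       $path_info, $url, esc_url_base($url);
	}
}
```

Only the first sample URL, a path relative to the script, gains the $base_url prefix, and only when PATH_INFO is set. Rooted and absolute URLs pass through unchanged in both cases, which is why the patch can drop the <base> tag without breaking gitweb's own static resources.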