Message ID | 20220601012647.1439480-1-jason@jasonyundt.email (mailing list archive) |
---|---|
State | Accepted |
Commit | 0e1a85ca7558a9ec6f2e708dcc106c455a50776d |
Series | gitweb: switch to a modern DOCTYPE |
On 2022-06-01 at 01:26:47, Jason Yundt wrote:
> According to the HTML Standard FAQ:
>
> “What is the DOCTYPE for modern HTML documents?
>
> In text/html documents:
>
> <!DOCTYPE html>
>
> In documents delivered with an XML media type: no DOCTYPE is required
> and its use is generally unnecessary. However, you may use one if you
> want (see the following question). Note that the above is well-formed
> XML.”
>
> Source: [1]
>
> Gitweb uses an XHTML 1.0 DOCTYPE:
>
> <!DOCTYPE html PUBLIC
> "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>
> While that DOCTYPE is still valid [2], it has several disadvantages:
>
> 1. It’s misleading. The DTD that browsers are supposed to use with that
> DOCTYPE has nothing to do with XHTML 1.0 and isn’t available at the URL
> that is given [2].

While the WHATWG may claim that, an XML parser is absolutely within its
rights to refer to and use that DTD, and in fact should do so unless its
catalog directs it elsewhere. It may be that some browsers use an
internal catalog that refers to a different DTD, however.

> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 606b50104c..1835487ab2 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -4219,7 +4219,10 @@ sub git_header_html {
>  	my $mod_perl_version = $ENV{'MOD_PERL'} ? " $ENV{'MOD_PERL'}" : '';
>  	print <<EOF;
>  <?xml version="1.0" encoding="utf-8"?>
> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> +<!DOCTYPE html [
> +	<!ENTITY nbsp "&#xA0;">
> +	<!ENTITY sdot "&#x22C5;">
> +]>

I think this should be fine. It defines the entities we need and
appears to be valid XML. I don't think there should be any problem
upgrading to XHTML 5 here.

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

>> While that DOCTYPE is still valid [2], it has several disadvantages:
>>
>> 1. It’s misleading. The DTD that browsers are supposed to use with that
>> DOCTYPE has nothing to do with XHTML 1.0 and isn’t available at the URL
>> that is given [2].
>
> While the WHATWG may claim that, an XML parser is absolutely within its
> rights to refer to and use that DTD, and in fact should do so unless its
> catalog directs it elsewhere. It may be that some browsers use an
> internal catalog that refers to a different DTD, however.
>
>> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
>> index 606b50104c..1835487ab2 100755
>> --- a/gitweb/gitweb.perl
>> +++ b/gitweb/gitweb.perl
>> @@ -4219,7 +4219,10 @@ sub git_header_html {
>>  	my $mod_perl_version = $ENV{'MOD_PERL'} ? " $ENV{'MOD_PERL'}" : '';
>>  	print <<EOF;
>>  <?xml version="1.0" encoding="utf-8"?>
>> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>> +<!DOCTYPE html [
>> +	<!ENTITY nbsp "&#xA0;">
>> +	<!ENTITY sdot "&#x22C5;">
>> +]>
>
> I think this should be fine. It defines the entities we need and
> appears to be valid XML. I don't think there should be any problem
> upgrading to XHTML 5 here.

OK, so in short, the patch text looks OK and the proposed log
message needs a bit more work?

Thanks.
On 6/1/22 08:26, Jason Yundt wrote:
> According to the HTML Standard FAQ:
>
> “What is the DOCTYPE for modern HTML documents?
>
> In text/html documents:
>
> <!DOCTYPE html>
>
> In documents delivered with an XML media type: no DOCTYPE is required
> and its use is generally unnecessary. However, you may use one if you
> want (see the following question). Note that the above is well-formed
> XML.”
>
> Source: [1]
>
> Gitweb uses an XHTML 1.0 DOCTYPE:
>
> <!DOCTYPE html PUBLIC
> "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>
> While that DOCTYPE is still valid [2], it has several disadvantages:
>
> 1. It’s misleading. The DTD that browsers are supposed to use with that
> DOCTYPE has nothing to do with XHTML 1.0 and isn’t available at the URL
> that is given [2].
> 2. It’s obsolete. XHTML 1.0 was last revised in 2002 and was superseded in
> 2018 [3].
> 3. It’s unreliable. Gitweb uses &nbsp; and &sdot; but lets an external file
> define them. “[…U]sing entity references for characters in XML documents
> is unsafe if they are defined in an external file (except for &lt;, &gt;,
> &amp;, &quot;, and &apos;).” [4]
>
> [1]: <https://github.com/whatwg/html/blob/main/FAQ.md#what-is-the-doctype-for-modern-html-documents>
> [2]: <https://html.spec.whatwg.org/multipage/xhtml.html#parsing-xhtml-documents>
> [3]: <https://www.w3.org/TR/xhtml1/#xhtml>
> [4]: <https://html.spec.whatwg.org/multipage/xhtml.html#writing-xhtml-documents>
>
> Signed-off-by: Jason Yundt <jason@jasonyundt.email>

So basically what this patch does is switch to HTML5, right? That is
because I can see DOCTYPE "upgrade" to use "<!DOCTYPE html>", which is
the DOCTYPE for HTML5.

If it does, then mention HTML5 in v2.
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 606b50104c..1835487ab2 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -4219,7 +4219,10 @@ sub git_header_html {
 	my $mod_perl_version = $ENV{'MOD_PERL'} ? " $ENV{'MOD_PERL'}" : '';
 	print <<EOF;
 <?xml version="1.0" encoding="utf-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<!DOCTYPE html [
+	<!ENTITY nbsp "&#xA0;">
+	<!ENTITY sdot "&#x22C5;">
+]>
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
 <!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
 <!-- git core binaries version $git_version -->
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index 8cb582f0e6..81d5625557 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -220,4 +220,18 @@ test_expect_success 'no http-equiv="content-type" in XHTML' '
 	no_http_equiv_content_type "p=.git;a=tree"
 '
 
+proper_doctype() {
+	gitweb_run "$@" &&
+	grep -F "<!DOCTYPE html [" gitweb.body &&
+	grep "<!ENTITY nbsp" gitweb.body &&
+	grep "<!ENTITY sdot" gitweb.body
+}
+
+test_expect_success 'Proper DOCTYPE with entity declarations' '
+	proper_doctype &&
+	proper_doctype "p=.git" &&
+	proper_doctype "p=.git;a=log" &&
+	proper_doctype "p=.git;a=tree"
+'
+
 test_done
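The new test boils down to three greps over the response body. To try the same check outside git's test harness, a minimal sketch might look like the following. It assumes a hand-written sample file in place of real gitweb output; write_sample_body and check_doctype are hypothetical names used only here, and gitweb_run (the test-suite helper that produces gitweb.body) is deliberately left out.

```shell
#!/bin/sh
# Hypothetical stand-in for gitweb output. In the real test, gitweb_run
# writes the response body to gitweb.body; here we write a sample by hand.
write_sample_body() {
	cat >"$1" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html [
	<!ENTITY nbsp "&#xA0;">
	<!ENTITY sdot "&#x22C5;">
]>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
<head><title>sample</title></head>
<body><p>a&nbsp;b</p></body>
</html>
EOF
}

# The same three greps as proper_doctype in the patch, applied to the file
# named by $1. grep -F treats "[" literally instead of as a bracket expression.
check_doctype() {
	grep -F "<!DOCTYPE html [" "$1" >/dev/null &&
	grep "<!ENTITY nbsp" "$1" >/dev/null &&
	grep "<!ENTITY sdot" "$1" >/dev/null
}

body=$(mktemp)
write_sample_body "$body"
if check_doctype "$body"
then
	echo "doctype ok"
else
	echo "doctype missing"
fi
rm -f "$body"
```

Like the test in the patch, this only confirms that the declarations are present as text; it does not check that the document is well-formed XML.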
According to the HTML Standard FAQ:

“What is the DOCTYPE for modern HTML documents?

In text/html documents:

<!DOCTYPE html>

In documents delivered with an XML media type: no DOCTYPE is required
and its use is generally unnecessary. However, you may use one if you
want (see the following question). Note that the above is well-formed
XML.”

Source: [1]

Gitweb uses an XHTML 1.0 DOCTYPE:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

While that DOCTYPE is still valid [2], it has several disadvantages:

1. It’s misleading. The DTD that browsers are supposed to use with that
DOCTYPE has nothing to do with XHTML 1.0 and isn’t available at the URL
that is given [2].
2. It’s obsolete. XHTML 1.0 was last revised in 2002 and was superseded in
2018 [3].
3. It’s unreliable. Gitweb uses &nbsp; and &sdot; but lets an external file
define them. “[…U]sing entity references for characters in XML documents
is unsafe if they are defined in an external file (except for &lt;, &gt;,
&amp;, &quot;, and &apos;).” [4]

[1]: <https://github.com/whatwg/html/blob/main/FAQ.md#what-is-the-doctype-for-modern-html-documents>
[2]: <https://html.spec.whatwg.org/multipage/xhtml.html#parsing-xhtml-documents>
[3]: <https://www.w3.org/TR/xhtml1/#xhtml>
[4]: <https://html.spec.whatwg.org/multipage/xhtml.html#writing-xhtml-documents>

Signed-off-by: Jason Yundt <jason@jasonyundt.email>
---
 gitweb/gitweb.perl                        |  5 ++++-
 t/t9502-gitweb-standalone-parse-output.sh | 14 ++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)
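To illustrate the third point, the sketch below lists the named entity references in a file that the file does not declare itself. It is a rough text heuristic under stated assumptions, not an XML parser: it assumes at most one <!ENTITY ...> declaration per line, ignores comments, CDATA sections and parameter entities, and relies on grep -o, which is widely available but not required by POSIX. The function name undeclared_entities and the two sample files are hypothetical.

```shell
#!/bin/sh
# Print each named entity reference used in file $1 that is neither one of
# XML's five predefined entities nor declared in the file via <!ENTITY name ...>.
undeclared_entities() {
	# Names declared in the file (assumes one declaration per line).
	declared=$(sed -n 's/.*<!ENTITY[[:space:]][[:space:]]*\([A-Za-z][A-Za-z0-9]*\).*/\1/p' "$1")
	# Numeric references such as &#xA0; do not match this pattern.
	grep -o '&[A-Za-z][A-Za-z0-9]*;' "$1" |
	tr -d '&;' |
	sort -u |
	while read -r name
	do
		case "$name" in
		lt|gt|amp|quot|apos) continue ;;
		esac
		# -x -F: whole-line, literal match against the declared names.
		printf '%s\n' "$declared" | grep -qxF "$name" || echo "$name"
	done
}

old=$(mktemp)
new=$(mktemp)

# Old style: the entities are only defined by the external DTD.
cat >"$old" <<'EOF'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><body><p>a&nbsp;&sdot;&nbsp;b &amp; c</p></body></html>
EOF

# New style: the entities are declared in the internal subset.
cat >"$new" <<'EOF'
<!DOCTYPE html [
	<!ENTITY nbsp "&#xA0;">
	<!ENTITY sdot "&#x22C5;">
]>
<html xmlns="http://www.w3.org/1999/xhtml"><body><p>a&nbsp;&sdot;&nbsp;b &amp; c</p></body></html>
EOF

echo "old DOCTYPE, undeclared: $(undeclared_entities "$old" | tr '\n' ' ')"
echo "new DOCTYPE, undeclared: $(undeclared_entities "$new" | tr '\n' ' ')"
rm -f "$old" "$new"
```

With the old DOCTYPE, nbsp and sdot depend on whatever the consumer loads for the external DTD; with the new one the list comes back empty because the document carries its own declarations.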