diff mbox series

doc: make HTML manual reproducible

Message ID 20201201095037.20715-1-arnout@bzzt.net (mailing list archive)
State New, archived
Headers show
Series doc: make HTML manual reproducible | expand

Commit Message

Arnout Engelen Dec. 1, 2020, 9:50 a.m. UTC
This makes sure the generated id's inside the html version of the
documentation use the same id's when the same version of the
manual is generated twice.

Signed-off-by: Arnout Engelen <arnout@bzzt.net>
---
 Documentation/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Todd Zullinger Dec. 1, 2020, 3:41 p.m. UTC | #1
Hi Arnout,

[cc: brian, Martin, and peff, for their collective wisdom in
the area of docs and involvement in the last discussion of
docbook-xsl requirements.]

Arnout Engelen wrote:
> This makes sure the generated id's inside the html version of the
> documentation use the same id's when the same version of the
> manual is generated twice.
> 
> Signed-off-by: Arnout Engelen <arnout@bzzt.net>
> ---
>  Documentation/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 80d1908a44..4d1fd5e31f 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -380,7 +380,7 @@ SubmittingPatches.txt: SubmittingPatches
>  	$(QUIET_GEN) cp $< $@
>  
>  XSLT = docbook.xsl
> -XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css
> +XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css --stringparam generate.consistent.ids 1
>  
>  user-manual.html: user-manual.xml $(XSLT)
>  	$(QUIET_XSLTPROC)$(RM) $@+ $@ && \

I think this would raise the minimum supported version of
docbook-xsl to 1.77.1.  That might be fine, but we'd
probably want to make sure it doesn't negatively impact
OS/distributions which build the docs as a likely group who
care about reproducible builds.  And we'd want to update the
requirement in INSTALL, of course.

The minimum docbook-xsl version was raised from 1.73 to
1.74, in 5a80d85bbe (INSTALL: drop support for docbook-xsl
before 1.74, 2020-03-29).  That change was discussed in
<cover.1585486103.git.martin.agren@gmail.com>¹.

AFAICT, the generate.consistent.ids param was added in
docbook-xsl-1.77.1 which was released in June 2012.  The
commit which added it is 74735098e (New param to support
replacing generate-id() with xsl:number for more consistent
id values., 2011-10-24).

In any case, a minimum of 1.77.1 is present in the supported
releases of CentOS/RHEL and Debian/Ubuntu, at least (most
have 1.79.x).  Those are certainly not the only systems Git
cares about; they're simply the systems with which I am at
least mildly familiar.

¹ https://lore.kernel.org/git/cover.1585486103.git.martin.agren@gmail.com/
Martin Ågren Dec. 1, 2020, 7:18 p.m. UTC | #2
On Tue, 1 Dec 2020 at 16:41, Todd Zullinger <tmz@pobox.com> wrote:

> Arnout Engelen wrote:
> > This makes sure the generated id's inside the html version of the
> > documentation use the same id's when the same version of the
> > manual is generated twice.

Thanks for this. Your mail still hasn't reached me through the list, so
I'm replying to Todd's reply here.

Looking through the word-diff of the resulting user-manual.html, I see
basically three changes. I've applied some light copy-editing to try to
make them more readable (the lack of spaces means the notion of "word"
diff is a bit off). Do you also have these three changes?

  1. <a name=[-"idp1"-]{+"id-1"+}></a>Git User Manual</h1>

  2. <a href=[-"#idp2"-]{+"#id-1.2"+}></a>

  3. <a name=[-"idp2"-]{+"id-1.2"+}></a>

These names and links don't strike me as particularly useful. But even
so, reproducibility makes sense. I guess you're not really after linking
to these, but want to have binary-identical user-manual.html when
building from the same commit?

> > @@ -380,7 +380,7 @@ SubmittingPatches.txt: SubmittingPatches
> >       $(QUIET_GEN) cp $< $@
> >
> >  XSLT = docbook.xsl
> > -XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css
> > +XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css --stringparam generate.consistent.ids 1

Nit: Maybe add a new line

  XSLTOPTS += --stringparam generate.consistent.ids 1

instead to keep within 80 characters.

> I think this would raise the minimum supported version of
> docbook-xsl to 1.77.1.  That might be fine, but we'd
> probably want to make sure it doesn't negatively impact
> OS/distributions which build the docs as a likely group who
> care about reproducible builds.  And we'd want to update the
> requirement in INSTALL, of course.

Thanks for digging out the version number. Agreed that this change
should be reflected in INSTALL.

> The minimum docbook-xsl version was raised from 1.73 to
> 1.74, in 5a80d85bbe (INSTALL: drop support for docbook-xsl
> before 1.74, 2020-03-29).  That change was discussed in
> <cover.1585486103.git.martin.agren@gmail.com>¹.

As mentioned there, it's not like I went out and tested with 1.74. So
whether we *actually* support 1.74 right now is anyone's guess.

> AFAICT, the generate.consistent.ids param was added in
> docbook-xsl-1.77.1 which was released in June 2012.  The
> commit which added it is 74735098e (New param to support
> replacing generate-id() with xsl:number for more consistent
> id values., 2011-10-24).
>
> In any case, a minimum of 1.77.1 is present in the supported
> releases of CentOS/RHEL and Debian/Ubuntu, at least (most
> have 1.79.x).  Those are certainly not the only systems Git
> cares about; they're simply the systems with which I am at
> least mildly familiar.

Seems to me like we can go with a change like this. If you're on a
pre-1.77.1 system and want user-manual.html, but don't want to update
your tool chain, you should still be able to clone the "htmldocs" repo.
I'm not sure exactly where the cut-off point is, but I think moving up
to mid-2012 should be ok.

Martin
Junio C Hamano Dec. 1, 2020, 7:27 p.m. UTC | #3
Todd Zullinger <tmz@pobox.com> writes:

> Hi Arnout,
>
> [cc: brian, Martin, and peff, for their collective wisdom in
> the area of docs and involvement in the last discussion of
> docbook-xsl requirements.]
>
> Arnout Engelen wrote:
>> This makes sure the generated id's inside the html version of the
>> documentation use the same id's when the same version of the
>> manual is generated twice.
>> 
>> Signed-off-by: Arnout Engelen <arnout@bzzt.net>
>> ---
>>  Documentation/Makefile | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/Documentation/Makefile b/Documentation/Makefile
>> index 80d1908a44..4d1fd5e31f 100644
>> --- a/Documentation/Makefile
>> +++ b/Documentation/Makefile
>> @@ -380,7 +380,7 @@ SubmittingPatches.txt: SubmittingPatches
>>  	$(QUIET_GEN) cp $< $@
>>  
>>  XSLT = docbook.xsl
>> -XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css
>> +XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css --stringparam generate.consistent.ids 1
>>  
>>  user-manual.html: user-manual.xml $(XSLT)
>>  	$(QUIET_XSLTPROC)$(RM) $@+ $@ && \
>
> I think this would raise the minimum supported version of
> docbook-xsl to 1.77.1.  That might be fine, but we'd
> probably want to make sure it doesn't negatively impact
> OS/distributions which build the docs as a likely group who
> care about reproducible builds.  And we'd want to update the
> requirement in INSTALL, of course.
>
> The minimum docbook-xsl version was raised from 1.73 to
> 1.74, in 5a80d85bbe (INSTALL: drop support for docbook-xsl
> before 1.74, 2020-03-29).  That change was discussed in
> <cover.1585486103.git.martin.agren@gmail.com>¹.
>
> AFAICT, the generate.consistent.ids param was added in
> docbook-xsl-1.77.1 which was released in June 2012.  The
> commit which added it is 74735098e (New param to support
> replacing generate-id() with xsl:number for more consistent
> id values., 2011-10-24).
>
> In any case, a minimum of 1.77.1 is present in the supported
> releases of CentOS/RHEL and Debian/Ubuntu, at least (most
> have 1.79.x).  Those are certainly not the only systems Git
> cares about; they're simply the systems with which I am at
> least mildly familiar.
>
> ¹ https://lore.kernel.org/git/cover.1585486103.git.martin.agren@gmail.com/

I think it is in general a good thing to do (can we lose some lines
from doc-diff, I wonder?) and many of the result of your study shown
above should be in the log message in a summarised form.  Here is a
starter.

    Generated ID's in the HTML version of the documentation does not
    stay the same even when they are generated from the identical
    source.  Pass generate.consistent.ids=1 to the xsltproc tool to
    make them stable, which would help reproducible build product.

    We currently only support docbook-xsl version 1.74 or newer, but
    this option requires version at least 1.77.1 (released in June
    2012), which would be OK for supported releases of CentOS/RHEL
    and Debian/Ubuntu (most have 1.79.x).

We may probably want to talk about versions commonly distributed for
macOS and BSDs as well (I do not know if we want to list minor Linux
distros).  And we definitely need to update INSTALL as you pointed
out.

Thanks and thanks for CC'ing those who may know the area well.
Martin Ågren Dec. 1, 2020, 7:57 p.m. UTC | #4
On Tue, 1 Dec 2020 at 20:27, Junio C Hamano <gitster@pobox.com> wrote:
>
> Todd Zullinger <tmz@pobox.com> writes:
>
> > Arnout Engelen wrote:
> >> This makes sure the generated id's inside the html version of the
> >> documentation use the same id's when the same version of the
> >> manual is generated twice.

> >>  XSLT = docbook.xsl
> >> -XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css
> >> +XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css --stringparam generate.consistent.ids 1
> >>
> >>  user-manual.html: user-manual.xml $(XSLT)
> >>      $(QUIET_XSLTPROC)$(RM) $@+ $@ && \
> >

> I think it is in general a good thing to do (can we lose some lines
> from doc-diff, I wonder?) and many of the result of your study shown

This is only about user-manual.html, which is outside the scope of our
doc-diff tool, both by being html not man, and by being user-manual. So
no, no reduction in any doc-diff, unfortunately.

Martin
brian m. carlson Dec. 2, 2020, 12:36 a.m. UTC | #5
On 2020-12-01 at 15:41:15, Todd Zullinger wrote:
> I think this would raise the minimum supported version of
> docbook-xsl to 1.77.1.  That might be fine, but we'd
> probably want to make sure it doesn't negatively impact
> OS/distributions which build the docs as a likely group who
> care about reproducible builds.  And we'd want to update the
> requirement in INSTALL, of course.

I don't think that's necessarily the case.  I just tested using a random
name with another DocBook project I have and it seems to work fine, so
there shouldn't be a problem with specifying a name undefined in the
stylesheet using xsltproc.

If we want this to be effective, then yes, people will need to upgrade.
But if they're happy with the old behavior on ancient systems, that
shouldn't be a problem.

Regardless, I think this is a valuable change, since there's no good
reason not to use consistent IDs and at least Debian is switching to all
reproducible builds so vendors will appreciate this change.

> AFAICT, the generate.consistent.ids param was added in
> docbook-xsl-1.77.1 which was released in June 2012.  The
> commit which added it is 74735098e (New param to support
> replacing generate-id() with xsl:number for more consistent
> id values., 2011-10-24).

If this was released in 2012, we should be fine.

> In any case, a minimum of 1.77.1 is present in the supported
> releases of CentOS/RHEL and Debian/Ubuntu, at least (most
> have 1.79.x).  Those are certainly not the only systems Git
> cares about; they're simply the systems with which I am at
> least mildly familiar.

Yeah, CentOS 6 went EOL on November 30, which is very convenient.  I'm
pretty sure nobody else is using such an old version, and even so, I
don't see a backwards compatibility problem with this change.
brian m. carlson Dec. 2, 2020, 12:41 a.m. UTC | #6
On 2020-12-01 at 09:50:37, Arnout Engelen wrote:
> This makes sure the generated id's inside the html version of the
> documentation use the same id's when the same version of the
> manual is generated twice.
> 
> Signed-off-by: Arnout Engelen <arnout@bzzt.net>
> ---
>  Documentation/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 80d1908a44..4d1fd5e31f 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -380,7 +380,7 @@ SubmittingPatches.txt: SubmittingPatches
>  	$(QUIET_GEN) cp $< $@
>  
>  XSLT = docbook.xsl
> -XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css
> +XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css --stringparam generate.consistent.ids 1

I think we'll want --param here, not --stringparam.  This is documented
to be an integer, not a string.  The difference is that --stringparam
sets the value to "1" (that is, a string with a single character)
instead of the integer 1.

We may also want to wrap this to the next line like so:

  XSLTOPTS =  --xinclude --stringparam html.stylesheet docbook-xsl.css
  XSLTOPTS += --param generate.consistent.ids 1

Otherwise, I think this is a valuable change, as I've mentioned
downthread.  I already use this option on my own projects for the same
reason.
Jeff King Dec. 2, 2020, 1:54 a.m. UTC | #7
On Tue, Dec 01, 2020 at 08:18:45PM +0100, Martin Ågren wrote:

> Seems to me like we can go with a change like this. If you're on a
> pre-1.77.1 system and want user-manual.html, but don't want to update
> your tool chain, you should still be able to clone the "htmldocs" repo.
> I'm not sure exactly where the cut-off point is, but I think moving up
> to mid-2012 should be ok.

Yeah, I think this is the key thing. We can afford to be a bit more
aggressive with the doc toolchain requirements in general because
there's an easy fallback.

(Though I think even without that, it sounds like this version is pretty
safe, and the proposed change would not even break the old versions, but
keep them with the same behavior).

The overall proposal sounds like a good idea to me, as long as the
review comments I saw elsewhere.

-Peff
Todd Zullinger Dec. 2, 2020, 4:07 p.m. UTC | #8
brian m. carlson wrote:
> On 2020-12-01 at 15:41:15, Todd Zullinger wrote:
>> I think this would raise the minimum supported version of
>> docbook-xsl to 1.77.1.  That might be fine, but we'd
>> probably want to make sure it doesn't negatively impact
>> OS/distributions which build the docs as a likely group who
>> care about reproducible builds.  And we'd want to update the
>> requirement in INSTALL, of course.
> 
> I don't think that's necessarily the case.  I just tested using a random
> name with another DocBook project I have and it seems to work fine, so
> there shouldn't be a problem with specifying a name undefined in the
> stylesheet using xsltproc.

Oh, that's very good to know.  Thanks for testing the fine
details.  I checked that works on a CentOS 6 system where
the docbook-xsl version is 1.75.2, to test whether an older
docbook-xsl is similarly forgiving of unknown --param's.

> If we want this to be effective, then yes, people will need to upgrade.
> But if they're happy with the old behavior on ancient systems, that
> shouldn't be a problem.

Indeed.  Is it worth mentioning this at all in INSTALL?
Something like:

  -   The minimum supported version of docbook-xsl is 1.74.
  +   The minimum supported version of docbook-xsl is 1.74.  For consistent
  +   IDs in the HTML version of the user-manual, 1.79.1 or newer is
  +   necessary.

perhaps?

The explicit mention of the user-manual may be overkill,
particularly if we later apply a similar change to other
HTML docs (if any other HTML docs even need it)?
Junio C Hamano Dec. 2, 2020, 10:35 p.m. UTC | #9
Todd Zullinger <tmz@pobox.com> writes:

> brian m. carlson wrote:
> ...
>> I don't think that's necessarily the case.  I just tested using a random
>> name with another DocBook project I have and it seems to work fine, so
>> there shouldn't be a problem with specifying a name undefined in the
>> stylesheet using xsltproc.
>
> Oh, that's very good to know.  Thanks for testing the fine
> details.  I checked that works on a CentOS 6 system where
> the docbook-xsl version is 1.75.2, to test whether an older
> docbook-xsl is similarly forgiving of unknown --param's.

That is very good, indeed.
brian m. carlson Dec. 2, 2020, 11:45 p.m. UTC | #10
On 2020-12-02 at 16:07:55, Todd Zullinger wrote:
> brian m. carlson wrote:
> > If we want this to be effective, then yes, people will need to upgrade.
> > But if they're happy with the old behavior on ancient systems, that
> > shouldn't be a problem.
> 
> Indeed.  Is it worth mentioning this at all in INSTALL?
> Something like:
> 
>   -   The minimum supported version of docbook-xsl is 1.74.
>   +   The minimum supported version of docbook-xsl is 1.74.  For consistent
>   +   IDs in the HTML version of the user-manual, 1.79.1 or newer is
>   +   necessary.
> 
> perhaps?

I don't know that that's even necessary.  Anyone doing reproducible
builds is already aware of the required versions in order to do a
reproducible build, and I don't think the average user is going to be
super interested.

We can if you feel strongly about it, but I don't personally see it as a
big deal.
Junio C Hamano Dec. 3, 2020, 1:33 a.m. UTC | #11
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2020-12-02 at 16:07:55, Todd Zullinger wrote:
>> brian m. carlson wrote:
>> > If we want this to be effective, then yes, people will need to upgrade.
>> > But if they're happy with the old behavior on ancient systems, that
>> > shouldn't be a problem.
>> 
>> Indeed.  Is it worth mentioning this at all in INSTALL?
>> Something like:
>> 
>>   -   The minimum supported version of docbook-xsl is 1.74.
>>   +   The minimum supported version of docbook-xsl is 1.74.  For consistent
>>   +   IDs in the HTML version of the user-manual, 1.79.1 or newer is
>>   +   necessary.
>> 
>> perhaps?
>
> I don't know that that's even necessary.  Anyone doing reproducible
> builds is already aware of the required versions in order to do a
> reproducible build, and I don't think the average user is going to be
> super interested.
>
> We can if you feel strongly about it, but I don't personally see it as a
> big deal.

I tend to agree.

The tool being lenient and ignoring a parameter from the future
makes things very simple, and the way the patch is structured,
"stable ID" is not even a build option the builder can enable.
For some version of the toolchain, the option means stable ID and
some older version, it does not mean anything, and that is fine.

We ship our Makefile with "CFLAGS = -g -O2 -Wall" and we do not say
things like "Such and such optimizations are only available if you
use GCC newer than version X.Y"; it is fine to treat the "--param"
the same way here.
Todd Zullinger Dec. 3, 2020, 2 a.m. UTC | #12
brian m. carlson wrote:
> On 2020-12-02 at 16:07:55, Todd Zullinger wrote:
>> Indeed.  Is it worth mentioning this at all in INSTALL?
>> Something like:
>> 
>>   -   The minimum supported version of docbook-xsl is 1.74.
>>   +   The minimum supported version of docbook-xsl is 1.74.  For consistent
>>   +   IDs in the HTML version of the user-manual, 1.79.1 or newer is
>>   +   necessary.
>> 
>> perhaps?
> 
> I don't know that that's even necessary.  Anyone doing reproducible
> builds is already aware of the required versions in order to do a
> reproducible build, and I don't think the average user is going to be
> super interested.
> 
> We can if you feel strongly about it, but I don't personally see it as a
> big deal.

Nah, I definitely don't feel strongly about it.  I debated
the same point you made -- that those most likely to care
about the feature would likely know the versions (perhaps
painfully) well.

This is a nice change that, conveniently, has all pros and
no cons, it seems.

I suppose we'd ideally look for a v2 with --stringparam
replaced with --param as you noted upthread?

And maybe an updated commit message to note that it requires
docbook-xsl 1.79.1 to be effective, but older versions
gracefully ignore the option?  That may be helpful to a
future reader who wonders why the requirement wasn't raised.
diff mbox series

Patch

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 80d1908a44..4d1fd5e31f 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -380,7 +380,7 @@  SubmittingPatches.txt: SubmittingPatches
 	$(QUIET_GEN) cp $< $@
 
 XSLT = docbook.xsl
-XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css
+XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css --stringparam generate.consistent.ids 1
 
 user-manual.html: user-manual.xml $(XSLT)
 	$(QUIET_XSLTPROC)$(RM) $@+ $@ && \