[v3] Documentation: fix build with Asciidoctor 2

Message ID 20190914194919.748935-1-sandals@crustytoothpaste.net (mailing list archive)
State New, archived

Commit Message

brian m. carlson Sept. 14, 2019, 7:49 p.m. UTC
Our documentation toolchain has traditionally been built around DocBook
4.5.  This version of DocBook is the last DTD-based version of DocBook.
In 2009, DocBook 5 was introduced using namespaces and its syntax is
expressed in RELAX NG, which is more expressive and allows a wider
variety of syntax forms.

Asciidoctor, one of the alternatives for building our documentation,
moved support for DocBook 4.5 out of core in its recent 2.0 release and
now only supports DocBook 5 in the main release.  The DocBook 4.5
converter is still available as a separate component, but this is not
available in most distro packages.  This would not be a problem but for
the fact that we use xmlto, which is still stuck in the DocBook 4.5 era.

xmlto performs DTD validation as part of the build process.  This is not
problematic for DocBook 4.5, which has a valid DTD, but it clearly
cannot work for DocBook 5, since no DTD can adequately express its full
syntax.  In addition, even if xmlto did support RELAX NG validation,
that wouldn't be sufficient because it uses the libxml2-based xmllint to
do so, which has known problems with validating interleaves in RELAX NG.

Fortunately, there's an easy way forward: ask Asciidoctor to use its
DocBook 5 backend and tell xmlto to skip validation.  Asciidoctor has
supported DocBook 5 since v0.1.4 in 2013 and xmlto has supported
skipping validation for probably longer than that.
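
As a rough sketch of what this amounts to for a single page (this is not
the actual Makefile rule; the page name "git-example" and the function
name manpage_commands are made up for illustration), the two steps look
something like the following.  The function only prints the commands, so
it can be tried without either tool installed:

```shell
# Dry-run sketch: print, but do not execute, the two steps for one page.
# manpage_commands and the page name passed to it are illustrative only.
manpage_commands () {
	page=$1
	# Step 1: AsciiDoc source -> DocBook 5 XML via the docbook5 backend.
	echo "asciidoctor -b docbook5 -d manpage -o $page.xml $page.txt"
	# Step 2: DocBook XML -> roff, with DTD validation switched off.
	echo "xmlto --skip-validation man $page.xml"
}
```

Running "manpage_commands git-example" prints one line per step; piping
that output to "sh -x" would execute them where both tools are present.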

We also need to teach xmlto how to use the namespaced DocBook XSLT
stylesheets instead of the non-namespaced ones it usually uses.
Normally these stylesheets are interchangeable, but the non-namespaced
ones have a bug that causes them not to strip whitespace automatically
from certain elements when namespaces are in use.  This results in
additional whitespace at the beginning of list elements, which is
jarring and unsightly.

We can do this by passing a custom stylesheet with the -x option that
simply imports the namespaced stylesheets via a URL.  Any system with
support for XML catalogs will automatically look this URL up and
reference a local copy instead without us having to know where this
local copy is located.  We know that anyone using xmlto will already
have catalogs set up properly since the DocBook 4.5 DTD used during
validation is also looked up via catalogs.  All major Linux
distributions distribute the necessary stylesheets and have built-in
catalog support, and Homebrew does as well, albeit with a requirement to
set an environment variable to enable catalog support.
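
To check whether a given machine will resolve the stylesheet locally,
something like the sketch below may help.  It assumes libxml2's
xmlcatalog tool and the conventional /etc/xml/catalog location, neither
of which the build itself relies on, and resolve_stylesheet is a made-up
helper name:

```shell
NS_MAN_XSL=http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl

# Report whether the given catalog maps the stylesheet URL to a local copy.
# The default catalog path is an assumption; pass another one as $1.
resolve_stylesheet () {
	catalog=${1:-/etc/xml/catalog}
	if ! command -v xmlcatalog >/dev/null 2>&1
	then
		echo "xmlcatalog not found; cannot check $catalog"
		return 0
	fi
	# A failed lookup leaves $resolved empty or holding an error message.
	resolved=$(xmlcatalog "$catalog" "$NS_MAN_XSL" 2>/dev/null) || resolved=
	case "$resolved" in
	file:*|/*)
		echo "resolved locally: $resolved" ;;
	*)
		echo "not resolved by $catalog; the URL may be fetched remotely" ;;
	esac
}
```

libxml2 also reads catalog locations from the XML_CATALOG_FILES
environment variable, which is presumably the variable that needs
setting under Homebrew; the right value depends on the installation.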

On the off chance that someone lacks support for catalogs, it is
possible for xmlto (via xmllint) to download the stylesheets from the
URLs in question, although this will likely perform poorly enough to
attract attention.  People still have the option of using the prebuilt
documentation that we ship, so happily this should not be an impediment.

Finally, we need to filter out some messages from other stylesheets that
occur when invoking dblatex in the CI job.  This tool strips namespaces
much like the unnamespaced DocBook stylesheets and prints similar
messages.  If we permit these messages to be printed to standard error,
our documentation CI job will fail because we check standard error for
unexpected output.  Due to dblatex's reliance on Python 2, we may need
to revisit its use in the future, in which case this problem may go
away, but this can be delayed until a future patch.
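
The effect of the two new filter expressions can be tried in isolation
with a sketch like the one below.  It keeps only the two patterns added
by this patch, and stderr_is_clean is a hypothetical helper that mirrors
the idea of the CI check rather than its exact code:

```shell
# Drop the two kinds of expected noise from a captured stderr log.
filter_log () {
	sed -e '/stripped namespace before processing/d' \
	    -e '/Attributed.*IDs for element/d' \
	    "$1"
}

# Succeed only if nothing is left once the expected noise is removed.
stderr_is_clean () {
	test -z "$(filter_log "$1")"
}
```

"stderr_is_clean stderr.log" succeeds when every line of the log matches
one of the two patterns and fails as soon as any other line is present.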

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
Range-diff against v2:
1:  6f11a79d03 ! 1:  c4935af5de Documentation: fix build with Asciidoctor 2
    @@ Commit message
         documentation that we ship, so happily this should not be an impediment.
     
         Finally, we need to filter out some messages from other stylesheets that
    -    when invoking dblatex in the CI job.  This tool strips namespaces much
    -    like the unnamespaced DocBook stylesheets and prints similar messages.
    -    If we permit these messages to be printed to standard error, our
    -    documentation CI job will because we check standard error for unexpected
    -    output.  Due to dblatex's reliance on Python 2, we may need to revisit
    -    its use in the future, in which case this problem may go away, but this
    -    can be delayed until a future patch.
    +    occur when invoking dblatex in the CI job.  This tool strips namespaces
    +    much like the unnamespaced DocBook stylesheets and prints similar
    +    messages.  If we permit these messages to be printed to standard error,
    +    our documentation CI job will fail because we check standard error for
    +    unexpected output.  Due to dblatex's reliance on Python 2, we may need
    +    to revisit its use in the future, in which case this problem may go
    +    away, but this can be delayed until a future patch.
     
         Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
     
    @@ Documentation/manpage.xsl (new)
     +	<xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl" />
     +</xsl:stylesheet>
     
    + ## azure-pipelines.yml ##
    +@@ azure-pipelines.yml: jobs:
    +        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
    + 
    +        sudo apt-get update &&
    +-       sudo apt-get install -y asciidoc xmlto asciidoctor &&
    ++       sudo apt-get install -y asciidoc xmlto asciidoctor docbook-xsl-ns &&
    + 
    +        export ALREADY_HAVE_ASCIIDOCTOR=yes. &&
    +        export jobname=Documentation &&
    +
    + ## ci/install-dependencies.sh ##
    +@@ ci/install-dependencies.sh: StaticAnalysis)
    + 	;;
    + Documentation)
    + 	sudo apt-get -q update
    +-	sudo apt-get -q -y install asciidoc xmlto
    ++	sudo apt-get -q -y install asciidoc xmlto docbook-xsl-ns
    + 
    + 	test -n "$ALREADY_HAVE_ASCIIDOCTOR" ||
    + 	gem install --version 1.5.8 asciidoctor
    +
      ## ci/test-documentation.sh ##
     @@
      filter_log () {

 Documentation/Makefile     | 4 +++-
 Documentation/manpage.xsl  | 3 +++
 azure-pipelines.yml        | 2 +-
 ci/install-dependencies.sh | 2 +-
 ci/test-documentation.sh   | 2 ++
 5 files changed, 10 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/manpage.xsl

Comments

SZEDER Gábor Sept. 15, 2019, 9:59 a.m. UTC | #1
On Sat, Sep 14, 2019 at 07:49:19PM +0000, brian m. carlson wrote:
> Our documentation toolchain has traditionally been built around DocBook
> 4.5.  This version of DocBook is the last DTD-based version of DocBook.
> In 2009, DocBook 5 was introduced using namespaces and its syntax is
> expressed in RELAX NG, which is more expressive and allows a wider
> variety of syntax forms.
> 
> Asciidoctor, one of the alternatives for building our documentation,
> moved support for DocBook 4.5 out of core in its recent 2.0 release and
> now only supports DocBook 5 in the main release.  The DocBook 4.5
> converter is still available as a separate component, but this is not
> available in most distro packages.  This would not be a problem but for
> the fact that we use xmlto, which is still stuck in the DocBook 4.5 era.
> 
> xmlto performs DTD validation as part of the build process.  This is not
> problematic for DocBook 4.5, which has a valid DTD, but it clearly
> cannot work for DocBook 5, since no DTD can adequately express its full
> syntax.  In addition, even if xmlto did support RELAX NG validation,
> that wouldn't be sufficient because it uses the libxml2-based xmllint to
> do so, which has known problems with validating interleaves in RELAX NG.
> 
> Fortunately, there's an easy way forward: ask Asciidoctor to use its
> DocBook 5 backend and tell xmlto to skip validation.  Asciidoctor has
> supported DocBook 5 since v0.1.4 in 2013 and xmlto has supported
> skipping validation for probably longer than that.
> 
> We also need to teach xmlto how to use the namespaced DocBook XSLT
> stylesheets instead of the non-namespaced ones it usually uses.
> Normally these stylesheets are interchangeable, but the non-namespaced
> ones have a bug that causes them not to strip whitespace automatically
> from certain elements when namespaces are in use.  This results in
> additional whitespace at the beginning of list elements, which is
> jarring and unsightly.
> 
> We can do this by passing a custom stylesheet with the -x option that
> simply imports the namespaced stylesheets via a URL.  Any system with
> support for XML catalogs will automatically look this URL up and
> reference a local copy instead without us having to know where this
> local copy is located.  We know that anyone using xmlto will already
> have catalogs set up properly since the DocBook 4.5 DTD used during
> validation is also looked up via catalogs.  All major Linux
> distributions distribute the necessary stylesheets and have built-in
> catalog support, and Homebrew does as well, albeit with a requirement to
> set an environment variable to enable catalog support.
> 
> On the off chance that someone lacks support for catalogs, it is
> possible for xmlto (via xmllint) to download the stylesheets from the
> URLs in question, although this will likely perform poorly enough to
> attract attention.  People still have the option of using the prebuilt
> documentation that we ship, so happily this should not be an impediment.
> 
> Finally, we need to filter out some messages from other stylesheets that
> occur when invoking dblatex in the CI job.  This tool strips namespaces
> much like the unnamespaced DocBook stylesheets and prints similar
> messages.  If we permit these messages to be printed to standard error,
> our documentation CI job will fail because we check standard error for
> unexpected output.  Due to dblatex's reliance on Python 2, we may need
> to revisit its use in the future, in which case this problem may go
> away, but this can be delayed until a future patch.
> 
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>

>  Documentation/Makefile     | 4 +++-
>  Documentation/manpage.xsl  | 3 +++
>  azure-pipelines.yml        | 2 +-
>  ci/install-dependencies.sh | 2 +-
>  ci/test-documentation.sh   | 2 ++
>  5 files changed, 10 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/manpage.xsl
> 
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 76f2ecfc1b..d94f47c5c9 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -197,11 +197,13 @@ ifdef USE_ASCIIDOCTOR
>  ASCIIDOC = asciidoctor
>  ASCIIDOC_CONF =
>  ASCIIDOC_HTML = xhtml5
> -ASCIIDOC_DOCBOOK = docbook45
> +ASCIIDOC_DOCBOOK = docbook5
>  ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
>  ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
>  ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
>  DBLATEX_COMMON =
> +XMLTO_EXTRA += --skip-validation
> +XMLTO_EXTRA += -x manpage.xsl
>  endif
>  
>  SHELL_PATH ?= $(SHELL)
> diff --git a/Documentation/manpage.xsl b/Documentation/manpage.xsl
> new file mode 100644
> index 0000000000..ef64bab17a
> --- /dev/null
> +++ b/Documentation/manpage.xsl
> @@ -0,0 +1,3 @@
> +<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
> +	<xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl" />
> +</xsl:stylesheet>
> diff --git a/azure-pipelines.yml b/azure-pipelines.yml
> index c329b7218b..34031b182a 100644
> --- a/azure-pipelines.yml
> +++ b/azure-pipelines.yml
> @@ -374,7 +374,7 @@ jobs:
>         test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
>  
>         sudo apt-get update &&
> -       sudo apt-get install -y asciidoc xmlto asciidoctor &&
> +       sudo apt-get install -y asciidoc xmlto asciidoctor docbook-xsl-ns &&
>  
>         export ALREADY_HAVE_ASCIIDOCTOR=yes. &&
>         export jobname=Documentation &&
> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
> index 8cc72503cb..a76f348484 100755
> --- a/ci/install-dependencies.sh
> +++ b/ci/install-dependencies.sh
> @@ -53,7 +53,7 @@ StaticAnalysis)
>  	;;
>  Documentation)
>  	sudo apt-get -q update
> -	sudo apt-get -q -y install asciidoc xmlto
> +	sudo apt-get -q -y install asciidoc xmlto docbook-xsl-ns

Ok, with this package installed the build passed on Travis CI.

>  	test -n "$ALREADY_HAVE_ASCIIDOCTOR" ||
>  	gem install --version 1.5.8 asciidoctor

So, since the documentation can now be built with Asciidoctor v2, is
it already time to remove this '--version 1.5.8'?

> diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> index d49089832d..b3e76ef863 100755
> --- a/ci/test-documentation.sh
> +++ b/ci/test-documentation.sh
> @@ -8,6 +8,8 @@
>  filter_log () {
>  	sed -e '/^GIT_VERSION = /d' \
>  	    -e '/^    \* new asciidoc flags$/d' \
> +	    -e '/stripped namespace before processing/d' \
> +	    -e '/Attributed.*IDs for element/d' \

I haven't seen this latter message in the CI builds, neither with
Asciidoctor v1.5.8 nor with v2.  Do we really need this filter, then?
Where does this message come from?

>  	    "$1"
>  }
>

brian m. carlson Sept. 15, 2019, 9:26 p.m. UTC | #2
On 2019-09-15 at 09:59:52, SZEDER Gábor wrote:
> On Sat, Sep 14, 2019 at 07:49:19PM +0000, brian m. carlson wrote:
> >  	test -n "$ALREADY_HAVE_ASCIIDOCTOR" ||
> >  	gem install --version 1.5.8 asciidoctor
> 
> So, since the documentation can now be built with Asciidoctor v2, is
> it already time to remove this '--version 1.5.8'?

I think Martin was going to send in some more patches before we did
that.

> > diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> > index d49089832d..b3e76ef863 100755
> > --- a/ci/test-documentation.sh
> > +++ b/ci/test-documentation.sh
> > @@ -8,6 +8,8 @@
> >  filter_log () {
> >  	sed -e '/^GIT_VERSION = /d' \
> >  	    -e '/^    \* new asciidoc flags$/d' \
> > +	    -e '/stripped namespace before processing/d' \
> > +	    -e '/Attributed.*IDs for element/d' \
> 
> I haven't seen this latter message in the CI builds, neither with
> Asciidoctor v1.5.8 nor with v2.  Do we really need this filter, then?
> Where does this message come from?

I see it and it definitely fails on my system without it.  It comes from
libxslt, which has been patched in Debian to produce deterministic IDs.
I suspect we may not have seen it on Ubuntu systems because they are
running 16.04, which is likely older than the patch.  If Travis updates
to 18.04, we may be more likely to have a problem.

SZEDER Gábor Sept. 15, 2019, 10:05 p.m. UTC | #3
On Sun, Sep 15, 2019 at 09:26:21PM +0000, brian m. carlson wrote:
> > > diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> > > index d49089832d..b3e76ef863 100755
> > > --- a/ci/test-documentation.sh
> > > +++ b/ci/test-documentation.sh
> > > @@ -8,6 +8,8 @@
> > >  filter_log () {
> > >  	sed -e '/^GIT_VERSION = /d' \
> > >  	    -e '/^    \* new asciidoc flags$/d' \
> > > +	    -e '/stripped namespace before processing/d' \
> > > +	    -e '/Attributed.*IDs for element/d' \
> > 
> > I haven't seen this latter message in the CI builds, neither with
> > Asciidoctor v1.5.8 nor with v2.  Do we really need this filter, then?
> > Where does this message come from?
> 
> I see it and it definitely fails on my system without it.  It comes from
> libxslt, which has been patched in Debian to produce deterministic IDs.
> I suspect we may not have seen it on Ubuntu systems because they are
> running 16.04, which is likely older than the patch.  If Travis updates
> to 18.04, we may be more likely to have a problem.

Thanks.  Indeed, I kicked off a Travis CI build using their Ubuntu
18.04 image, and that "Attributed..." message was there.

I think this future-proofing is a good idea, but I also think that
this should be clarified in the commit message.

brian m. carlson Sept. 15, 2019, 10:14 p.m. UTC | #4
On 2019-09-15 at 22:05:55, SZEDER Gábor wrote:
> On Sun, Sep 15, 2019 at 09:26:21PM +0000, brian m. carlson wrote:
> > > > diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> > > > index d49089832d..b3e76ef863 100755
> > > > --- a/ci/test-documentation.sh
> > > > +++ b/ci/test-documentation.sh
> > > > @@ -8,6 +8,8 @@
> > > >  filter_log () {
> > > >  	sed -e '/^GIT_VERSION = /d' \
> > > >  	    -e '/^    \* new asciidoc flags$/d' \
> > > > +	    -e '/stripped namespace before processing/d' \
> > > > +	    -e '/Attributed.*IDs for element/d' \
> > > 
> > > I haven't seen this latter message in the CI builds, neither with
> > > Asciidoctor v1.5.8 nor with v2.  Do we really need this filter, then?
> > > Where does this message come from?
> > 
> > I see it and it definitely fails on my system without it.  It comes from
> > libxslt, which has been patched in Debian to produce deterministic IDs.
> > I suspect we may not have seen it on Ubuntu systems because they are
> > running 16.04, which is likely older than the patch.  If Travis updates
> > to 18.04, we may be more likely to have a problem.
> 
> Thanks.  Indeed, I kicked off a Travis CI build using their Ubuntu
> 18.04 image, and that "Attributed..." message was there.
> 
> I think this future-proofing is a good idea, but I also think that
> this should be clarified in the commit message.

I can do that.  I just noticed it failed on my laptop and added it,
assuming it was the stylesheets.  I had to search Google for the output
to find out that it was libxslt.

Martin Ågren Sept. 16, 2019, 10:51 a.m. UTC | #5
On Sun, 15 Sep 2019 at 23:26, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2019-09-15 at 09:59:52, SZEDER Gábor wrote:
> > On Sat, Sep 14, 2019 at 07:49:19PM +0000, brian m. carlson wrote:
> > >     test -n "$ALREADY_HAVE_ASCIIDOCTOR" ||
> > >     gem install --version 1.5.8 asciidoctor
> >
> > So, since the documentation can now be built with Asciidoctor v2, is
> > it already time to remove this '--version 1.5.8'?
>
> I think Martin was going to send in some more patches before we did
> that.

I've got two series floating around [1] [2]. I'll be rerolling [2]
hopefully tonight and then I feel pretty good about the manpages. I've
got a third series that I'll get to then, which fixes up the rendering
of user-manual.pdf/html with Asciidoctor. Assuming those three series
graduate, I'm not aware of any reason to hold off on the switch to
"Asciidoctor by default".

As for "v2.x vs 1.5.8", that seems like a separate issue to me, though
-- and one that your patch does a very good job at! My series [2] will
make the rendering with v2.x /prettier/, but since your series makes the
docs build /at all/, maybe it's worth switching the CI-job sooner rather
than later to make sure we keep it so.

[1] https://public-inbox.org/git/cover.1567707999.git.martin.agren@gmail.com/

[2] https://public-inbox.org/git/cover.1567534373.git.martin.agren@gmail.com/

Martin

Patch

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 76f2ecfc1b..d94f47c5c9 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -197,11 +197,13 @@  ifdef USE_ASCIIDOCTOR
 ASCIIDOC = asciidoctor
 ASCIIDOC_CONF =
 ASCIIDOC_HTML = xhtml5
-ASCIIDOC_DOCBOOK = docbook45
+ASCIIDOC_DOCBOOK = docbook5
 ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
 ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
 ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
 DBLATEX_COMMON =
+XMLTO_EXTRA += --skip-validation
+XMLTO_EXTRA += -x manpage.xsl
 endif
 
 SHELL_PATH ?= $(SHELL)
diff --git a/Documentation/manpage.xsl b/Documentation/manpage.xsl
new file mode 100644
index 0000000000..ef64bab17a
--- /dev/null
+++ b/Documentation/manpage.xsl
@@ -0,0 +1,3 @@ 
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
+	<xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl" />
+</xsl:stylesheet>
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index c329b7218b..34031b182a 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -374,7 +374,7 @@  jobs:
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
 
        sudo apt-get update &&
-       sudo apt-get install -y asciidoc xmlto asciidoctor &&
+       sudo apt-get install -y asciidoc xmlto asciidoctor docbook-xsl-ns &&
 
        export ALREADY_HAVE_ASCIIDOCTOR=yes. &&
        export jobname=Documentation &&
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 8cc72503cb..a76f348484 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -53,7 +53,7 @@  StaticAnalysis)
 	;;
 Documentation)
 	sudo apt-get -q update
-	sudo apt-get -q -y install asciidoc xmlto
+	sudo apt-get -q -y install asciidoc xmlto docbook-xsl-ns
 
 	test -n "$ALREADY_HAVE_ASCIIDOCTOR" ||
 	gem install --version 1.5.8 asciidoctor
diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
index d49089832d..b3e76ef863 100755
--- a/ci/test-documentation.sh
+++ b/ci/test-documentation.sh
@@ -8,6 +8,8 @@ 
 filter_log () {
 	sed -e '/^GIT_VERSION = /d' \
 	    -e '/^    \* new asciidoc flags$/d' \
+	    -e '/stripped namespace before processing/d' \
+	    -e '/Attributed.*IDs for element/d' \
 	    "$1"
 }