[bpf-next,00/17] Improve BPF syscall command documentation

Message ID 20210217010821.1810741-1-joe@wand.net.nz (mailing list archive)

Message

Joe Stringer Feb. 17, 2021, 1:08 a.m. UTC
From: Joe Stringer <joe@cilium.io>

The state of bpf(2) manual pages today is not exactly ideal. For the
most part, it was written several years ago and has not kept up with the
pace of development in the kernel tree. For instance, out of a total of
~35 commands to the BPF syscall available today, when I pull the
kernel-man-pages tree today I find just 6 documented commands: The very
basics of map interaction and program load.

In contrast, looking at bpf-helpers(7), I am able today to run one
command[0] to fetch API documentation of the very latest eBPF helpers
that have been added to the kernel. This documentation is up to date
because kernel maintainers enforce documenting the APIs as part of
the feature submission process. As far as I can tell, we rely on manual
synchronization from the kernel tree to the kernel-man-pages tree to
distribute these more widely, so all locations may not be completely up
to date. That said, the documentation does in fact exist in the first
place which is a major initial hurdle to overcome.

Given the relative success of the process around bpf-helpers(7) to
encourage developers to document their user-facing changes, in this
patch series I explore applying this technique to bpf(2) as well.
Unfortunately, even with bpf(2) being so out-of-date, there is still a
lot of content to convert over. In particular, I've identified at least
the following aspects of the bpf syscall which could individually be
generated from separate documentation in the header:
* BPF syscall commands
* BPF map types
* BPF program types
* BPF attachment points

Rather than tackle everything at once, I have focused in this series on
the syscall commands, "enum bpf_cmd". This series is structured to first
import what useful descriptions there are from the kernel-man-pages
tree, then piece-by-piece document a few of the syscalls in more detail
in cases where I could find useful documentation from the git tree or
from a casual read of the code. Not all documentation is comprehensive
at this point, but a basis is provided with examples that can be further
enhanced with subsequent follow-up patches. Note, the series in its
current state only includes documentation around the syscall commands
themselves, so in the short term it doesn't allow us to automate bpf(2)
generation; only one section of the man page could be replaced. Though
if there is appetite for this approach, this should be trivial to
improve on, even if just by importing the remaining static text from the
kernel-man-pages tree.

Following that, the series enhances the Python scripting around parsing
the descriptions from the header files and generating dedicated
reStructuredText and troff output. Finally, to expose the new text and
reduce the likelihood of having it get out of date or break the docs
parser, it is added to the selftests and exposed through the kernel
documentation web pages.
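
As a rough illustration of the header-to-reStructuredText step described
above, here is a minimal sketch. It is not the series' bpf_doc.py: the
comment layout in HEADER, the function names parse_commands() and
to_rst(), and the sample descriptions are all assumptions made for this
example.

```python
import re

# Hypothetical excerpt of a documented header; this layout is assumed for
# illustration and is not copied from include/uapi/linux/bpf.h.
HEADER = """\
/* Start of BPF syscall commands:
 *
 * BPF_MAP_CREATE
 *     Description
 *         Create a map and return a file descriptor that refers to
 *         the map.
 *
 *     Return
 *         A new file descriptor, or -1 if an error occurred.
 *
 * BPF_MAP_LOOKUP_ELEM
 *     Description
 *         Look up an element with a given key in a map.
 *
 *     Return
 *         Zero on success, or -1 if an error occurred.
 */
"""

CMD_RE = re.compile(r"^BPF_[A-Z0-9_]+$")
DECORATION_RE = re.compile(r"^\s*(?:/\*+|\*/|\*)?\s*")
SECTIONS = ("Description", "Return")


def parse_commands(header_text):
    """Return {command: {section: text}} parsed from a doc comment."""
    commands = {}
    cmd = section = None
    for raw in header_text.splitlines():
        # Strip the comment decoration ("/*", " *", " */") and indentation.
        line = DECORATION_RE.sub("", raw, count=1).strip()
        if not line:
            continue
        if CMD_RE.match(line):
            # A bare BPF_* identifier starts a new command entry.
            cmd, section = line, None
            commands[cmd] = {}
        elif cmd and line in SECTIONS:
            section = line
            commands[cmd][section] = ""
        elif cmd and section:
            # Continuation lines are joined into one paragraph.
            prev = commands[cmd][section]
            commands[cmd][section] = (prev + " " + line).strip()
    return commands


def to_rst(commands):
    """Render the parsed commands as simple reStructuredText."""
    out = []
    for name, sections in commands.items():
        out += [name, "=" * len(name), ""]
        for title, text in sections.items():
            out += [f"**{title}**", f"    {text}", ""]
    return "\n".join(out)


if __name__ == "__main__":
    print(to_rst(parse_commands(HEADER)))
```

Keeping the parser separate from the printer means a troff printer
could reuse the same parsed dictionary.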

At this point I'd like to put this out for comments. In my mind, the
ideal eventuation of this work would be to extend kernel UAPI headers
such that each of the categories I had listed above (commands, maps,
progs, hooks) have dedicated documentation in the kernel tree, and that
developers must update the comments in the headers to document the APIs
prior to patch acceptance, and that we could auto-generate the latest
version of the bpf(2) manual pages based on a few static description
sections combined with the dynamically-generated output from the header.

Thanks also to Quentin Monnet for initial review.

[0]: make -C tools/bpf -f Makefile.docs bpf-helpers.7

Joe Stringer (17):
  bpf: Import syscall arg documentation
  bpf: Add minimal bpf() command documentation
  bpf: Document BPF_F_LOCK in syscall commands
  bpf: Document BPF_PROG_PIN syscall command
  bpf: Document BPF_PROG_ATTACH syscall command
  bpf: Document BPF_PROG_TEST_RUN syscall command
  bpf: Document BPF_PROG_QUERY syscall command
  bpf: Document BPF_MAP_*_BATCH syscall commands
  scripts/bpf: Rename bpf_helpers_doc.py -> bpf_doc.py
  scripts/bpf: Abstract eBPF API target parameter
  scripts/bpf: Add syscall commands printer
  tools/bpf: Rename Makefile.{helpers,docs}
  tools/bpf: Templatize man page generation
  tools/bpf: Build bpf-sycall.2 in Makefile.docs
  selftests/bpf: Add docs target
  docs/bpf: Add bpf() syscall command reference
  tools: Sync uapi bpf.h header with latest changes

 Documentation/Makefile                        |   2 +
 Documentation/bpf/Makefile                    |  28 +
 Documentation/bpf/bpf_commands.rst            |   5 +
 Documentation/bpf/index.rst                   |  14 +-
 include/uapi/linux/bpf.h                      | 709 +++++++++++++++++-
 scripts/{bpf_helpers_doc.py => bpf_doc.py}    | 189 ++++-
 tools/bpf/Makefile.docs                       |  88 +++
 tools/bpf/Makefile.helpers                    |  60 --
 tools/bpf/bpftool/Documentation/Makefile      |  12 +-
 tools/include/uapi/linux/bpf.h                | 709 +++++++++++++++++-
 tools/lib/bpf/Makefile                        |   2 +-
 tools/perf/MANIFEST                           |   2 +-
 tools/testing/selftests/bpf/Makefile          |  20 +-
 .../selftests/bpf/test_bpftool_build.sh       |  21 -
 tools/testing/selftests/bpf/test_doc_build.sh |  13 +
 15 files changed, 1736 insertions(+), 138 deletions(-)
 create mode 100644 Documentation/bpf/Makefile
 create mode 100644 Documentation/bpf/bpf_commands.rst
 rename scripts/{bpf_helpers_doc.py => bpf_doc.py} (82%)
 create mode 100644 tools/bpf/Makefile.docs
 delete mode 100644 tools/bpf/Makefile.helpers
 create mode 100755 tools/testing/selftests/bpf/test_doc_build.sh

Comments

Toke Høiland-Jørgensen Feb. 17, 2021, 1:55 p.m. UTC | #1
Joe Stringer <joe@wand.net.nz> writes:

> From: Joe Stringer <joe@cilium.io>
>
> The state of bpf(2) manual pages today is not exactly ideal. For the
> most part, it was written several years ago and has not kept up with the
> pace of development in the kernel tree. For instance, out of a total of
> ~35 commands to the BPF syscall available today, when I pull the
> kernel-man-pages tree today I find just 6 documented commands: The very
> basics of map interaction and program load.

Yes indeed! Thank you for tackling this! :)

> In contrast, looking at bpf-helpers(7), I am able today to run one
> command[0] to fetch API documentation of the very latest eBPF helpers
> that have been added to the kernel. This documentation is up to date
> because kernel maintainers enforce documenting the APIs as part of
> the feature submission process. As far as I can tell, we rely on manual
> synchronization from the kernel tree to the kernel-man-pages tree to
> distribute these more widely, so all locations may not be completely up
> to date. That said, the documentation does in fact exist in the first
> place which is a major initial hurdle to overcome.
>
> Given the relative success of the process around bpf-helpers(7) to
> encourage developers to document their user-facing changes, in this
> patch series I explore applying this technique to bpf(2) as well.
> Unfortunately, even with bpf(2) being so out-of-date, there is still a
> lot of content to convert over. In particular, I've identified at least
> the following aspects of the bpf syscall which could individually be
> generated from separate documentation in the header:
> * BPF syscall commands
> * BPF map types
> * BPF program types
> * BPF attachment points

Does this also include program subtypes (AKA expected_attach_type)?

> Rather than tackle everything at once, I have focused in this series on
> the syscall commands, "enum bpf_cmd". This series is structured to first
> import what useful descriptions there are from the kernel-man-pages
> tree, then piece-by-piece document a few of the syscalls in more detail
> in cases where I could find useful documentation from the git tree or
> from a casual read of the code. Not all documentation is comprehensive
> at this point, but a basis is provided with examples that can be further
> enhanced with subsequent follow-up patches. Note, the series in its
> current state only includes documentation around the syscall commands
> themselves, so in the short term it doesn't allow us to automate bpf(2)
> generation; Only one section of the man page could be replaced. Though
> if there is appetite for this approach, this should be trivial to
> improve on, even if just by importing the remaining static text from the
> kernel-man-pages tree.
>
> Following that, the series enhances the python scripting around parsing
> the descriptions from the header files and generating dedicated
> ReStructured Text and troff output. Finally, to expose the new text and
> reduce the likelihood of having it get out of date or break the docs
> parser, it is added to the selftests and exposed through the kernel
> documentation web pages.
>
> At this point I'd like to put this out for comments. In my mind, the
> ideal eventuation of this work would be to extend kernel UAPI headers
> such that each of the categories I had listed above (commands, maps,
> progs, hooks) have dedicated documentation in the kernel tree, and that
> developers must update the comments in the headers to document the APIs
> prior to patch acceptance, and that we could auto-generate the latest
> version of the bpf(2) manual pages based on a few static description
> sections combined with the dynamically-generated output from the header.

I like the approach, and I don't think it's too onerous to require
updates to the documentation everywhere like we (as you note) already do
for helpers.

So with that, please feel free to add my enthusiastic:

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>

Jonathan Corbet Feb. 17, 2021, 5:32 p.m. UTC | #2
[CC += linux-doc]

Joe Stringer <joe@wand.net.nz> writes:

> From: Joe Stringer <joe@cilium.io>
>
> The state of bpf(2) manual pages today is not exactly ideal. For the
> most part, it was written several years ago and has not kept up with the
> pace of development in the kernel tree. For instance, out of a total of
> ~35 commands to the BPF syscall available today, when I pull the
> kernel-man-pages tree today I find just 6 documented commands: The very
> basics of map interaction and program load.
>
> In contrast, looking at bpf-helpers(7), I am able today to run one
> command[0] to fetch API documentation of the very latest eBPF helpers
> that have been added to the kernel. This documentation is up to date
> because kernel maintainers enforce documenting the APIs as part of
> the feature submission process. As far as I can tell, we rely on manual
> synchronization from the kernel tree to the kernel-man-pages tree to
> distribute these more widely, so all locations may not be completely up
> to date. That said, the documentation does in fact exist in the first
> place which is a major initial hurdle to overcome.
>
> Given the relative success of the process around bpf-helpers(7) to
> encourage developers to document their user-facing changes, in this
> patch series I explore applying this technique to bpf(2) as well.

So I am totally in favor of improving the BPF docs, this is great work.

That said, I am a bit less thrilled about creating a new, parallel
documentation-build system in the kernel.  I don't think that BPF is so
special that it needs to do its own thing here.

In particular, I would love to have the bpf() syscall API information
incorporated into the userspace-api book with all the rest of the
user-space API docs.  That could be done now by formatting your
information as a DOC: block.

If you started that way, you'd get the whole existing build system for
free.  You would also have started down a path that could, some bright
shining day, lead to this kind of documentation for *all* of our system
calls.  That would be a huge improvement in how we do things.

The troff output would still need implementation, but we'd like to have
that anyway.  We used to create man pages for internal kernel APIs; that
was lost in the Sphinx transition and hasn't been a priority since
people haven't been screaming, but it could still be nice to have it
back.

So...could I ask you to have a look at doing this within the kernel's
docs system instead of in addition to it?  Even if it means digging into
scripts/kernel-doc, which isn't all that high on my list of Fun Things
To Do either?  I'm willing to try to help, and maybe we can get some
other assistance too - I'm ever the optimist.

Thanks,

jon

Joe Stringer Feb. 18, 2021, 4:08 a.m. UTC | #3
On Wed, Feb 17, 2021 at 5:55 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Joe Stringer <joe@wand.net.nz> writes:
> > Given the relative success of the process around bpf-helpers(7) to
> > encourage developers to document their user-facing changes, in this
> > patch series I explore applying this technique to bpf(2) as well.
> > Unfortunately, even with bpf(2) being so out-of-date, there is still a
> > lot of content to convert over. In particular, I've identified at least
> > the following aspects of the bpf syscall which could individually be
> > generated from separate documentation in the header:
> > * BPF syscall commands
> > * BPF map types
> > * BPF program types
> > * BPF attachment points
>
> Does this also include program subtypes (AKA expected_attach_type?)

I seem to have left my lawyerly "including, but not limited to..."
language at home today ;-) . Of course, I can add that to the list.

> > At this point I'd like to put this out for comments. In my mind, the
> > ideal eventuation of this work would be to extend kernel UAPI headers
> > such that each of the categories I had listed above (commands, maps,
> > progs, hooks) have dedicated documentation in the kernel tree, and that
> > developers must update the comments in the headers to document the APIs
> > prior to patch acceptance, and that we could auto-generate the latest
> > version of the bpf(2) manual pages based on a few static description
> > sections combined with the dynamically-generated output from the header.
>
> I like the approach, and I don't think it's too onerous to require
> updates to the documentation everywhere like we (as you note) already do
> for helpers.
>
> So with that, please feel free to add my enthusiastic:
>
> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>

Thanks Toke.

Joe Stringer Feb. 18, 2021, 4:22 a.m. UTC | #4
On Wed, Feb 17, 2021 at 9:32 AM Jonathan Corbet <corbet@lwn.net> wrote:
>
> [CC += linux-doc]
>
> Joe Stringer <joe@wand.net.nz> writes:
>
> > From: Joe Stringer <joe@cilium.io>
> >
> > The state of bpf(2) manual pages today is not exactly ideal. For the
> > most part, it was written several years ago and has not kept up with the
> > pace of development in the kernel tree. For instance, out of a total of
> > ~35 commands to the BPF syscall available today, when I pull the
> > kernel-man-pages tree today I find just 6 documented commands: The very
> > basics of map interaction and program load.
> >
> > In contrast, looking at bpf-helpers(7), I am able today to run one
> > command[0] to fetch API documentation of the very latest eBPF helpers
> > that have been added to the kernel. This documentation is up to date
> > because kernel maintainers enforce documenting the APIs as part of
> > the feature submission process. As far as I can tell, we rely on manual
> > synchronization from the kernel tree to the kernel-man-pages tree to
> > distribute these more widely, so all locations may not be completely up
> > to date. That said, the documentation does in fact exist in the first
> > place which is a major initial hurdle to overcome.
> >
> > Given the relative success of the process around bpf-helpers(7) to
> > encourage developers to document their user-facing changes, in this
> > patch series I explore applying this technique to bpf(2) as well.
>
> So I am totally in favor of improving the BPF docs, this is great work.
>
> That said, I am a bit less thrilled about creating a new, parallel
> documentation-build system in the kernel.  I don't think that BPF is so
> special that it needs to do its own thing here.
>
> If you started that way, you'd get the whole existing build system for
> free.  You would also have started down a path that could, some bright
> shining day, lead to this kind of documentation for *all* of our system
> calls.  That would be a huge improvement in how we do things.
>
> The troff output would still need implementation, but we'd like to have
> that anyway.  We used to create man pages for internal kernel APIs; that
> was lost in the sphinx transition and hasn't been a priority since
> people haven't been screaming, but it could still be nice to have it
> back.
>
> So...could I ask you to have a look at doing this within the kernel's
> docs system instead of in addition to it?  Even if it means digging into
> scripts/kernel-doc, which isn't all that high on my list of Fun Things
> To Do either?  I'm willing to try to help, and maybe we can get some
> other assistance too - I'm ever the optimist.

Hey Jon, thanks for the feedback. Absolutely, what you say makes
sense. The intent here wasn't to come up with something new. Based on
your prompt from this email (and a quick look at your KR '19
presentation), I'm hearing a few observations:
* Storing the documentation in the code next to the things that
contributors edit is a reasonable approach to documentation of this
kind.
* This series currently proposes adding some new Makefile
infrastructure. However, good use of the "kernel-doc" Sphinx directive
+ "DOC: " incantations in the header should be able to achieve the
same without adding such dedicated build system logic to the tree.
* The changes in patch 16 here extended Documentation/bpf/index.rst,
but to assist in improving the overall kernel documentation
organisation / hierarchy, you would prefer to instead introduce a
dedicated Documentation/userspace-api/bpf/ directory where the bpf
uAPI portions can be documented.

From the above, there are a couple of clear actionable items I can look
into for a series v2 which should tidy things up.

In addition to this, today the bpf helpers documentation is built
through the bpftool build process as well as the runtime bpf
selftests, mostly as a way to ensure that the API documentation
conforms to a particular style, which then assists with the generation
of reStructuredText and troff output. I can probably simplify the
make infrastructure involved in triggering the bpf docs build for bpf
subsystem developers and maintainers. I think there's likely still
interest from bpf folks to keep that particular dependency in the
selftests like today and even extend it to include this new
documentation, so that we neither introduce text that fails against
the parser nor break the parser in some other way. Whether that
validation is done by scripts/kernel-doc or scripts/bpf_helpers_doc.py
doesn't make a big difference to me, other than I have zero experience
with Perl. My first impressions are that the bpf_helpers_doc.py is
providing stricter formatting requirements than what "DOC: " +
kernel-doc would provide, so my baseline inclination would be to keep
those patches to enhance that script and use that for the validation
side (help developers with stronger linting feedback), then use
kernel-doc for the actual html docs generation side, which would help
to satisfy your concern around duplication of the documentation build
systems.
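
To make the "validation side" concrete, here is a minimal sketch of the
kind of lint check meant above. It does not reproduce what
scripts/bpf_helpers_doc.py actually enforces: the function name
lint_command_docs(), the required "Description" and "Return" sections,
and the sample text are assumptions for illustration only.

```python
import re

CMD_RE = re.compile(r"^BPF_[A-Z0-9_]+$")
REQUIRED = ("Description", "Return")  # assumed mandatory sections


def lint_command_docs(doc_lines, enum_names):
    """Return a list of problems; an empty list means the docs pass."""
    documented = {}
    current = None
    for line in doc_lines:
        word = line.strip()
        if CMD_RE.match(word):
            # A bare BPF_* identifier starts the docs for one command.
            current = word
            documented[current] = set()
        elif current and word in REQUIRED:
            documented[current].add(word)

    problems = []
    # Every enum value needs an entry with all required sections.
    for name in enum_names:
        if name not in documented:
            problems.append(f"{name}: no documentation")
            continue
        for section in REQUIRED:
            if section not in documented[name]:
                problems.append(f"{name}: missing '{section}' section")
    # Documentation for a command that is not in the enum is stale.
    for name in documented:
        if name not in enum_names:
            problems.append(f"{name}: documented but not in enum")
    return problems


# Illustrative input only; the descriptions are not the real header text.
DOCS = """
BPF_MAP_CREATE
    Description
        Create a map.
    Return
        A new file descriptor.
BPF_PROG_LOAD
    Description
        Verify and load a program.
""".splitlines()

ENUM = ["BPF_MAP_CREATE", "BPF_PROG_LOAD", "BPF_OBJ_PIN"]

if __name__ == "__main__":
    for problem in lint_command_docs(DOCS, ENUM):
        print(problem)
```

A selftest could fail the build whenever the returned list is
non-empty, independent of which tool renders the HTML.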

Cheers,
Joe

Toke Høiland-Jørgensen Feb. 18, 2021, 11:33 a.m. UTC | #5
Joe Stringer <joe@cilium.io> writes:

> On Wed, Feb 17, 2021 at 5:55 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Joe Stringer <joe@wand.net.nz> writes:
>> > Given the relative success of the process around bpf-helpers(7) to
>> > encourage developers to document their user-facing changes, in this
>> > patch series I explore applying this technique to bpf(2) as well.
>> > Unfortunately, even with bpf(2) being so out-of-date, there is still a
>> > lot of content to convert over. In particular, I've identified at least
>> > the following aspects of the bpf syscall which could individually be
>> > generated from separate documentation in the header:
>> > * BPF syscall commands
>> > * BPF map types
>> > * BPF program types
>> > * BPF attachment points
>>
>> Does this also include program subtypes (AKA expected_attach_type?)
>
> I seem to have left my lawyerly "including, but not limited to..."
> language at home today ;-) . Of course, I can add that to the list.

Great, thanks! :)

-Toke

Jonathan Corbet Feb. 18, 2021, 7:26 p.m. UTC | #6
Joe Stringer <joe@cilium.io> writes:

> Hey Jon, thanks for the feedback. Absolutely, what you say makes
> sense. The intent here wasn't to come up with something new. Based on
> your prompt from this email (and a quick look at your KR '19
> presentation), I'm hearing a few observations:
> * Storing the documentation in the code next to the things that
> contributors edit is a reasonable approach to documentation of this
> kind.

Yes.  At least, it's what we do for a lot of our other documentation in
the kernel.  The assumption is that it will encourage developers to keep
the docs current; in my experience that's somewhat optimistic, but
optimism is good...:)

> * This series currently proposes adding some new Makefile
> infrastructure. However, good use of the "kernel-doc" sphinx directive
> + "DOC: " incantations in the header should be able to achieve the
> same without adding such dedicated build system logic to the tree.

If it can, I would certainly prefer to see it used - or extended, if
need be, to meet your needs.

> * The changes in patch 16 here extended Documentation/bpf/index.rst,
> but to assist in improving the overall kernel documentation
> organisation / hierarchy, you would prefer to instead introduce a
> dedicated Documentation/userspace-api/bpf/ directory where the bpf
> uAPI portions can be documented.

An objective I've been working on for some years is reorienting the
documentation with a focus on who the readers are.  We've tended to
organize it by subsystem, requiring people to wade through a lot of
stuff that isn't useful to them.  So yes, my preference would be to
document the kernel's user-space API in the relevant manual.

That said, I do tend to get pushback here at times, and the BPF API is
arguably a bit different than much of the rest.  So while the above
preference exists and is reasonably strong, the higher priority is to
get good, current documentation in *somewhere* so that it's available to
users.  I don't want to make life too difficult for people working
toward that goal, even if I would paint it a different color.

> In addition to this, today the bpf helpers documentation is built
> through the bpftool build process as well as the runtime bpf
> selftests, mostly as a way to ensure that the API documentation
> conforms to a particular style, which then assists with the generation
> of ReStructured Text and troff output. I can probably simplify the
> make infrastructure involved in triggering the bpf docs build for bpf
> subsystem developers and maintainers. I think there's likely still
> interest from bpf folks to keep that particular dependency in the
> selftests like today and even extend it to include this new
> Documentation, so that we don't either introduce text that fails
> against the parser or in some other way break the parser. Whether that
> validation is done by scripts/kernel-doc or scripts/bpf_helpers_doc.py
> doesn't make a big difference to me, other than I have zero experience
> with Perl. My first impressions are that the bpf_helpers_doc.py is
> providing stricter formatting requirements than what "DOC: " +
> kernel-doc would provide, so my baseline inclination would be to keep
> those patches to enhance that script and use that for the validation
> side (help developers with stronger linting feedback), then use
> kernel-doc for the actual html docs generation side, which would help
> to satisfy your concern around duplication of the documentation build
> systems.

This doesn't sound entirely unreasonable.  I wonder if the BPF helper
could be built into a Sphinx extension to make it easy to pull that
information into the docs build.  The advantage there is that it can be
done in Python :)

Looking forward to the next set.

Thanks,

jon

Joe Stringer Feb. 18, 2021, 9:53 p.m. UTC | #7
On Thu, Feb 18, 2021 at 11:49 AM Jonathan Corbet <corbet@lwn.net> wrote:
>
> Joe Stringer <joe@cilium.io> writes:
> > * The changes in patch 16 here extended Documentation/bpf/index.rst,
> > but to assist in improving the overall kernel documentation
> > organisation / hierarchy, you would prefer to instead introduce a
> > dedicated Documentation/userspace-api/bpf/ directory where the bpf
> > uAPI portions can be documented.
>
> An objective I've been working on for some years is reorienting the
> documentation with a focus on who the readers are.  We've tended to
> organize it by subsystem, requiring people to wade through a lot of
> stuff that isn't useful to them.  So yes, my preference would be to
> document the kernel's user-space API in the relevant manual.
>
> That said, I do tend to get pushback here at times, and the BPF API is
> arguably a bit different than much of the rest.  So while the above
> preference exists and is reasonably strong, the higher priority is to
> get good, current documentation in *somewhere* so that it's available to
> users.  I don't want to make life too difficult for people working
> toward that goal, even if I would paint it a different color.

Sure, I'm all for it. Unless I hear alternative feedback I'll roll it
under Documentation/userspace-api/bpf in the next revision.

> > In addition to this, today the bpf helpers documentation is built
> > through the bpftool build process as well as the runtime bpf
> > selftests, mostly as a way to ensure that the API documentation
> > conforms to a particular style, which then assists with the generation
> > of ReStructured Text and troff output. I can probably simplify the
> > make infrastructure involved in triggering the bpf docs build for bpf
> > subsystem developers and maintainers. I think there's likely still
> > interest from bpf folks to keep that particular dependency in the
> > selftests like today and even extend it to include this new
> > Documentation, so that we don't either introduce text that fails
> > against the parser or in some other way break the parser. Whether that
> > validation is done by scripts/kernel-doc or scripts/bpf_helpers_doc.py
> > doesn't make a big difference to me, other than I have zero experience
> > with Perl. My first impressions are that the bpf_helpers_doc.py is
> > providing stricter formatting requirements than what "DOC: " +
> > kernel-doc would provide, so my baseline inclination would be to keep
> > those patches to enhance that script and use that for the validation
> > side (help developers with stronger linting feedback), then use
> > kernel-doc for the actual html docs generation side, which would help
> > to satisfy your concern around duplication of the documentation build
> > systems.
>
> This doesn't sound entirely unreasonable.  I wonder if the BPF helper
> could be built into an sphinx extension to make it easy to pull that
> information into the docs build.  The advantage there is that it can be
> done in Python :)

Probably doable, it's already written in Python. One thing at a time
though... :)

Cheers,
Joe