[0/2] docs: define policy forbidding use of "AI" / LLM code generators

Message ID 20231123114026.3589272-1-berrange@redhat.com (mailing list archive)

Daniel P. Berrangé Nov. 23, 2023, 11:40 a.m. UTC
This patch kicks the hornet's nest of AI / LLM code generators.

With the increasing interest in code generators in recent times,
it is inevitable that QEMU contributions will include AI generated
code. Thus far we have remained silent on the matter. Given that
everyone knows these tools exist, our current position has to be
considered tacit acceptance of the use of AI generated code in QEMU.

The question for the project is whether that is a good position for
QEMU to take or not?

IANAL, but I like to think I'm reasonably proficient at understanding
open source licensing. I am not inherently against the use of AI tools,
rather I am anti-risk. I also want to see OSS licenses respected and
complied with.

AFAICT, at its current state of (im)maturity, the question of licensing
of AI code generator output does not have a broadly accepted / settled
legal position. There is an inherent bias/self-interest from the vendors
promoting their usage, who tend to minimize/dismiss the legal questions.
From my POV, this puts such tools in a position of elevated legal risk.

Given the fuzziness over the legal position of generated code from
such tools, I don't consider it credible (today) for a contributor
to assert compliance with the DCO terms (b) or (c) - i.e. certifying
that the contribution is based on suitably licensed prior work, or was
passed on by someone who so certified - which is a stated prerequisite
for QEMU accepting patches, when a patch includes (or is derived from)
AI generated code.

By implication, I think that QEMU must (for now) explicitly decline
to (knowingly) accept AI generated code.

Perhaps a few years down the line the legal uncertainty will have
reduced and we can re-evaluate this policy.

NB I say "knowingly" because as reviewers we do ultimately have to
trust what contributors tell us about their patch origins, and this
has always been the case. Our policies and the use of the DCO serve
to shift legal risk/exposure away from the project. They let us as a
project demonstrate that we took steps to set out our expectations /
requirements, and thus any contravention is the responsibility of the
contributor involved, not the project.

Discuss...

Daniel P. Berrangé (2):
  docs: introduce dedicated page about code provenance / sign-off
  docs: define policy forbidding use of "AI" / LLM code generators

 docs/devel/code-provenance.rst    | 237 ++++++++++++++++++++++++++++++
 docs/devel/index-process.rst      |   1 +
 docs/devel/submitting-a-patch.rst |  18 +--
 3 files changed, 241 insertions(+), 15 deletions(-)
 create mode 100644 docs/devel/code-provenance.rst