From patchwork Thu Nov 23 11:40:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= X-Patchwork-Id: 13466138 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 898F3C5AD4C for ; Thu, 23 Nov 2023 11:41:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r684g-0005w4-BW; Thu, 23 Nov 2023 06:40:50 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r684d-0005vO-T2 for qemu-devel@nongnu.org; Thu, 23 Nov 2023 06:40:47 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r684U-0004Hy-Lt for qemu-devel@nongnu.org; Thu, 23 Nov 2023 06:40:47 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1700739635; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NSGIKfprX2QpznBovjIN7HcBVUdvBDseKBE8O7SSePo=; b=GgzVUIA6BHwWwt/iaYv2tIS12qwvt9btsofvgyf4BundfKEMjIRtJeLNvXBXL3ULyqR9+z p8Szq8L47+PfYBU+LRteL1wlfmu0J75jIKLTQTEwHl7kvkutaXkVzjNf1D5YZukkburw7z QsKl0mbt32FrrR855Dxshl+GFbFynGs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-663-fQFTABnfPZOnF0c7HErzJg-1; Thu, 23 Nov 2023 06:40:34 -0500 X-MC-Unique: fQFTABnfPZOnF0c7HErzJg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 90A128058ED; Thu, 23 Nov 2023 11:40:33 +0000 (UTC) Received: from toolbox.redhat.com (unknown [10.42.28.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id D60982166B26; Thu, 23 Nov 2023 11:40:30 +0000 (UTC) From: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= To: qemu-devel@nongnu.org Cc: Richard Henderson , Alexander Graf , =?utf-8?q?Alex_Benn=C3=A9e?= , Paolo Bonzini , "Michael S. Tsirkin" , Markus Armbruster , =?utf-8?q?Phil_Mathieu-Daud=C3=A9?= , Stefan Hajnoczi , Thomas Huth , Kevin Wolf , Gerd Hoffmann , Mark Cave-Ayland , Peter Maydell , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= Subject: [PATCH 1/2] docs: introduce dedicated page about code provenance / sign-off Date: Thu, 23 Nov 2023 11:40:25 +0000 Message-ID: <20231123114026.3589272-2-berrange@redhat.com> In-Reply-To: <20231123114026.3589272-1-berrange@redhat.com> References: <20231123114026.3589272-1-berrange@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=berrange@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.058, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_FILL_THIS_FORM_SHORT=0.01, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Currently we have a short paragraph saying that patches must include a Signed-off-by line, and merely link to the kernel documentation. The linked kernel docs have alot of content beyond the part about sign-off an thus is misleading/distracting to QEMU contributors. This introduces a dedicated 'code-provenance' page in QEMU talking about why we require sign-off, explaining the other tags we commonly use, and what to do in some edge cases. Signed-off-by: Daniel P. Berrangé --- docs/devel/code-provenance.rst | 197 ++++++++++++++++++++++++++++++ docs/devel/index-process.rst | 1 + docs/devel/submitting-a-patch.rst | 18 +-- 3 files changed, 201 insertions(+), 15 deletions(-) create mode 100644 docs/devel/code-provenance.rst diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst new file mode 100644 index 0000000000..b4591a2dec --- /dev/null +++ b/docs/devel/code-provenance.rst @@ -0,0 +1,197 @@ +.. _code-provenance: + +Code provenance +=============== + +Certifying patch submissions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The QEMU community **mandates** all contributors to certify provenance +of patch submissions they make to the project. To put it another way, +contributors must indicate that they are legally permitted to contribute +to the project. + +Certification is achieved with a low overhead by adding a single line +to the bottom of every git commit:: + + Signed-off-by: YOUR NAME + +This existence of this line asserts that the author of the patch is +contributing in accordance with the `Developer's Certificate of +Origin `__: + +.. _dco: + +:: + Developer's Certificate of Origin 1.1 + + By making a contribution to this project, I certify that: + + (a) The contribution was created in whole or in part by me and I + have the right to submit it under the open source license + indicated in the file; or + + (b) The contribution is based upon previous work that, to the best + of my knowledge, is covered under an appropriate open source + license and I have the right under that license to submit that + work with modifications, whether created in whole or in part + by me, under the same open source license (unless I am + permitted to submit under a different license), as indicated + in the file; or + + (c) The contribution was provided directly to me by some other + person who certified (a), (b) or (c) and I have not modified + it. + + (d) I understand and agree that this project and the contribution + are public and that a record of the contribution (including all + personal information I submit with it, including my sign-off) is + maintained indefinitely and may be redistributed consistent with + this project or the open source license(s) involved. + +It is generally expected that the name and email addresses used in one +of the ``Signed-off-by`` lines, matches that of the git commit ``Author`` +field. If the person sending the mail is also one of the patch authors, +it is further expected that the mail ``From:`` line name & address match +one of the ``Signed-off-by`` lines. + +Multiple authorship +~~~~~~~~~~~~~~~~~~~ + +It is not uncommon for a patch to have contributions from multiple +authors. In such a scenario, a git commit will usually be expected +to have a ``Signed-off-by`` line for each contributor involved in +creatin of the patch. Some edge cases: + + * The non-primary author's contributions were so trivial that + they can be considered not subject to copyright. In this case + the secondary authors need not include a ``Signed-off-by``. + + This case most commonly applies where QEMU reviewers give short + snippets of code as suggested fixes to a patch. The reviewers + don't need to have their own ``Signed-off-by`` added unless + their code suggestion was unusually large. + + * Both contributors work for the same employer and the employer + requires copyright assignment. + + It can be said that in this case a ``Signed-off-by`` is indicating + that the person has permission to contributeo from their employer + who is the copyright holder. It is none the less still preferrable + to include a ``Signed-off-by`` for each contributor, as in some + countries employees are not able to assign copyright to their + employer, and it also covers any time invested outside working + hours. + +Other commit tags +~~~~~~~~~~~~~~~~~ + +While the ``Signed-off-by`` tag is mandatory, there are a number of +other tags that are commonly used during QEMU development + + * **``Reviewed-by``**: when a QEMU community member reviews a patch + on the mailing list, if they consider the patch acceptable, they + should send an email reply containing a ``Reviewed-by`` tag. + + NB: a subsystem maintainer sending a pull request would replace + their own ``Reviewed-by`` with another ``Signed-off-by`` + + * **``Acked-by``**: when a QEMU subsystem maintainer approves a patch + that touches their subsystem, but intends to allow a different + maintainer to queue it and send a pull request, they would send + a mail containing a ``Acked-by`` tag. + + * **``Tested-by``**: when a QEMU community member has functionally + tested the behaviour of the patch in some manner, they should + send an email reply conmtaning a ``Tested-by`` tag. + + * **``Reported-by``**: when a QEMU community member reports a problem + via the mailing list, or some other informal channel that is not + the issue tracker, it is good practice to credit them by including + a ``Reported-by`` tag on any patch fixing the issue. When the + problem is reported via the GitLab issue tracker, however, it is + sufficient to just include a link to the issue. + +Subsystem maintainer requirements +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When a subsystem maintainer accepts a patch from a contributor, in +addition to the normal code review points, they are expected to validate +the presence of suitable ``Signed-off-by`` tags. + +At the time they queue the patch in their subsystem tree, the maintainer +**MUST** also then add their own ``Signed-off-by`` to indicate that they +have done the aforementioned validation. + +The subsystem maintainer submitting a pull request is **NOT** expected to +have a ``Reviewed-by`` tag on the patch, since this is implied by their +own ``Signed-off-by``. + +Tools for adding ``Signed-of-by`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There are a variety of ways tools can support adding ``Signed-off-by`` +tags for patches, avoiding the need for contributors to manually +type in this repetitive text each time. + +git commands +^^^^^^^^^^^^ + +When creating, or amending, a commit the ``-s`` flag to ``git commit`` +will append a suitable line matching the configuring git author +details. + +If preparing patches using the ``git format-patch`` tool, the ``-s`` +flag can be used to append a suitable line in the emails it creates, +without modifying the local commits. Alternatively to modify the +local commits on a branch en-mass:: + + git rebase master -x 'git commit --amend --no-edit -s' + +emacs +^^^^^ + +In the file ``$HOME/.emacs.d/abbrev_defs`` add:: + + (define-abbrev-table 'global-abbrev-table + '( + ("8rev" "Reviewed-by: YOUR NAME " nil 1) + ("8ack" "Acked-by: YOUR NAME " nil 1) + ("8test" "Tested-by: YOUR NAME " nil 1) + ("8sob" "Signed-off-by: YOUR NAME " nil 1) + )) + +with this change, if you type (for example) ``8rev`` followed +by ```` or ```` it will expand to the whole phrase. + +vim +^^^ + +In the file ``$HOME/.vimrc`` add:: + + iabbrev 8rev Reviewed-by: YOUR NAME + iabbrev 8ack Acked-by: YOUR NAME + iabbrev 8test Tested-by: YOUR NAME + iabbrev 8sob Signed-off-by: YOUR NAME + +with this change, if you type (for example) ``8rev`` followed +by ```` or ```` it will expand to the whole phrase. + +Re-starting abandoned work +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For a variety of reasons there are some patches that get submitted to +QEMU but never merged. An unrelated contributor may decide (months or +years later) to continue working from the abandoned patch and re-submit +it with extra changes. + +If the abandoned patch already had a ``Signed-off-by`` from the original +author this **must** be preserved. The new contributor **must** then add +their own ``Signed-off-by`` after the original one if they made any +further changes to it. It is common to include a comment just prior to +the new ``Signed-off-by`` indicating what extra changes were made. For +example:: + + Signed-off-by: Some Person + [Rebased and added support for 'foo'] + Signed-off-by: New Person diff --git a/docs/devel/index-process.rst b/docs/devel/index-process.rst index 362f97ee30..b54e58105e 100644 --- a/docs/devel/index-process.rst +++ b/docs/devel/index-process.rst @@ -13,6 +13,7 @@ Notes about how to interact with the community and how and where to submit patch maintainers style submitting-a-patch + code-provenance trivial-patches stable-process submitting-a-pull-request diff --git a/docs/devel/submitting-a-patch.rst b/docs/devel/submitting-a-patch.rst index c641d948f1..ec541b3d15 100644 --- a/docs/devel/submitting-a-patch.rst +++ b/docs/devel/submitting-a-patch.rst @@ -322,21 +322,9 @@ Patch emails must include a ``Signed-off-by:`` line Your patches **must** include a Signed-off-by: line. This is a hard requirement because it's how you say "I'm legally okay to contribute -this and happy for it to go into QEMU". The process is modelled after -the `Linux kernel -`__ -policy. - -If you wrote the patch, make sure your "From:" and "Signed-off-by:" -lines use the same spelling. It's okay if you subscribe or contribute to -the list via more than one address, but using multiple addresses in one -commit just confuses things. If someone else wrote the patch, git will -include a "From:" line in the body of the email (different from your -envelope From:) that will give credit to the correct author; but again, -that author's Signed-off-by: line is mandatory, with the same spelling. - -There are various tooling options for automatically adding these tags -include using ``git commit -s`` or ``git format-patch -s``. For more +this and happy for it to go into QEMU". For full guidance, read the +:ref:`code-provenance` documentation. + information see `SubmittingPatches 1.12 `__. From patchwork Thu Nov 23 11:40:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= X-Patchwork-Id: 13466137 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 384A3C5AD4C for ; Thu, 23 Nov 2023 11:41:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r684f-0005w2-Mn; Thu, 23 Nov 2023 06:40:49 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r684d-0005vC-AN for qemu-devel@nongnu.org; Thu, 23 Nov 2023 06:40:47 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r684X-0004II-OU for qemu-devel@nongnu.org; Thu, 23 Nov 2023 06:40:47 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1700739641; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TU+mWXdvCG/zKRvuLfb9OFK0nSxAvYHF8L24T7DhUtU=; b=dLHzGvP9CrmhiDyYMX3RTpGE2WkJw6zAcx1ItE9yUwwKtThdBE338GNIAKYQYB1qYZhXgf tKCspYy9wpKc0EXo/TyqFmU6oX/xDHbyZzpDhD6o9vwGYeMC0psL1A1D0LM0tx0kogRzzo dYaIIrOGjTGtCP2xSxBmVOeiMeuJVkk= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-175-vYD_6QCPN2aA-EgEvlgnzg-1; Thu, 23 Nov 2023 06:40:37 -0500 X-MC-Unique: vYD_6QCPN2aA-EgEvlgnzg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E5B123806702; Thu, 23 Nov 2023 11:40:36 +0000 (UTC) Received: from toolbox.redhat.com (unknown [10.42.28.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id E64B72166B26; Thu, 23 Nov 2023 11:40:33 +0000 (UTC) From: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= To: qemu-devel@nongnu.org Cc: Richard Henderson , Alexander Graf , =?utf-8?q?Alex_Benn=C3=A9e?= , Paolo Bonzini , "Michael S. Tsirkin" , Markus Armbruster , =?utf-8?q?Phil_Mathieu-Daud=C3=A9?= , Stefan Hajnoczi , Thomas Huth , Kevin Wolf , Gerd Hoffmann , Mark Cave-Ayland , Peter Maydell , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= Subject: [PATCH 2/2] docs: define policy forbidding use of "AI" / LLM code generators Date: Thu, 23 Nov 2023 11:40:26 +0000 Message-ID: <20231123114026.3589272-3-berrange@redhat.com> In-Reply-To: <20231123114026.3589272-1-berrange@redhat.com> References: <20231123114026.3589272-1-berrange@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=berrange@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.058, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org There has been an explosion of interest in so called "AI" (LLM) code generators in the past year or so. Thus far though, this is has not been matched by a broadly accepted legal interpretation of the licensing implications for code generator outputs. While the vendors may claim there is no problem and a free choice of license is possible, they have an inherent conflict of interest in promoting this interpretation. More broadly there is, as yet, no broad consensus on the licensing implications of code generators trained on inputs under a wide variety of licenses. The DCO requires contributors to assert they have the right to contribute under the designated project license. Given the lack of consensus on the licensing of "AI" (LLM) code generator output, it is not considered credible to assert compliance with the DCO clause (b) or (c) where a patch includes such generated code. This patch thus defines a policy that the QEMU project will not accept contributions where use of "AI" (LLM) code generators is either known, or suspected. Signed-off-by: Daniel P. Berrangé Acked-by: Kevin Wolf Reviewed-by: Stefan Hajnoczi --- docs/devel/code-provenance.rst | 40 ++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst index b4591a2dec..a6e42c6b1b 100644 --- a/docs/devel/code-provenance.rst +++ b/docs/devel/code-provenance.rst @@ -195,3 +195,43 @@ example:: Signed-off-by: Some Person [Rebased and added support for 'foo'] Signed-off-by: New Person + +Use of "AI" (LLM) code generators +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +TL;DR: + + **Current QEMU project policy is to DECLINE any contributions + which are believed to include or derive from "AI" (LLM) + generated code.** + +The existence of "AI" (`Large Language Model `__ +/ LLM) code generators raises a number of difficult legal questions, a +number of which impact on Open Source projects. As noted earlier, the +QEMU community requires that contributors certify their patch submissions +are made in accordance with the rules of the :ref:`dco` (DCO). When a +patch contains "AI" generated code this raises difficulties with code +provenence and thus DCO compliance. + +To satisfy the DCO, the patch contributor has to fully understand +the origins and license of code they are contributing to QEMU. The +license terms that should apply to the output of an "AI" code generator +are ill-defined, given that both training data and operation of the +"AI" are typically opaque to the user. Even where the training data +is said to all be open source, it will likely be under a wide variety +of license terms. + +While the vendor's of "AI" code generators may promote the idea that +code output can be taken under a free choice of license, this is not +yet considered to be a generally accepted, nor tested, legal opinion. + +With this in mind, the QEMU maintainers does not consider it is +currently possible to comply with DCO terms (b) or (c) for most "AI" +generated code. + +The QEMU maintainers thus require that contributors refrain from using +"AI" code generators on patches intended to be submitted to the project, +and will decline any contribution if use of "AI" is known or suspected. + +Examples of tools impacted by this policy includes both GitHub CoPilot, +and ChatGPT, amongst many others which are less well known.