diff mbox series

[v3,1/1] scripts: Add add-maintainer.py

Message ID 141b9fcab2208ace3001df4fc10e3dfd42b9f5d9.1693037031.git.quic_gurus@quicinc.com (mailing list archive)
State Not Applicable
Headers show
Series Add add-maintainer.py script | expand

Commit Message

Guru Das Srinagesh Aug. 26, 2023, 8:07 a.m. UTC
This script runs get_maintainer.py on a given patch file (or multiple
patch files) and adds its output to the patch file in place with the
appropriate email headers "To: " or "Cc: " as the case may be. These new
headers are added after the "From: " line in the patch.

Currently, for a single patch, maintainers and reviewers are added as
"To: ", mailing lists and all other roles are added as "Cc: ".

For a series of patches, however, a set-union scheme is employed in
order to solve the all-too-common problem of ending up sending only
subsets of a patch series to some lists, which results in important
pieces of context such as the cover letter (or other patches in the
series) being dropped from those lists. This scheme is as follows:

- Create set-union of all maintainers and reviewers from all patches and
  use this to do the following per patch:
  - add only that specific patch's maintainers and reviewers as "To: "
  - add the other maintainers and reviewers from the other patches as "Cc: "

- Create set-union of all mailing lists corresponding to all patches and
  add this to all patches as "Cc: "

- Create set-union of all other roles corresponding to all patches and
  add this to all patches as "Cc: "

Please note that patch files that don't have any "Maintainer"s or
"Reviewers" explicitly listed in their `get_maintainer.pl` output will
not have any "To: " entries added to them; developers are expected to
manually make edits to the added entries in such cases to convert some
"Cc: " entries to "To: " as desired.

The script is quiet by default (only prints errors) and its verbosity
can be adjusted via an optional parameter.

Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
---
 MAINTAINERS               |   5 ++
 scripts/add-maintainer.py | 164 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 169 insertions(+)
 create mode 100755 scripts/add-maintainer.py

Comments

Randy Dunlap Aug. 27, 2023, 4:44 p.m. UTC | #1
Hi,

On 8/26/23 01:07, Guru Das Srinagesh wrote:
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0903d87b17cb..b670e9733f03 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8721,6 +8721,11 @@ M:	Joe Perches <joe@perches.com>
>  S:	Maintained
>  F:	scripts/get_maintainer.pl
>  

The MAINTAINERS file should be maintained in alphabetical order,
so this is not in the correct place.

> +ADD MAINTAINER SCRIPT
> +M:	Guru Das Srinagesh <quic_gurus@quicinc.com>
> +S:	Maintained
> +F:	scripts/add-maintainer.py
> +
>  GFS2 FILE SYSTEM
>  M:	Bob Peterson <rpeterso@redhat.com>
>  M:	Andreas Gruenbacher <agruenba@redhat.com>
Jani Nikula Aug. 28, 2023, 8:14 a.m. UTC | #2
On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
> This script runs get_maintainer.py on a given patch file (or multiple
> patch files) and adds its output to the patch file in place with the
> appropriate email headers "To: " or "Cc: " as the case may be. These new
> headers are added after the "From: " line in the patch.

FWIW, I personally prefer tooling to operate on git branches and commits
than patches. For me, the patches are just an intermediate step in
getting the commits from my git branch to the mailing list. That's not
where I add the Cc's, but rather in the commits in my local branch,
where they're preserved. YMMV.

BR,
Jani.


>
> Currently, for a single patch, maintainers and reviewers are added as
> "To: ", mailing lists and all other roles are added as "Cc: ".
>
> For a series of patches, however, a set-union scheme is employed in
> order to solve the all-too-common problem of ending up sending only
> subsets of a patch series to some lists, which results in important
> pieces of context such as the cover letter (or other patches in the
> series) being dropped from those lists. This scheme is as follows:
>
> - Create set-union of all maintainers and reviewers from all patches and
>   use this to do the following per patch:
>   - add only that specific patch's maintainers and reviewers as "To: "
>   - add the other maintainers and reviewers from the other patches as "Cc: "
>
> - Create set-union of all mailing lists corresponding to all patches and
>   add this to all patches as "Cc: "
>
> - Create set-union of all other roles corresponding to all patches and
>   add this to all patches as "Cc: "
>
> Please note that patch files that don't have any "Maintainer"s or
> "Reviewers" explicitly listed in their `get_maintainer.pl` output will
> not have any "To: " entries added to them; developers are expected to
> manually make edits to the added entries in such cases to convert some
> "Cc: " entries to "To: " as desired.
>
> The script is quiet by default (only prints errors) and its verbosity
> can be adjusted via an optional parameter.
>
> Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
> ---
>  MAINTAINERS               |   5 ++
>  scripts/add-maintainer.py | 164 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 169 insertions(+)
>  create mode 100755 scripts/add-maintainer.py
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0903d87b17cb..b670e9733f03 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8721,6 +8721,11 @@ M:	Joe Perches <joe@perches.com>
>  S:	Maintained
>  F:	scripts/get_maintainer.pl
>  
> +ADD MAINTAINER SCRIPT
> +M:	Guru Das Srinagesh <quic_gurus@quicinc.com>
> +S:	Maintained
> +F:	scripts/add-maintainer.py
> +
>  GFS2 FILE SYSTEM
>  M:	Bob Peterson <rpeterso@redhat.com>
>  M:	Andreas Gruenbacher <agruenba@redhat.com>
> diff --git a/scripts/add-maintainer.py b/scripts/add-maintainer.py
> new file mode 100755
> index 000000000000..5a5cc9482b06
> --- /dev/null
> +++ b/scripts/add-maintainer.py
> @@ -0,0 +1,164 @@
> +#! /usr/bin/env python3
> +
> +import argparse
> +import logging
> +import os
> +import sys
> +import subprocess
> +import re
> +
> +def gather_maintainers_of_file(patch_file):
> +    all_entities_of_patch = dict()
> +
> +    # Run get_maintainer.pl on patch file
> +    logging.info("GET: Patch: {}".format(os.path.basename(patch_file)))
> +    cmd = ['scripts/get_maintainer.pl']
> +    cmd.extend([patch_file])
> +
> +    try:
> +        p = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
> +    except:
> +        sys.exit(1)
> +
> +    logging.debug("\n{}".format(p.stdout.decode()))
> +
> +    entries = p.stdout.decode().splitlines()
> +
> +    maintainers = []
> +    lists = []
> +    others = []
> +
> +    for entry in entries:
> +        entity = entry.split('(')[0].strip()
> +        if any(role in entry for role in ["maintainer", "reviewer"]):
> +            maintainers.append(entity)
> +        elif "list" in entry:
> +            lists.append(entity)
> +        else:
> +            others.append(entity)
> +
> +    all_entities_of_patch["maintainers"] = set(maintainers)
> +    all_entities_of_patch["lists"] = set(lists)
> +    all_entities_of_patch["others"] = set(others)
> +
> +    return all_entities_of_patch
> +
> +def find_pattern_in_lines(pattern, lines):
> +    index = 0
> +    for line in lines:
> +        if re.search(pattern, line):
> +            break;
> +        index = index + 1
> +
> +    if index == len(lines):
> +        logging.error("Couldn't find pattern {} in patch".format(pattern))
> +        sys.exit(1)
> +
> +    return index
> +
> +def add_maintainers_to_file(patch_file, entities_per_file, all_entities_union):
> +    logging.info("ADD: Patch: {}".format(os.path.basename(patch_file)))
> +
> +    # For each patch:
> +    # - Add all lists from all patches in series as Cc:
> +    # - Add all others from all patches in series as Cc:
> +    # - Add only maintainers of that patch as To:
> +    # - Add maintainers of other patches in series as Cc:
> +
> +    lists = list(all_entities_union["all_lists"])
> +    others = list(all_entities_union["all_others"])
> +    file_maintainers = all_entities_union["all_maintainers"].intersection(entities_per_file[os.path.basename(patch_file)].get("maintainers"))
> +    other_maintainers = all_entities_union["all_maintainers"].difference(entities_per_file[os.path.basename(patch_file)].get("maintainers"))
> +
> +    # Specify email headers appropriately
> +    cc_lists        = ["Cc: " + l for l in lists]
> +    cc_others       = ["Cc: " + o for o in others]
> +    to_maintainers  = ["To: " + m for m in file_maintainers]
> +    cc_maintainers  = ["Cc: " + om for om in other_maintainers]
> +    logging.debug("Cc Lists:\n{}".format('\n'.join(cc_lists)))
> +    logging.debug("Cc Others:\n{}".format('\n'.join(cc_others)))
> +    logging.debug("Cc Maintainers:\n{}".format('\n'.join(cc_maintainers) or None))
> +    logging.debug("To Maintainers:\n{}\n".format('\n'.join(to_maintainers) or None))
> +
> +    # Edit patch file in place to add maintainers
> +    with open(patch_file, "r") as pf:
> +        lines = pf.readlines()
> +
> +    # Get the index of the first "From: <email address>" line in patch
> +    from_line = find_pattern_in_lines("^(From: )(.*)<(.*)@(.*)>", lines)
> +
> +    # Insert our To: and Cc: headers after it.
> +    next_line_after_from = from_line + 1
> +
> +    for l in cc_lists:
> +        lines.insert(next_line_after_from, l + "\n")
> +    for o in cc_others:
> +        lines.insert(next_line_after_from, o + "\n")
> +    for om in cc_maintainers:
> +        lines.insert(next_line_after_from, om + "\n")
> +    for m in to_maintainers:
> +        lines.insert(next_line_after_from, m + "\n")
> +
> +    with open(patch_file, "w") as pf:
> +        pf.writelines(lines)
> +
> +def add_maintainers(patch_files):
> +    entities_per_file = dict()
> +
> +    for patch in patch_files:
> +        entities_per_file[os.path.basename(patch)] = gather_maintainers_of_file(patch)
> +
> +    all_entities_union = {"all_maintainers": set(), "all_lists": set(), "all_others": set()}
> +    for patch in patch_files:
> +        all_entities_union["all_maintainers"] = all_entities_union["all_maintainers"].union(entities_per_file[os.path.basename(patch)].get("maintainers"))
> +        all_entities_union["all_lists"] = all_entities_union["all_lists"].union(entities_per_file[os.path.basename(patch)].get("lists"))
> +        all_entities_union["all_others"] = all_entities_union["all_others"].union(entities_per_file[os.path.basename(patch)].get("others"))
> +
> +    for patch in patch_files:
> +        add_maintainers_to_file(patch, entities_per_file, all_entities_union)
> +
> +    logging.info("Maintainers added to all patch files successfully")
> +
> +def remove_to_cc_from_header(patch_files):
> +    for patch in patch_files:
> +        logging.info("UNDO: Patch: {}".format(os.path.basename(patch)))
> +        with open(patch, "r") as pf:
> +            lines = pf.readlines()
> +
> +        # Get the index of the first "From: <email address>" line in patch
> +        from_line = find_pattern_in_lines("^(From: )(.*)<(.*)@(.*)>", lines)
> +
> +        # Get the index of the first "Date: " line in patch
> +        date_line = find_pattern_in_lines("^(Date: )", lines)
> +
> +        # Delete everything in between From: and Date:
> +        # These are the lines that this script adds - any To: or Cc: anywhere
> +        # else in the patch will not be removed.
> +        del lines[(from_line + 1):date_line]
> +
> +        with open(patch, "w") as pf:
> +            pf.writelines(lines)
> +
> +    logging.info("Maintainers removed from all patch files successfully")
> +
> +def main():
> +    parser = argparse.ArgumentParser(description='Add the respective maintainers and mailing lists to patch files')
> +    parser.add_argument('patches', nargs='+', help="One or more patch files")
> +    parser.add_argument('-v', '--verbosity', choices=['debug', 'info', 'error'], default='error', help="Verbosity level of script output")
> +    parser.add_argument('-u', '--undo', action='store_true', help="Remove maintainers added by this script from patch(es)")
> +    args = parser.parse_args()
> +
> +    logging.basicConfig(level=args.verbosity.upper(), format='%(levelname)s: %(message)s')
> +
> +    for patch in args.patches:
> +        if not os.path.isfile(patch):
> +            logging.error("File does not exist: {}".format(patch))
> +            sys.exit(1)
> +
> +    if args.undo:
> +        remove_to_cc_from_header(args.patches)
> +    else:
> +        add_maintainers(args.patches)
> +
> +if __name__ == "__main__":
> +    main()
Krzysztof Kozlowski Aug. 28, 2023, 8:21 a.m. UTC | #3
On 26/08/2023 10:07, Guru Das Srinagesh wrote:
> This script runs get_maintainer.py on a given patch file (or multiple
> patch files) and adds its output to the patch file in place with the
> appropriate email headers "To: " or "Cc: " as the case may be. These new
> headers are added after the "From: " line in the patch.
> 
> Currently, for a single patch, maintainers and reviewers are added as
> "To: ", mailing lists and all other roles are added as "Cc: ".
> 
> For a series of patches, however, a set-union scheme is employed in
> order to solve the all-too-common problem of ending up sending only
> subsets of a patch series to some lists, which results in important
> pieces of context such as the cover letter (or other patches in the
> series) being dropped from those lists. This scheme is as follows:
> 
> - Create set-union of all maintainers and reviewers from all patches and
>   use this to do the following per patch:
>   - add only that specific patch's maintainers and reviewers as "To: "
>   - add the other maintainers and reviewers from the other patches as "Cc: "
> 
> - Create set-union of all mailing lists corresponding to all patches and
>   add this to all patches as "Cc: "
> 
> - Create set-union of all other roles corresponding to all patches and
>   add this to all patches as "Cc: "
> 
> Please note that patch files that don't have any "Maintainer"s or
> "Reviewers" explicitly listed in their `get_maintainer.pl` output will

So before you will ignoring the reviewers, right? One more reason to not
get it right...

> not have any "To: " entries added to them; developers are expected to
> manually make edits to the added entries in such cases to convert some
> "Cc: " entries to "To: " as desired.
> 
> The script is quiet by default (only prints errors) and its verbosity
> can be adjusted via an optional parameter.
> 
> Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
> ---
>  MAINTAINERS               |   5 ++
>  scripts/add-maintainer.py | 164 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 169 insertions(+)
>  create mode 100755 scripts/add-maintainer.py
> 

I do not see the benefits of this script. For me - it's unnecessarily
more complicated instead of my simple bash function which makes
everything one command less than here.
One more thing to maintain.

I don't see the benefits of this for newcomers, either. They should use
b4. b4 can do much, much more, so anyone creating his workflow should
start from b4, not from this script.

Best regards,
Krzysztof
Bjorn Andersson Aug. 28, 2023, 1:35 p.m. UTC | #4
On Mon, Aug 28, 2023 at 11:14:41AM +0300, Jani Nikula wrote:
> On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
> > This script runs get_maintainer.py on a given patch file (or multiple
> > patch files) and adds its output to the patch file in place with the
> > appropriate email headers "To: " or "Cc: " as the case may be. These new
> > headers are added after the "From: " line in the patch.
> 
> FWIW, I personally prefer tooling to operate on git branches and commits
> than patches. For me, the patches are just an intermediate step in
> getting the commits from my git branch to the mailing list. That's not
> where I add the Cc's, but rather in the commits in my local branch,
> where they're preserved. YMMV.
> 

May I ask how you add/carry the recipients in a commit?

Regards,
Bjorn
Geert Uytterhoeven Aug. 28, 2023, 1:48 p.m. UTC | #5
Hi Bjorn,

On Mon, Aug 28, 2023 at 3:37 PM Bjorn Andersson
<quic_bjorande@quicinc.com> wrote:
> On Mon, Aug 28, 2023 at 11:14:41AM +0300, Jani Nikula wrote:
> > On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
> > > This script runs get_maintainer.py on a given patch file (or multiple
> > > patch files) and adds its output to the patch file in place with the
> > > appropriate email headers "To: " or "Cc: " as the case may be. These new
> > > headers are added after the "From: " line in the patch.
> >
> > FWIW, I personally prefer tooling to operate on git branches and commits
> > than patches. For me, the patches are just an intermediate step in
> > getting the commits from my git branch to the mailing list. That's not
> > where I add the Cc's, but rather in the commits in my local branch,
> > where they're preserved. YMMV.
> >
>
> May I ask how you add/carry the recipients in a commit?

I guess below a "---" line in the commit description?

Gr{oetje,eeting}s,

                        Geert
Vlastimil Babka Aug. 28, 2023, 3:14 p.m. UTC | #6
On 8/28/23 15:48, Geert Uytterhoeven wrote:
> Hi Bjorn,
> 
> On Mon, Aug 28, 2023 at 3:37 PM Bjorn Andersson
> <quic_bjorande@quicinc.com> wrote:
>> On Mon, Aug 28, 2023 at 11:14:41AM +0300, Jani Nikula wrote:
>> > On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
>> > > This script runs get_maintainer.py on a given patch file (or multiple
>> > > patch files) and adds its output to the patch file in place with the
>> > > appropriate email headers "To: " or "Cc: " as the case may be. These new
>> > > headers are added after the "From: " line in the patch.
>> >
>> > FWIW, I personally prefer tooling to operate on git branches and commits
>> > than patches. For me, the patches are just an intermediate step in
>> > getting the commits from my git branch to the mailing list. That's not
>> > where I add the Cc's, but rather in the commits in my local branch,
>> > where they're preserved. YMMV.
>> >
>>
>> May I ask how you add/carry the recipients in a commit?
> 
> I guess below a "---" line in the commit description?

Does that do anything special in commit log? I'd expect (and I do it that
way) it's rather just adding a

Cc: Name <email>

in the tag area where s-o-b, reviewed-by etc are added.

> Gr{oetje,eeting}s,
> 
>                         Geert
>
Krzysztof Kozlowski Aug. 28, 2023, 3:23 p.m. UTC | #7
On 28/08/2023 17:14, Vlastimil Babka wrote:
> On 8/28/23 15:48, Geert Uytterhoeven wrote:
>> Hi Bjorn,
>>
>> On Mon, Aug 28, 2023 at 3:37 PM Bjorn Andersson
>> <quic_bjorande@quicinc.com> wrote:
>>> On Mon, Aug 28, 2023 at 11:14:41AM +0300, Jani Nikula wrote:
>>>> On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
>>>>> This script runs get_maintainer.py on a given patch file (or multiple
>>>>> patch files) and adds its output to the patch file in place with the
>>>>> appropriate email headers "To: " or "Cc: " as the case may be. These new
>>>>> headers are added after the "From: " line in the patch.
>>>>
>>>> FWIW, I personally prefer tooling to operate on git branches and commits
>>>> than patches. For me, the patches are just an intermediate step in
>>>> getting the commits from my git branch to the mailing list. That's not
>>>> where I add the Cc's, but rather in the commits in my local branch,
>>>> where they're preserved. YMMV.
>>>>
>>>
>>> May I ask how you add/carry the recipients in a commit?
>>
>> I guess below a "---" line in the commit description?
> 
> Does that do anything special in commit log? I'd expect (and I do it that
> way) it's rather just adding a

It does. It goes away.
> 
> Cc: Name <email>
> 
> in the tag area where s-o-b, reviewed-by etc are added.

Why storing autogenerated scripts/get_maintainer.pl CC-entries in commit
msg? The non-maintainer-output but the automated output? There is no
single need to store automated output of get_maintainers.pl in the git
log. It can be easily re-created at any given time, thus its presence in
the git history is redundant and obfuscates the log.

Best regards,
Krzysztof
Guru Das Srinagesh Aug. 28, 2023, 4:45 p.m. UTC | #8
On Aug 27 2023 09:44, Randy Dunlap wrote:
> Hi,
> 
> On 8/26/23 01:07, Guru Das Srinagesh wrote:
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 0903d87b17cb..b670e9733f03 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -8721,6 +8721,11 @@ M:	Joe Perches <joe@perches.com>
> >  S:	Maintained
> >  F:	scripts/get_maintainer.pl
> >  
> 
> The MAINTAINERS file should be maintained in alphabetical order,
> so this is not in the correct place.

Thank you - didn't know about that. Will fix.

Guru Das.
Bjorn Andersson Aug. 28, 2023, 4:50 p.m. UTC | #9
On Mon, Aug 28, 2023 at 05:23:58PM +0200, Krzysztof Kozlowski wrote:
> On 28/08/2023 17:14, Vlastimil Babka wrote:
> > On 8/28/23 15:48, Geert Uytterhoeven wrote:
> >> Hi Bjorn,
> >>
> >> On Mon, Aug 28, 2023 at 3:37 PM Bjorn Andersson
> >> <quic_bjorande@quicinc.com> wrote:
> >>> On Mon, Aug 28, 2023 at 11:14:41AM +0300, Jani Nikula wrote:
> >>>> On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
> >>>>> This script runs get_maintainer.py on a given patch file (or multiple
> >>>>> patch files) and adds its output to the patch file in place with the
> >>>>> appropriate email headers "To: " or "Cc: " as the case may be. These new
> >>>>> headers are added after the "From: " line in the patch.
> >>>>
> >>>> FWIW, I personally prefer tooling to operate on git branches and commits
> >>>> than patches. For me, the patches are just an intermediate step in
> >>>> getting the commits from my git branch to the mailing list. That's not
> >>>> where I add the Cc's, but rather in the commits in my local branch,
> >>>> where they're preserved. YMMV.
> >>>>
> >>>
> >>> May I ask how you add/carry the recipients in a commit?
> >>
> >> I guess below a "---" line in the commit description?
> > 
> > Does that do anything special in commit log? I'd expect (and I do it that
> > way) it's rather just adding a
> 
> It does. It goes away.

Afaict, it's verbatim copied into the .patch, which would mean that it
goes away when the patch is applied on the other side.

But it's still going to be in the email (followed by another ---), so
unless there's another step later in the process that cleans this up I
it looks ugly, and not very useful - unless I'm missing something.

> > 
> > Cc: Name <email>
> > 
> > in the tag area where s-o-b, reviewed-by etc are added.
> 
> Why storing autogenerated scripts/get_maintainer.pl CC-entries in commit
> msg? The non-maintainer-output but the automated output? There is no
> single need to store automated output of get_maintainers.pl in the git
> log. It can be easily re-created at any given time, thus its presence in
> the git history is redundant and obfuscates the log.
> 

Fully agree to this. In particular if the patch is going to be sent as
part of a series the recipients list won't be accurate for any patch.

The case I was looking for was the case where I want to make sure to
include a specific person, beyond the get_maintainers output. So pretty
much the usual Cc: tag in the commit message, but I don't necessarily
want to write this fact into the git history.

Regards,
Bjorn
Guru Das Srinagesh Aug. 28, 2023, 4:50 p.m. UTC | #10
On Aug 28 2023 11:14, Jani Nikula wrote:
> On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
> > This script runs get_maintainer.py on a given patch file (or multiple
> > patch files) and adds its output to the patch file in place with the
> > appropriate email headers "To: " or "Cc: " as the case may be. These new
> > headers are added after the "From: " line in the patch.
> 
> FWIW, I personally prefer tooling to operate on git branches and commits
> than patches. For me, the patches are just an intermediate step in
> getting the commits from my git branch to the mailing list. That's not
> where I add the Cc's, but rather in the commits in my local branch,
> where they're preserved. YMMV.

Could you please share more details about your workflow? Specifically, how you
fit `get_maintainer.pl` in it.

Thank you.

Guru Das.
Guru Das Srinagesh Aug. 28, 2023, 5:56 p.m. UTC | #11
On Aug 28 2023 10:21, Krzysztof Kozlowski wrote:
> On 26/08/2023 10:07, Guru Das Srinagesh wrote:
> > This script runs get_maintainer.py on a given patch file (or multiple
> > patch files) and adds its output to the patch file in place with the
> > appropriate email headers "To: " or "Cc: " as the case may be. These new
> > headers are added after the "From: " line in the patch.
> > 
> > Currently, for a single patch, maintainers and reviewers are added as
> > "To: ", mailing lists and all other roles are added as "Cc: ".
> > 
> > For a series of patches, however, a set-union scheme is employed in
> > order to solve the all-too-common problem of ending up sending only
> > subsets of a patch series to some lists, which results in important
> > pieces of context such as the cover letter (or other patches in the
> > series) being dropped from those lists. This scheme is as follows:
> > 
> > - Create set-union of all maintainers and reviewers from all patches and
> >   use this to do the following per patch:
> >   - add only that specific patch's maintainers and reviewers as "To: "
> >   - add the other maintainers and reviewers from the other patches as "Cc: "
> > 
> > - Create set-union of all mailing lists corresponding to all patches and
> >   add this to all patches as "Cc: "
> > 
> > - Create set-union of all other roles corresponding to all patches and
> >   add this to all patches as "Cc: "
> > 
> > Please note that patch files that don't have any "Maintainer"s or
> > "Reviewers" explicitly listed in their `get_maintainer.pl` output will
> 
> So before you will ignoring the reviewers, right? One more reason to not
> get it right...

In v2, Reviewers were added as "Cc:" whereas here in v3 they are added as
"To:". Not sure where you're getting "ignoring the reviewers" from.

> > not have any "To: " entries added to them; developers are expected to
> > manually make edits to the added entries in such cases to convert some
> > "Cc: " entries to "To: " as desired.
> > 
> > The script is quiet by default (only prints errors) and its verbosity
> > can be adjusted via an optional parameter.
> > 
> > Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
> > ---
> >  MAINTAINERS               |   5 ++
> >  scripts/add-maintainer.py | 164 ++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 169 insertions(+)
> >  create mode 100755 scripts/add-maintainer.py
> > 
> 
> I do not see the benefits of this script. For me - it's unnecessarily
> more complicated instead of my simple bash function which makes

Your function adds mailing lists also in "To:" which is not ideal, in my view.
You've mentioned before that To or Cc doesn't matter [1] which I disagree
with: it doesn't matter, why does Cc exist as a concept at all?

[1] https://lore.kernel.org/lkml/af1eca37-9fd2-1e83-ab27-ebb51480904b@linaro.org/

Thank you.

Guru Das.
Krzysztof Kozlowski Aug. 28, 2023, 5:59 p.m. UTC | #12
On 28/08/2023 19:56, Guru Das Srinagesh wrote:
> On Aug 28 2023 10:21, Krzysztof Kozlowski wrote:
>> On 26/08/2023 10:07, Guru Das Srinagesh wrote:
>>> This script runs get_maintainer.py on a given patch file (or multiple
>>> patch files) and adds its output to the patch file in place with the
>>> appropriate email headers "To: " or "Cc: " as the case may be. These new
>>> headers are added after the "From: " line in the patch.
>>>
>>> Currently, for a single patch, maintainers and reviewers are added as
>>> "To: ", mailing lists and all other roles are added as "Cc: ".
>>>
>>> For a series of patches, however, a set-union scheme is employed in
>>> order to solve the all-too-common problem of ending up sending only
>>> subsets of a patch series to some lists, which results in important
>>> pieces of context such as the cover letter (or other patches in the
>>> series) being dropped from those lists. This scheme is as follows:
>>>
>>> - Create set-union of all maintainers and reviewers from all patches and
>>>   use this to do the following per patch:
>>>   - add only that specific patch's maintainers and reviewers as "To: "
>>>   - add the other maintainers and reviewers from the other patches as "Cc: "
>>>
>>> - Create set-union of all mailing lists corresponding to all patches and
>>>   add this to all patches as "Cc: "
>>>
>>> - Create set-union of all other roles corresponding to all patches and
>>>   add this to all patches as "Cc: "
>>>
>>> Please note that patch files that don't have any "Maintainer"s or
>>> "Reviewers" explicitly listed in their `get_maintainer.pl` output will
>>
>> So before you will ignoring the reviewers, right? One more reason to not
>> get it right...
> 
> In v2, Reviewers were added as "Cc:" whereas here in v3 they are added as
> "To:". Not sure where you're getting "ignoring the reviewers" from.
> 
>>> not have any "To: " entries added to them; developers are expected to
>>> manually make edits to the added entries in such cases to convert some
>>> "Cc: " entries to "To: " as desired.
>>>
>>> The script is quiet by default (only prints errors) and its verbosity
>>> can be adjusted via an optional parameter.
>>>
>>> Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
>>> ---
>>>  MAINTAINERS               |   5 ++
>>>  scripts/add-maintainer.py | 164 ++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 169 insertions(+)
>>>  create mode 100755 scripts/add-maintainer.py
>>>
>>
>> I do not see the benefits of this script. For me - it's unnecessarily
>> more complicated instead of my simple bash function which makes
> 
> Your function adds mailing lists also in "To:" which is not ideal, in my view.
> You've mentioned before that To or Cc doesn't matter [1] which I disagree
> with: it doesn't matter, why does Cc exist as a concept at all?

To/Cc does not matter when sending new patch, because maintainers know
they are maintainers of which parts. I know what I handle.

To/Cc still makes sense in other cases, when for example you ping
someone asking for reviews. It also makes much more sense in all
corpo-worlds where such distinction is obvious. We are not a corpo-world
here.


Best regards,
Krzysztof
Mark Brown Aug. 28, 2023, 7:41 p.m. UTC | #13
On Mon, Aug 28, 2023 at 07:59:54PM +0200, Krzysztof Kozlowski wrote:
> On 28/08/2023 19:56, Guru Das Srinagesh wrote:

> > Your function adds mailing lists also in "To:" which is not ideal, in my view.
> > You've mentioned before that To or Cc doesn't matter [1] which I disagree
> > with: it doesn't matter, why does Cc exist as a concept at all?

> To/Cc does not matter when sending new patch, because maintainers know
> they are maintainers of which parts. I know what I handle.

That might be true for you (and also is for me) but I know there are
people who pay attention to if they're in the To: for various reasons, I
gather it's mostly about triaging their emails and is especially likely
in cases where trees have overlaps in the code they cover.
Krzysztof Kozlowski Aug. 28, 2023, 7:45 p.m. UTC | #14
On 28/08/2023 21:41, Mark Brown wrote:
> On Mon, Aug 28, 2023 at 07:59:54PM +0200, Krzysztof Kozlowski wrote:
>> On 28/08/2023 19:56, Guru Das Srinagesh wrote:
> 
>>> Your function adds mailing lists also in "To:" which is not ideal, in my view.
>>> You've mentioned before that To or Cc doesn't matter [1] which I disagree
>>> with: it doesn't matter, why does Cc exist as a concept at all?
> 
>> To/Cc does not matter when sending new patch, because maintainers know
>> they are maintainers of which parts. I know what I handle.
> 
> That might be true for you (and also is for me) but I know there are
> people who pay attention to if they're in the To: for various reasons, I
> gather it's mostly about triaging their emails and is especially likely
> in cases where trees have overlaps in the code they cover.

True, there can be cases where people pay attention to addresses of
emails. Just like there are cases where people pay attention to "To/Cc"
difference.

In my short experience with a few patches sent, no one complained to me
that I put him/her/they in "To" field of a patch instead of "Cc" (with
remark to not spamming to much, so imagine I send a patch for regulator
and DTS). Big, multi-subsystem patchsets are different case and this
script does not solve it either.

Anyway, if it is not ideal for Guru, I wonder how his LKML maintainer
filters work that it is not ideal? What is exactly not ideal in
maintainer workflow?

Best regards,
Krzysztof
Jani Nikula Aug. 29, 2023, 7:38 a.m. UTC | #15
On Mon, 28 Aug 2023, Bjorn Andersson <quic_bjorande@quicinc.com> wrote:
> On Mon, Aug 28, 2023 at 05:23:58PM +0200, Krzysztof Kozlowski wrote:
>> On 28/08/2023 17:14, Vlastimil Babka wrote:
>> > On 8/28/23 15:48, Geert Uytterhoeven wrote:
>> >> Hi Bjorn,
>> >>
>> >> On Mon, Aug 28, 2023 at 3:37 PM Bjorn Andersson
>> >> <quic_bjorande@quicinc.com> wrote:
>> >>> On Mon, Aug 28, 2023 at 11:14:41AM +0300, Jani Nikula wrote:
>> >>>> On Sat, 26 Aug 2023, Guru Das Srinagesh <quic_gurus@quicinc.com> wrote:
>> >>>>> This script runs get_maintainer.py on a given patch file (or multiple
>> >>>>> patch files) and adds its output to the patch file in place with the
>> >>>>> appropriate email headers "To: " or "Cc: " as the case may be. These new
>> >>>>> headers are added after the "From: " line in the patch.
>> >>>>
>> >>>> FWIW, I personally prefer tooling to operate on git branches and commits
>> >>>> than patches. For me, the patches are just an intermediate step in
>> >>>> getting the commits from my git branch to the mailing list. That's not
>> >>>> where I add the Cc's, but rather in the commits in my local branch,
>> >>>> where they're preserved. YMMV.
>> >>>>
>> >>>
>> >>> May I ask how you add/carry the recipients in a commit?
>> >>
>> >> I guess below a "---" line in the commit description?
>> > 
>> > Does that do anything special in commit log? I'd expect (and I do it that
>> > way) it's rather just adding a
>> 
>> It does. It goes away.
>
> Afaict, it's verbatim copied into the .patch, which would mean that it
> goes away when the patch is applied on the other side.
>
> But it's still going to be in the email (followed by another ---), so
> unless there's another step later in the process that cleans this up I
> it looks ugly, and not very useful - unless I'm missing something.
>
>> > 
>> > Cc: Name <email>
>> > 
>> > in the tag area where s-o-b, reviewed-by etc are added.
>> 
>> Why storing autogenerated scripts/get_maintainer.pl CC-entries in commit
>> msg? The non-maintainer-output but the automated output? There is no
>> single need to store automated output of get_maintainers.pl in the git
>> log. It can be easily re-created at any given time, thus its presence in
>> the git history is redundant and obfuscates the log.
>> 
>
> Fully agree to this. In particular if the patch is going to be sent as
> part of a series the recipients list won't be accurate for any patch.
>
> The case I was looking for was the case where I want to make sure to
> include a specific person, beyond the get_maintainers output. So pretty
> much the usual Cc: tag in the commit message, but I don't necessarily
> want to write this fact into the git history.

The point is, I *never* use get_maintainer.pl output as-is.

BR,
Jani.
Guru Das Srinagesh Aug. 29, 2023, 11:16 p.m. UTC | #16
On Aug 28 2023 21:45, Krzysztof Kozlowski wrote:
> On 28/08/2023 21:41, Mark Brown wrote:
> > On Mon, Aug 28, 2023 at 07:59:54PM +0200, Krzysztof Kozlowski wrote:
> >> On 28/08/2023 19:56, Guru Das Srinagesh wrote:
> > 
> >>> Your function adds mailing lists also in "To:" which is not ideal, in my view.
> >>> You've mentioned before that To or Cc doesn't matter [1] which I disagree
> >>> with: it doesn't matter, why does Cc exist as a concept at all?
> > 
> >> To/Cc does not matter when sending new patch, because maintainers know
> >> they are maintainers of which parts. I know what I handle.
> > 
> > That might be true for you (and also is for me) but I know there are
> > people who pay attention to if they're in the To: for various reasons, I
> > gather it's mostly about triaging their emails and is especially likely
> > in cases where trees have overlaps in the code they cover.
> 
> True, there can be cases where people pay attention to addresses of
> emails. Just like there are cases where people pay attention to "To/Cc"
> difference.
> 
> In my short experience with a few patches sent, no one complained to me
> that I put him/her/they in "To" field of a patch instead of "Cc" (with
> remark to not spamming to much, so imagine I send a patch for regulator
> and DTS). Big, multi-subsystem patchsets are different case and this
> script does not solve it either.

Not sure what you mean by "does not solve it" - what is the problem being
referred to here?

In case of multi-subsystem patches in a series, the commit message of this
patch explains exactly the actions taken.

> Anyway, if it is not ideal for Guru, I wonder how his LKML maintainer
> filters work that it is not ideal? What is exactly not ideal in
> maintainer workflow?

I am not a maintainer - only an individual contributor - and as such, even
though I may get patches on files I've contributed to, I deeply appreciate the
distinction between being Cc-ed in a patch vs To-ed in one. The distinction
being that if I'm in "To:" I ascribe higher priority to it and lesser if I'm in
"Cc:".

If this script is accepted and gains adoption, maintainers like yourself will
only be To-ed in patches that touch files that you're a direct "Maintainer" or
"Reviewer" of. For all other patches in the series you'll be in "Cc:". I
imagine that this can be very useful regardless of the specifics of your
workflow.

Also, lists should just be in "Cc:" - that's just my personal preference, but
one that I'm sure others also share.

Guru Das.
Krzysztof Kozlowski Aug. 30, 2023, 7:11 a.m. UTC | #17
On 30/08/2023 01:16, Guru Das Srinagesh wrote:
> On Aug 28 2023 21:45, Krzysztof Kozlowski wrote:
>> On 28/08/2023 21:41, Mark Brown wrote:
>>> On Mon, Aug 28, 2023 at 07:59:54PM +0200, Krzysztof Kozlowski wrote:
>>>> On 28/08/2023 19:56, Guru Das Srinagesh wrote:
>>>
>>>>> Your function adds mailing lists also in "To:" which is not ideal, in my view.
>>>>> You've mentioned before that To or Cc doesn't matter [1] which I disagree
>>>>> with: it doesn't matter, why does Cc exist as a concept at all?
>>>
>>>> To/Cc does not matter when sending new patch, because maintainers know
>>>> they are maintainers of which parts. I know what I handle.
>>>
>>> That might be true for you (and also is for me) but I know there are
>>> people who pay attention to if they're in the To: for various reasons, I
>>> gather it's mostly about triaging their emails and is especially likely
>>> in cases where trees have overlaps in the code they cover.
>>
>> True, there can be cases where people pay attention to addresses of
>> emails. Just like there are cases where people pay attention to "To/Cc"
>> difference.
>>
>> In my short experience with a few patches sent, no one complained to me
>> that I put him/her/they in "To" field of a patch instead of "Cc" (with
>> remark to not spamming to much, so imagine I send a patch for regulator
>> and DTS). Big, multi-subsystem patchsets are different case and this
>> script does not solve it either.
> 
> Not sure what you mean by "does not solve it" - what is the problem being
> referred to here?

Exactly, no one even knows what problem you want to solve by swapping
To-Cc between patches...

> 
> In case of multi-subsystem patches in a series, the commit message of this
> patch explains exactly the actions taken.
> 
>> Anyway, if it is not ideal for Guru, I wonder how his LKML maintainer
>> filters work that it is not ideal? What is exactly not ideal in
>> maintainer workflow?
> 
> I am not a maintainer - only an individual contributor - and as such, even
> though I may get patches on files I've contributed to, I deeply appreciate the
> distinction between being Cc-ed in a patch vs To-ed in one. The distinction
> being that if I'm in "To:" I ascribe higher priority to it and lesser if I'm in
> "Cc:".

That's your feeling, quite subjective. I understand it comes from
corporate world, but again...

> 
> If this script is accepted and gains adoption, maintainers like yourself will
> only be To-ed in patches that touch files that you're a direct "Maintainer" or
> "Reviewer" of. 

It will not get traction because:
1. People should use b4, not this script.
2. Remaining people will just use get_maintainers.pl.
3. People cannot get right even basic commands, so we will never be able
to rely on To or Cc distinction. I can give you example: my email
address in get_maintainers.pl is a bit different. Does it matter? Often
not. Entire bunch of folks were Ccing me on different address. Even
though every tool told them not to...

> For all other patches in the series you'll be in "Cc:". I
> imagine that this can be very useful regardless of the specifics of your
> workflow.

Zero usefulness for me.

Best regards,
Krzysztof
Mark Brown Aug. 30, 2023, 11:22 a.m. UTC | #18
On Tue, Aug 29, 2023 at 04:16:39PM -0700, Guru Das Srinagesh wrote:

> If this script is accepted and gains adoption, maintainers like yourself will
> only be To-ed in patches that touch files that you're a direct "Maintainer" or
> "Reviewer" of. For all other patches in the series you'll be in "Cc:". I
> imagine that this can be very useful regardless of the specifics of your
> workflow.

Given that b4 solves a lot more problems and is getting quite widely
adopted it's probably going to be more effective to look at trying to
get this implemented there.  That might still mean a separate script
that b4 can hook into, but it's probably important that whatever you do
can be used easily with b4.
Jeff Johnson Aug. 30, 2023, 2:16 p.m. UTC | #19
On 8/30/2023 4:22 AM, Mark Brown wrote:
> On Tue, Aug 29, 2023 at 04:16:39PM -0700, Guru Das Srinagesh wrote:
> 
>> If this script is accepted and gains adoption, maintainers like yourself will
>> only be To-ed in patches that touch files that you're a direct "Maintainer" or
>> "Reviewer" of. For all other patches in the series you'll be in "Cc:". I
>> imagine that this can be very useful regardless of the specifics of your
>> workflow.
> 
> Given that b4 solves a lot more problems and is getting quite widely
> adopted it's probably going to be more effective to look at trying to
> get this implemented there.  That might still mean a separate script
> that b4 can hook into, but it's probably important that whatever you do
> can be used easily with b4.

As someone who has recently moved to using b4 I second this comment.
b4 makes it so much easier to maintain patch versioning and to add the 
right folks to the review. And most folks aren't performing tree-wide 
changes so the per-patch customization doesn't seem to be a big win.

/jeff
Pavan Kondeti Sept. 26, 2023, 12:02 p.m. UTC | #20
On Sat, Aug 26, 2023 at 01:07:42AM -0700, Guru Das Srinagesh wrote:
> +def gather_maintainers_of_file(patch_file):
> +    all_entities_of_patch = dict()
> +
> +    # Run get_maintainer.pl on patch file
> +    logging.info("GET: Patch: {}".format(os.path.basename(patch_file)))
> +    cmd = ['scripts/get_maintainer.pl']
> +    cmd.extend([patch_file])
> +
> +    try:
> +        p = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
> +    except:
> +        sys.exit(1)
> +
> +    logging.debug("\n{}".format(p.stdout.decode()))
> +
> +    entries = p.stdout.decode().splitlines()
> +
> +    maintainers = []
> +    lists = []
> +    others = []
> +
> +    for entry in entries:
> +        entity = entry.split('(')[0].strip()
> +        if any(role in entry for role in ["maintainer", "reviewer"]):
> +            maintainers.append(entity)
> +        elif "list" in entry:
> +            lists.append(entity)
> +        else:
> +            others.append(entity)
> +
> +    all_entities_of_patch["maintainers"] = set(maintainers)
> +    all_entities_of_patch["lists"] = set(lists)
> +    all_entities_of_patch["others"] = set(others)
> +
> +    return all_entities_of_patch
> +

FYI, there are couple of issues found while playing around.

- Some entries in MAINTAINERS could be "supporter"
- When names contain ("company"), the script fails to extract name.

Thanks,
Pavan

diff --git a/scripts/add-maintainer.py b/scripts/add-maintainer.py
index 5a5cc9482b06..6aa5e7941172 100755
--- a/scripts/add-maintainer.py
+++ b/scripts/add-maintainer.py
@@ -29,8 +29,8 @@ def gather_maintainers_of_file(patch_file):
     others = []

     for entry in entries:
-        entity = entry.split('(')[0].strip()
-        if any(role in entry for role in ["maintainer", "reviewer"]):
+        entity = entry.rsplit('(', 1)[0].strip()
+        if any(role in entry for role in ["maintainer", "reviewer", "supporter"]):
             maintainers.append(entity)
         elif "list" in entry:
             lists.append(entity)


Thanks,
Pavan
Pavan Kondeti Sept. 27, 2023, 4:51 a.m. UTC | #21
On Mon, Aug 28, 2023 at 10:21:32AM +0200, Krzysztof Kozlowski wrote:
> On 26/08/2023 10:07, Guru Das Srinagesh wrote:
> > This script runs get_maintainer.py on a given patch file (or multiple
> > patch files) and adds its output to the patch file in place with the
> > appropriate email headers "To: " or "Cc: " as the case may be. These new
> > headers are added after the "From: " line in the patch.
> > 
> > Currently, for a single patch, maintainers and reviewers are added as
> > "To: ", mailing lists and all other roles are added as "Cc: ".
> > 
> > For a series of patches, however, a set-union scheme is employed in
> > order to solve the all-too-common problem of ending up sending only
> > subsets of a patch series to some lists, which results in important
> > pieces of context such as the cover letter (or other patches in the
> > series) being dropped from those lists. This scheme is as follows:
> > 
> > - Create set-union of all maintainers and reviewers from all patches and
> >   use this to do the following per patch:
> >   - add only that specific patch's maintainers and reviewers as "To: "
> >   - add the other maintainers and reviewers from the other patches as "Cc: "
> > 
> > - Create set-union of all mailing lists corresponding to all patches and
> >   add this to all patches as "Cc: "
> > 
> > - Create set-union of all other roles corresponding to all patches and
> >   add this to all patches as "Cc: "
> > 
> > Please note that patch files that don't have any "Maintainer"s or
> > "Reviewers" explicitly listed in their `get_maintainer.pl` output will
> 
> So before you will ignoring the reviewers, right? One more reason to not
> get it right...
> 
> > not have any "To: " entries added to them; developers are expected to
> > manually make edits to the added entries in such cases to convert some
> > "Cc: " entries to "To: " as desired.
> > 
> > The script is quiet by default (only prints errors) and its verbosity
> > can be adjusted via an optional parameter.
> > 
> > Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
> > ---
> >  MAINTAINERS               |   5 ++
> >  scripts/add-maintainer.py | 164 ++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 169 insertions(+)
> >  create mode 100755 scripts/add-maintainer.py
> > 
> 
> I do not see the benefits of this script. For me - it's unnecessarily
> more complicated instead of my simple bash function which makes
> everything one command less than here.
> One more thing to maintain.

Thanks for your bash script. I slightly modified it to keep maintainers
in To and rest in Cc. 

git send-email --dry-run --to="$(scripts/get_maintainer.pl --no-multiline --separator=, --no-r --no-l --no-git --no-roles --no-rolestats --no-git-fallback *.patch)" --cc="$(scripts/get_maintainer.pl --no-multiline --separator=, --no-m --no-git --no-roles --no-rolestats --no-git-fallback *.patch)" *.patch

> 
> I don't see the benefits of this for newcomers, either. They should use
> b4. b4 can do much, much more, so anyone creating his workflow should
> start from b4, not from this script.

The ROI from b4 is huge. Totally agree that one should definitely consider b4 for
kernel patch submissions. I really liked b4 appending a unique "change-id"
across patch-versions. This is the single most feature I miss out from gerrit where
all revisions of a patch are glued with a common change-id. If everyone uses, a common
change-id for all versions of series, it is super easy to dig into archives.

Thanks,
Pavan

Thanks,
Pavan
Pavan Kondeti Sept. 27, 2023, 4:54 a.m. UTC | #22
On Tue, Sep 26, 2023 at 05:32:10PM +0530, Pavan Kondeti wrote:
> On Sat, Aug 26, 2023 at 01:07:42AM -0700, Guru Das Srinagesh wrote:
> > +def gather_maintainers_of_file(patch_file):
> > +    all_entities_of_patch = dict()
> > +
> > +    # Run get_maintainer.pl on patch file
> > +    logging.info("GET: Patch: {}".format(os.path.basename(patch_file)))
> > +    cmd = ['scripts/get_maintainer.pl']
> > +    cmd.extend([patch_file])
> > +
> > +    try:
> > +        p = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
> > +    except:
> > +        sys.exit(1)
> > +
> > +    logging.debug("\n{}".format(p.stdout.decode()))
> > +
> > +    entries = p.stdout.decode().splitlines()
> > +
> > +    maintainers = []
> > +    lists = []
> > +    others = []
> > +
> > +    for entry in entries:
> > +        entity = entry.split('(')[0].strip()
> > +        if any(role in entry for role in ["maintainer", "reviewer"]):
> > +            maintainers.append(entity)
> > +        elif "list" in entry:
> > +            lists.append(entity)
> > +        else:
> > +            others.append(entity)
> > +
> > +    all_entities_of_patch["maintainers"] = set(maintainers)
> > +    all_entities_of_patch["lists"] = set(lists)
> > +    all_entities_of_patch["others"] = set(others)
> > +
> > +    return all_entities_of_patch
> > +
> 
> FYI, there are couple of issues found while playing around.
> 
> - Some entries in MAINTAINERS could be "supporter"
> - When names contain ("company"), the script fails to extract name.
> 
> Thanks,
> Pavan
> 
> diff --git a/scripts/add-maintainer.py b/scripts/add-maintainer.py
> index 5a5cc9482b06..6aa5e7941172 100755
> --- a/scripts/add-maintainer.py
> +++ b/scripts/add-maintainer.py
> @@ -29,8 +29,8 @@ def gather_maintainers_of_file(patch_file):
>      others = []
> 
>      for entry in entries:
> -        entity = entry.split('(')[0].strip()
> -        if any(role in entry for role in ["maintainer", "reviewer"]):
> +        entity = entry.rsplit('(', 1)[0].strip()
> +        if any(role in entry for role in ["maintainer", "reviewer", "supporter"]):
>              maintainers.append(entity)
>          elif "list" in entry:
>              lists.append(entity)
> 
> 

The %s/split/rsplit trades one bug for another :-( , pls ignore the
diff, when entries have multiple braces "()" , the script does not work
as epxected.

Thanks,
Pavan
Guru Das Srinagesh Sept. 27, 2023, 10:44 p.m. UTC | #23
On Sep 27 2023 10:21, Pavan Kondeti wrote:
> On Mon, Aug 28, 2023 at 10:21:32AM +0200, Krzysztof Kozlowski wrote:
> > On 26/08/2023 10:07, Guru Das Srinagesh wrote:
> > > This script runs get_maintainer.py on a given patch file (or multiple
> > > patch files) and adds its output to the patch file in place with the
> > > appropriate email headers "To: " or "Cc: " as the case may be. These new
> > > headers are added after the "From: " line in the patch.
> > > 
> > > Currently, for a single patch, maintainers and reviewers are added as
> > > "To: ", mailing lists and all other roles are added as "Cc: ".
> > > 
> > > For a series of patches, however, a set-union scheme is employed in
> > > order to solve the all-too-common problem of ending up sending only
> > > subsets of a patch series to some lists, which results in important
> > > pieces of context such as the cover letter (or other patches in the
> > > series) being dropped from those lists. This scheme is as follows:
> > > 
> > > - Create set-union of all maintainers and reviewers from all patches and
> > >   use this to do the following per patch:
> > >   - add only that specific patch's maintainers and reviewers as "To: "
> > >   - add the other maintainers and reviewers from the other patches as "Cc: "
> > > 
> > > - Create set-union of all mailing lists corresponding to all patches and
> > >   add this to all patches as "Cc: "
> > > 
> > > - Create set-union of all other roles corresponding to all patches and
> > >   add this to all patches as "Cc: "
> > > 
> > > Please note that patch files that don't have any "Maintainer"s or
> > > "Reviewers" explicitly listed in their `get_maintainer.pl` output will
> > 
> > So before you will ignoring the reviewers, right? One more reason to not
> > get it right...
> > 
> > > not have any "To: " entries added to them; developers are expected to
> > > manually make edits to the added entries in such cases to convert some
> > > "Cc: " entries to "To: " as desired.
> > > 
> > > The script is quiet by default (only prints errors) and its verbosity
> > > can be adjusted via an optional parameter.
> > > 
> > > Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
> > > ---
> > >  MAINTAINERS               |   5 ++
> > >  scripts/add-maintainer.py | 164 ++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 169 insertions(+)
> > >  create mode 100755 scripts/add-maintainer.py
> > > 
> > 
> > I do not see the benefits of this script. For me - it's unnecessarily
> > more complicated instead of my simple bash function which makes
> > everything one command less than here.
> > One more thing to maintain.
> 
> Thanks for your bash script. I slightly modified it to keep maintainers
> in To and rest in Cc. 
> 
> git send-email --dry-run --to="$(scripts/get_maintainer.pl --no-multiline --separator=, --no-r --no-l --no-git --no-roles --no-rolestats --no-git-fallback *.patch)" --cc="$(scripts/get_maintainer.pl --no-multiline --separator=, --no-m --no-git --no-roles --no-rolestats --no-git-fallback *.patch)" *.patch

FYI, b4 has "b4.send-auto-to-cmd" and "b4.send-auto-cc-cmd" [1] as well to
override its default behaviour. You can just set the above two b4 config
options as you like and it will do the right thing.

Guru Das.
Guru Das Srinagesh Sept. 27, 2023, 10:47 p.m. UTC | #24
On Sep 26 2023 17:32, Pavan Kondeti wrote:
> On Sat, Aug 26, 2023 at 01:07:42AM -0700, Guru Das Srinagesh wrote:
> > +def gather_maintainers_of_file(patch_file):
> > +    all_entities_of_patch = dict()
> > +
> > +    # Run get_maintainer.pl on patch file
> > +    logging.info("GET: Patch: {}".format(os.path.basename(patch_file)))
> > +    cmd = ['scripts/get_maintainer.pl']
> > +    cmd.extend([patch_file])
> > +
> > +    try:
> > +        p = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
> > +    except:
> > +        sys.exit(1)
> > +
> > +    logging.debug("\n{}".format(p.stdout.decode()))
> > +
> > +    entries = p.stdout.decode().splitlines()
> > +
> > +    maintainers = []
> > +    lists = []
> > +    others = []
> > +
> > +    for entry in entries:
> > +        entity = entry.split('(')[0].strip()
> > +        if any(role in entry for role in ["maintainer", "reviewer"]):
> > +            maintainers.append(entity)
> > +        elif "list" in entry:
> > +            lists.append(entity)
> > +        else:
> > +            others.append(entity)
> > +
> > +    all_entities_of_patch["maintainers"] = set(maintainers)
> > +    all_entities_of_patch["lists"] = set(lists)
> > +    all_entities_of_patch["others"] = set(others)
> > +
> > +    return all_entities_of_patch
> > +
> 
> FYI, there are couple of issues found while playing around.

Thanks for testing this out :) I am no longer working on this due to pushback
from the maintainers in favour of b4.

> 
> - Some entries in MAINTAINERS could be "supporter"

This was intentional - I didn't want to include "supporter"s.

> - When names contain ("company"), the script fails to extract name.

Interesting... I had not tested this out.

In any case, I am not devoting resources to work on this unless I see some
interest from maintainers, which, as it stands currently, is nil.

Thanks for the support, Pavan.

Guru Das.
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 0903d87b17cb..b670e9733f03 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8721,6 +8721,11 @@  M:	Joe Perches <joe@perches.com>
 S:	Maintained
 F:	scripts/get_maintainer.pl
 
+ADD MAINTAINER SCRIPT
+M:	Guru Das Srinagesh <quic_gurus@quicinc.com>
+S:	Maintained
+F:	scripts/add-maintainer.py
+
 GFS2 FILE SYSTEM
 M:	Bob Peterson <rpeterso@redhat.com>
 M:	Andreas Gruenbacher <agruenba@redhat.com>
diff --git a/scripts/add-maintainer.py b/scripts/add-maintainer.py
new file mode 100755
index 000000000000..5a5cc9482b06
--- /dev/null
+++ b/scripts/add-maintainer.py
@@ -0,0 +1,164 @@ 
+#! /usr/bin/env python3
+
+import argparse
+import logging
+import os
+import sys
+import subprocess
+import re
+
+def gather_maintainers_of_file(patch_file):
+    all_entities_of_patch = dict()
+
+    # Run get_maintainer.pl on patch file
+    logging.info("GET: Patch: {}".format(os.path.basename(patch_file)))
+    cmd = ['scripts/get_maintainer.pl']
+    cmd.extend([patch_file])
+
+    try:
+        p = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
+    except:
+        sys.exit(1)
+
+    logging.debug("\n{}".format(p.stdout.decode()))
+
+    entries = p.stdout.decode().splitlines()
+
+    maintainers = []
+    lists = []
+    others = []
+
+    for entry in entries:
+        entity = entry.split('(')[0].strip()
+        if any(role in entry for role in ["maintainer", "reviewer"]):
+            maintainers.append(entity)
+        elif "list" in entry:
+            lists.append(entity)
+        else:
+            others.append(entity)
+
+    all_entities_of_patch["maintainers"] = set(maintainers)
+    all_entities_of_patch["lists"] = set(lists)
+    all_entities_of_patch["others"] = set(others)
+
+    return all_entities_of_patch
+
+def find_pattern_in_lines(pattern, lines):
+    index = 0
+    for line in lines:
+        if re.search(pattern, line):
+            break;
+        index = index + 1
+
+    if index == len(lines):
+        logging.error("Couldn't find pattern {} in patch".format(pattern))
+        sys.exit(1)
+
+    return index
+
+def add_maintainers_to_file(patch_file, entities_per_file, all_entities_union):
+    logging.info("ADD: Patch: {}".format(os.path.basename(patch_file)))
+
+    # For each patch:
+    # - Add all lists from all patches in series as Cc:
+    # - Add all others from all patches in series as Cc:
+    # - Add only maintainers of that patch as To:
+    # - Add maintainers of other patches in series as Cc:
+
+    lists = list(all_entities_union["all_lists"])
+    others = list(all_entities_union["all_others"])
+    file_maintainers = all_entities_union["all_maintainers"].intersection(entities_per_file[os.path.basename(patch_file)].get("maintainers"))
+    other_maintainers = all_entities_union["all_maintainers"].difference(entities_per_file[os.path.basename(patch_file)].get("maintainers"))
+
+    # Specify email headers appropriately
+    cc_lists        = ["Cc: " + l for l in lists]
+    cc_others       = ["Cc: " + o for o in others]
+    to_maintainers  = ["To: " + m for m in file_maintainers]
+    cc_maintainers  = ["Cc: " + om for om in other_maintainers]
+    logging.debug("Cc Lists:\n{}".format('\n'.join(cc_lists)))
+    logging.debug("Cc Others:\n{}".format('\n'.join(cc_others)))
+    logging.debug("Cc Maintainers:\n{}".format('\n'.join(cc_maintainers) or None))
+    logging.debug("To Maintainers:\n{}\n".format('\n'.join(to_maintainers) or None))
+
+    # Edit patch file in place to add maintainers
+    with open(patch_file, "r") as pf:
+        lines = pf.readlines()
+
+    # Get the index of the first "From: <email address>" line in patch
+    from_line = find_pattern_in_lines("^(From: )(.*)<(.*)@(.*)>", lines)
+
+    # Insert our To: and Cc: headers after it.
+    next_line_after_from = from_line + 1
+
+    for l in cc_lists:
+        lines.insert(next_line_after_from, l + "\n")
+    for o in cc_others:
+        lines.insert(next_line_after_from, o + "\n")
+    for om in cc_maintainers:
+        lines.insert(next_line_after_from, om + "\n")
+    for m in to_maintainers:
+        lines.insert(next_line_after_from, m + "\n")
+
+    with open(patch_file, "w") as pf:
+        pf.writelines(lines)
+
+def add_maintainers(patch_files):
+    entities_per_file = dict()
+
+    for patch in patch_files:
+        entities_per_file[os.path.basename(patch)] = gather_maintainers_of_file(patch)
+
+    all_entities_union = {"all_maintainers": set(), "all_lists": set(), "all_others": set()}
+    for patch in patch_files:
+        all_entities_union["all_maintainers"] = all_entities_union["all_maintainers"].union(entities_per_file[os.path.basename(patch)].get("maintainers"))
+        all_entities_union["all_lists"] = all_entities_union["all_lists"].union(entities_per_file[os.path.basename(patch)].get("lists"))
+        all_entities_union["all_others"] = all_entities_union["all_others"].union(entities_per_file[os.path.basename(patch)].get("others"))
+
+    for patch in patch_files:
+        add_maintainers_to_file(patch, entities_per_file, all_entities_union)
+
+    logging.info("Maintainers added to all patch files successfully")
+
+def remove_to_cc_from_header(patch_files):
+    for patch in patch_files:
+        logging.info("UNDO: Patch: {}".format(os.path.basename(patch)))
+        with open(patch, "r") as pf:
+            lines = pf.readlines()
+
+        # Get the index of the first "From: <email address>" line in patch
+        from_line = find_pattern_in_lines("^(From: )(.*)<(.*)@(.*)>", lines)
+
+        # Get the index of the first "Date: " line in patch
+        date_line = find_pattern_in_lines("^(Date: )", lines)
+
+        # Delete everything in between From: and Date:
+        # These are the lines that this script adds - any To: or Cc: anywhere
+        # else in the patch will not be removed.
+        del lines[(from_line + 1):date_line]
+
+        with open(patch, "w") as pf:
+            pf.writelines(lines)
+
+    logging.info("Maintainers removed from all patch files successfully")
+
+def main():
+    parser = argparse.ArgumentParser(description='Add the respective maintainers and mailing lists to patch files')
+    parser.add_argument('patches', nargs='+', help="One or more patch files")
+    parser.add_argument('-v', '--verbosity', choices=['debug', 'info', 'error'], default='error', help="Verbosity level of script output")
+    parser.add_argument('-u', '--undo', action='store_true', help="Remove maintainers added by this script from patch(es)")
+    args = parser.parse_args()
+
+    logging.basicConfig(level=args.verbosity.upper(), format='%(levelname)s: %(message)s')
+
+    for patch in args.patches:
+        if not os.path.isfile(patch):
+            logging.error("File does not exist: {}".format(patch))
+            sys.exit(1)
+
+    if args.undo:
+        remove_to_cc_from_header(args.patches)
+    else:
+        add_maintainers(args.patches)
+
+if __name__ == "__main__":
+    main()