[0/3] Use "allowlist" and "denylist" tree-wide

Message ID pull.1274.git.1657718450.gitgitgadget@gmail.com (mailing list archive)

Message

Derrick Stolee via GitGitGadget July 13, 2022, 1:20 p.m. UTC
The terms "allowlist" and "denylist" are self-defining. One "allows" things
while the other "denies" things.

These are better terms over "whitelist" and "blacklist" which require prior
knowledge of the terms or cultural expectations around what each color
"means".

This series replaces (almost) all uses of these terms with allowlist and
denylist. The only exceptions are in release notes for older Git versions.

There is no meaningful functional change, although one logging message in
daemon.c is changed and I'm unfamiliar with exactly how that might be
consumed.

Some recommend using "blocklist", but I personally prefer "denylist". To me,
"blocking" something seems permanent. "Denying" something seems ephemeral
and related to a specific request being denied due to some (possibly
mutable) state. I'm open to suggestions here. There are many fewer
replacements needed in this case.

I did not make any change to our CodingGuidelines. Hopefully having clear
usage throughout the codebase will be enough to promote using consistent
terminology.

Thanks, -Stolee

Derrick Stolee (3):
  Documentation: use allowlist and denylist
  t/*: use allowlist
  *: use allowlist and denylist

 Documentation/git-cvsserver.txt |  2 +-
 Documentation/git-daemon.txt    | 10 +++++-----
 Documentation/git.txt           |  2 +-
 daemon.c                        |  8 ++++----
 git-cvsserver.perl              |  2 +-
 sha1dc/sha1.c                   | 12 ++++++------
 t/README                        |  4 ++--
 t/lib-proto-disable.sh          |  6 +++---
 t/t5812-proto-disable-http.sh   |  2 +-
 t/t5815-submodule-protos.sh     |  4 ++--
 t/t9400-git-cvsserver-server.sh |  2 +-
 t/test-lib-functions.sh         |  2 +-
 t/test-lib.sh                   |  2 +-
 transport.c                     |  8 ++++----
 14 files changed, 33 insertions(+), 33 deletions(-)


base-commit: e4a4b31577c7419497ac30cebe30d755b97752c5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1274%2Fderrickstolee%2Fallow-deny-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1274/derrickstolee/allow-deny-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1274
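
For readers who want to check the "(almost) all uses" claim in the cover letter against their own checkout, the scan below is a minimal sketch and not part of the series. The function name `find_legacy_terms`, the in-memory `files` mapping, and the `Documentation/RelNotes/` exemption prefix are assumptions made for illustration; the cover letter only says that release notes for older Git versions are excluded.

```python
import re

# Terms the series replaces; matched case-insensitively, which also
# catches inflected forms such as "whitelisted".
LEGACY_TERMS = re.compile(r"(?:white|black)list", re.IGNORECASE)

# Assumption: older release notes live under this prefix and are exempt.
EXEMPT_PREFIX = "Documentation/RelNotes/"


def find_legacy_terms(files):
    """Return (path, line_number, line) for each line using a legacy term.

    `files` maps a repository-relative path to that file's text.
    """
    hits = []
    for path, text in sorted(files.items()):
        if path.startswith(EXEMPT_PREFIX):
            continue  # skip exempt release notes
        for number, line in enumerate(text.splitlines(), start=1):
            if LEGACY_TERMS.search(line):
                hits.append((path, number, line.strip()))
    return hits
```

Taking the file contents as a mapping keeps the sketch independent of how the tree is read; a caller could fill it from a working copy in whatever way suits them.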

Comments

Johannes Schindelin July 13, 2022, 1:29 p.m. UTC | #1
Hi Stolee,

On Wed, 13 Jul 2022, Derrick Stolee via GitGitGadget wrote:

> The terms "allowlist" and "denylist" are self-defining. One "allows" things
> while the other "denies" things.
>
> These are better terms over "whitelist" and "blacklist" which require prior
> knowledge of the terms or cultural expectations around what each color
> "means".

I agree that "allowlist" and "denylist" are much better terms. As you say,
they are more obvious even for non-native speakers, but we do want to take
the cultural implications into account and be mindful to avoid needlessly
controversial language.

Therefore, I am very much in favor of this patch series: ACK!

Thank you for putting in the work to make this happen,
Dscho
Junio C Hamano July 13, 2022, 4:18 p.m. UTC | #2
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> The terms "allowlist" and "denylist" are self-defining. One "allows" things
> while the other "denies" things.
>
> These are better terms over "whitelist" and "blacklist" which require prior
> knowledge of the terms or cultural expectations around what each color
> "means".

Half Devil's advocate mode on, as I got up on the wrong side of the
bed this morning.

I am very much for consistent uses of allow/deny and I think it is a
good idea to review and apply this series.

But I'd prefer to see us more honest with ourselves.  Like it or not,
the code comments and documentation are targeted toward those who
can read English, and when you say something is whitelisted in
English, you know exactly what it means, due to shared knowledge of
historical use of the word.

We are doing this change in the name of inclusion.  I find it
intellectually dishonest to avoid saying that true reason, and
instead say the allow/deny pair is more "precise".  They are not
more precise.  In fact, the fact that you have to choose between deny
and block and defend deny over block shows that these words are less
precise.  People who use white/black do not have to choose between
black and other colors and say "white/red may be OK but we choose
black because..." to defend the choice of their words.

The reason we do this change is because the project thinks that it
is the right thing to encourage the adoption of these more inclusive
words, together with other projects that did the same.

In addition, they are words more widely accepted in today's world,
and new folks are more likely to be educated with these words.  As
time goes by, the historical white/black will be less understood, so
it makes it a future-proofing change, as well.
Derrick Stolee July 13, 2022, 6:33 p.m. UTC | #3
On 7/13/2022 12:18 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> The terms "allowlist" and "denylist" are self-defining. One "allows" things
>> while the other "denies" things.
>>
>> These are better terms over "whitelist" and "blacklist" which require prior
>> knowledge of the terms or cultural expectations around what each color
>> "means".
> 
> Half Devil's advocate mode on, as I got up on the wrong side of the
> bed this morning.
> 
> I am very much for consistent uses of allow/deny and I think it is a
> good idea to review and apply this series.
> 
> But I'd prefer to see us more honest with ourselves.  Like it or not,
> the code comments and documentation are targeted toward those who
> can read English, and when you say something is whitelisted in
> English, you know exactly what it means, due to shared knowledge of
> historical use of the word.
> 
> We are doing this change in the name of inclusion.  I find it
> intellectually dishonest to avoid saying that true reason, and
> instead say the allow/deny pair is more "precise".  They are not
> more precise.  In fact, the fact that you have to choose between deny
> and block and defend deny over block shows that these words are less
> precise.  People who use white/black do not have to choose between
> black and other colors and say "white/red may be OK but we choose
> black because..." to defend the choice of their words.
> 
> The reason we do this change is because the project thinks that it
> is the right thing to encourage the adoption of these more inclusive
> words, together with other projects that did the same.
> 
> In addition, they are words more widely accepted in today's world,
> and new folks are more likely to be educated with these words.  As
> time goes by, the historical white/black will be less understood, so
> it makes it a future-proofing change, as well.
 
I like how you start by saying you are playing devil's advocate,
but then go on to add more reasons to support the work. It's good
feedback to make the case stronger.

Thanks,
-Stolee
Ævar Arnfjörð Bjarmason July 13, 2022, 7:42 p.m. UTC | #4
On Wed, Jul 13 2022, Derrick Stolee via GitGitGadget wrote:

> The terms "allowlist" and "denylist" are self-defining. One "allows" things
> while the other "denies" things.

I've got a preference for things that can be found in widely available
dictionaries, these words seem to be tech neologisms.

The resulting wording also seems a bit awkward to me, e.g. we now say
that some tests are "allowlisted [...] as passing with no memory
leaks". Are we denying or allowing them to pass? No, they're going to
either pass or not.

So to me "whitelist" or "blacklist" is more natural when used in the
descriptive sense, whereas "allow" and "deny" are verbs, so that seems
to impart a sense of actively allowing or denying something.

> These are better terms over "whitelist" and "blacklist" which require prior
> knowledge of the terms or cultural expectations around what each color
> "means".

Apparently whitelist is defined in terms of blacklist, which per
Wikipedia originates in some 17th century play:
https://en.wikipedia.org/wiki/Blacklisting#Origins_of_the_term

> [...]
> Some recommend using "blocklist", but I personally prefer "denylist". To me,
> "blocking" something seems permanent. "Denying" something seems ephemeral
> and related to a specific request being denied due to some (possibly
> mutable) state. I'm open to suggestions here. There are many fewer
> replacements needed in this case.

I suspect the actual motivation is closer to that summarized in:
https://en.wikipedia.org/wiki/Whitelist#Controversy

Personally I'd really prefer if we didn't take these sort of changes,
and took the view that if something was readily understood that it was
good enough.

The CodingGuidelines note that we use a mix of US & UK English, so
forbidding certain words & basically requiring some of us to keep
abreast of the latest political trends in America is a bit too much. I'd
just like to write code, please...

> I did not make any change to our CodingGuidelines. Hopefully having clear
> usage throughout the codebase will be enough to promote using consistent
> terminology.

...particularly since I think what's being implied here is that we can
expect interested parties to be setting up the relevant E-Mail filters,
and asking patch submitters to change wording in the same way as this
series does.

We also have 30-40 uses of both terms in-tree, so it seems implausible
that people are mainly copying existing wording.

A few of the hunks here are changing docs I added, and I just added
those "naturally" (i.e. I happened to think of those words to describe
what I was trying to get across).
Ævar Arnfjörð Bjarmason July 13, 2022, 8:02 p.m. UTC | #5
On Wed, Jul 13 2022, Derrick Stolee via GitGitGadget wrote:

>  sha1dc/sha1.c                   | 12 ++++++------

Aside from anything else I've commented on: Please drop this part of the
change. If you'd like to change this take it up with upstream:
https://github.com/cr-marcstevens/sha1collisiondetection/

As a "git log" on "sha1dc sha1collisiondetection" will show, we try to
keep a 1:1 mapping to upstream with this code. This change would diverge
us from upstream, and diverge sha1dc from our own submodule copy in
"sha1collisiondetection/".
Junio C Hamano July 13, 2022, 8:32 p.m. UTC | #6
Derrick Stolee <derrickstolee@github.com> writes:

> I like how you start by saying you are playing devil's advocate,
> but then go on to add more reasons to support the work. It's good
> feedback to make the case stronger.

I may have offered alternatives, but I am not "adding more" reasons.
Junio C Hamano July 13, 2022, 10:28 p.m. UTC | #7
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Wed, Jul 13 2022, Derrick Stolee via GitGitGadget wrote:
>
>> The terms "allowlist" and "denylist" are self-defining. One "allows" things
>> while the other "denies" things.
>
> I've got a preference for things that can be found in widely available
> dictionaries, these words seem to be tech neologisms.

FWIW, I share the same.  I suspect that "whitelist" may be found in
more dictionaries than "allowlist".

e.g.

    https://www.merriam-webster.com/dictionary/allowlist
    https://www.merriam-webster.com/dictionary/whitelist

A statement "We have audience who are not native English speakers,
and may not share cultural background" may not be incorrect at all,
but that does not justify s/whitelist/allowlist/.  We end up with
sentences written with non-words that these non-natives cannot even
look up in a dictionary.

If we can rephrase without using these invented words, we should do
so, especially when the result becomes even easier to read than the
original that used "whitelist".  I've shown a few examples in my
other messages in this thread.

Thanks.
Derrick Stolee July 15, 2022, 2:25 a.m. UTC | #8
On 7/13/2022 6:28 PM, Junio C Hamano wrote:
> If we can rephrase without using these invented words, we should do
> so, especially when the result becomes even easier to read than the
> original that used "whitelist".  I've shown a few examples in my
> other messages in this thread.

Based on those examples, I agree that the best thing to do is to
rephrase to avoid the term altogether. This avoids confusion when
the reader does not know the term, as well as sometimes being more
consistent with the phrasing in the same document.

Thanks,
-Stolee