[v3,6/6] banned.h: mark `strtok()` and `strtok_r()` as banned

Message ID da896aa358eab65f2629f85189c5be4ad9cec635.1682374789.git.me@ttaylorr.com
State Accepted
Commit 60ff56f50372c1498718938ef504e744fe011ffb
Series banned: mark `strtok()`, `strtok_r()` as banned

Commit Message

Taylor Blau April 24, 2023, 10:20 p.m. UTC
`strtok()` has a couple of drawbacks that make it undesirable to have
any new instances. In addition to being thread-unsafe, it also
encourages confusing data flows, where `strtok()` may be called from
multiple functions with its first argument as NULL, making it unclear
from the immediate context which string is being tokenized.

Now that we have removed all instances of `strtok()` from the tree,
let's ban `strtok()` to avoid introducing new ones in the future. If new
callers should arise, they are encouraged to use
`string_list_split_in_place()` (and `string_list_remove_empty_items()`,
if applicable).

string_list_split_in_place() is not a perfect drop-in replacement
for `strtok_r()`, particularly if the caller is processing a string with
an arbitrary number of tokens, and wants to process each token one at a
time.

But there are no instances of this in Git's tree which are more
well-suited to `strtok_r()` than the friendlier
`string_list_in_place()`, so ban `strtok_r()`, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 banned.h | 4 ++++
 1 file changed, 4 insertions(+)
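
To illustrate the data-flow problem the commit message describes, here is a minimal standalone sketch. It is not code from Git's tree, and both function names are hypothetical. The outer loop splits a line on commas while a helper it calls also uses `strtok()`, so the helper silently replaces the hidden cursor that the outer loop depends on.

```c
#include <string.h>

/*
 * Hypothetical helper: counts space-separated words in `field`.
 * Its strtok() calls overwrite the single hidden cursor shared by
 * every strtok() caller in the process.
 */
static int count_words(char *field)
{
	int n = 0;

	for (char *w = strtok(field, " "); w; w = strtok(NULL, " "))
		n++;
	return n;
}

/*
 * Hypothetical caller: intends to count comma-separated fields.
 * After count_words() returns, strtok(NULL, ",") resumes from the
 * helper's exhausted cursor instead of from `line`, so the loop
 * ends after the first field.
 */
static int count_fields_buggy(char *line)
{
	int fields = 0;

	for (char *f = strtok(line, ","); f; f = strtok(NULL, ",")) {
		fields++;
		count_words(f);
	}
	return fields;
}
```

Given a writable buffer holding "a b,c d,e f", `count_fields_buggy()` reports one field rather than three, and nothing at the outer call site hints at why.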

Comments

Chris Torek April 24, 2023, 10:25 p.m. UTC | #1
I eyeballed the whole thing (not the same as real tests of course)
and it looks good, but there's one typo (missing-word-o) here of note:

On Mon, Apr 24, 2023 at 3:20 PM Taylor Blau <me@ttaylorr.com> wrote:
...
> But there are no instances of this in Git's tree which are more
> well-suited to `strtok_r()` than the friendlier
> `string_list_in_place()`, so ban `strtok_r()`, too.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>

Missing `split_`. Probably not worth a re-roll...

Chris
Taylor Blau April 24, 2023, 11 p.m. UTC | #2
On Mon, Apr 24, 2023 at 03:25:55PM -0700, Chris Torek wrote:
> I eyeballed the whole thing (not the same as real tests of course)
> and it looks good, but there's one typo (missing-word-o) here of note:
>
> On Mon, Apr 24, 2023 at 3:20 PM Taylor Blau <me@ttaylorr.com> wrote:
> ...
> > But there are no instances of this in Git's tree which are more
> > well-suited to `strtok_r()` than the friendlier
> > `string_list_in_place()`, so ban `strtok_r()`, too.
> >
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
>
> Missing `split_`. Probably not worth a re-roll...

Oops, thanks for noticing.

Thanks,
Taylor
Jeff King April 25, 2023, 6:26 a.m. UTC | #3
On Mon, Apr 24, 2023 at 06:20:26PM -0400, Taylor Blau wrote:

> string_list_split_in_place() is not a perfect drop-in replacement
> for `strtok_r()`, particularly if the caller is processing a string with
> an arbitrary number of tokens, and wants to process each token one at a
> time.
> 
> But there are no instances of this in Git's tree which are more
> well-suited to `strtok_r()` than the friendlier
> `string_list_in_place()`, so ban `strtok_r()`, too.

For true incremental left-to-right parsing, strcspn() is probably a
better solution. We could mention that here in case anybody digs up the
commit after getting a "banned function" error.

I'm tempted to say that this thread could serve the same function, but
I'm not sure where people turn to for answers (I find searching the list
about as easy as "git log -S", but then I've invested a lot of effort in
my list archive tooling :) ).

I'm happy with it either way, though.

-Peff
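
For anyone who lands here after a "banned function" error, the strcspn() approach suggested above could look roughly like the sketch below. It is only one possible shape: `next_token()` and `count_tokens()` are hypothetical names, not functions in Git. Like `strtok_r()`, it skips runs of delimiters, but the cursor is explicit and the input string is never modified.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): find the next token at or
 * after `*cursor`. On success, store the token's start in `*tok` and
 * its length in `*len`, advance `*cursor` past the token, and return 1.
 * Return 0 when no tokens remain. Tokens are not NUL-terminated.
 */
static int next_token(const char **cursor, const char *delims,
		      const char **tok, size_t *len)
{
	const char *p = *cursor;

	p += strspn(p, delims);		/* skip leading delimiters */
	if (!*p)
		return 0;

	*tok = p;
	*len = strcspn(p, delims);	/* token runs up to the next delimiter */
	*cursor = p + *len;
	return 1;
}

/* Example caller: process tokens one at a time, left to right. */
static size_t count_tokens(const char *s, const char *delims)
{
	const char *tok;
	size_t len, n = 0;

	while (next_token(&s, delims, &tok, &len))
		n++;
	return n;
}
```

Because each token is reported as a pointer plus a length, a caller that needs a NUL-terminated copy has to make one itself. That is the main trade-off against the in-place splitting that `strtok_r()` and `string_list_split_in_place()` do.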
Taylor Blau April 25, 2023, 9:02 p.m. UTC | #4
On Tue, Apr 25, 2023 at 02:26:17AM -0400, Jeff King wrote:
> On Mon, Apr 24, 2023 at 06:20:26PM -0400, Taylor Blau wrote:
>
> > string_list_split_in_place() is not a perfect drop-in replacement
> > for `strtok_r()`, particularly if the caller is processing a string with
> > an arbitrary number of tokens, and wants to process each token one at a
> > time.
> >
> > But there are no instances of this in Git's tree which are more
> > well-suited to `strtok_r()` than the friendlier
> > `string_list_in_place()`, so ban `strtok_r()`, too.
>
> For true incremental left-to-right parsing, strcspn() is probably a
> better solution. We could mention that here in case anybody digs up the
> commit after getting a "banned function" error.
>
> I'm tempted to say that this thread could serve the same function, but
> I'm not sure where people turn to for answers (I find searching the list
> about as easy as "git log -S", but then I've invested a lot of effort in
> my list archive tooling :) ).

Personally, between what's already in the patch message and this
discussion on the list, I think that folks would have enough guidance on
how to do things right.

But if others feel like we could or should be more rigid here, we could
update the commit message to say something like "if you're scanning from
left-to-right, you could use strtok_r(), or strcspn() instead". TBH,
though, I think there are a gazillion different ways to do this task, so
I don't know that adding another item to that list substantially changes
things.

Thanks,
Taylor

Patch

diff --git a/banned.h b/banned.h
index 6ccf46bc19..44e76bd90a 100644
--- a/banned.h
+++ b/banned.h
@@ -18,6 +18,10 @@ 
 #define strncpy(x,y,n) BANNED(strncpy)
 #undef strncat
 #define strncat(x,y,n) BANNED(strncat)
+#undef strtok
+#define strtok(x,y) BANNED(strtok)
+#undef strtok_r
+#define strtok_r(x,y,z) BANNED(strtok_r)
 
 #undef sprintf
 #undef vsprintf
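
For readers unfamiliar with how these definitions produce an error, the standalone sketch below reproduces the mechanism outside of Git. The `BANNED()` definition is written from memory of banned.h and should be treated as an assumption. `first_field_len()`, `first_field()` and the `TRY_BANNED_CALL` switch are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Assumed to mirror the helper macro near the top of Git's banned.h. */
#define BANNED(func) sorry_##func##_is_a_banned_function

/* Must come after <string.h>, so the real declaration is left intact. */
#undef strtok
#define strtok(x,y) BANNED(strtok)

/* Unaffected: strcspn() is not on the banned list. */
static size_t first_field_len(const char *s)
{
	return strcspn(s, ",");
}

#ifdef TRY_BANNED_CALL
/*
 * Building with -DTRY_BANNED_CALL should fail: the call below expands
 * to the undeclared identifier sorry_strtok_is_a_banned_function, so
 * the compiler's diagnostic names the banned function directly.
 */
static char *first_field(char *s)
{
	return strtok(s, ",");
}
#endif
```

The function-like macro swallows the arguments, so the compiler sees only the self-describing identifier.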