diff mbox series

[v2,4/4] parse_commit(): describe more date-parsing failure modes

Message ID 20230425055519.GD4015649@coredump.intra.peff.net (mailing list archive)
State Superseded
Headers show
Series fixing some parse_commit() timestamp corner cases | expand

Commit Message

Jeff King April 25, 2023, 5:55 a.m. UTC
The previous few commits improved the parsing of dates in malformed
commit objects. But there's one big case left implicit: we may still
feed garbage to parse_timestamp(). This is preferable to trying to be
more strict, but let's document the thinking in a comment.

Signed-off-by: Jeff King <peff@peff.net>
---
Obviously not strictly necessary, but it felt like this was worth
pointing out, and it felt weird to shove it into patch 3.

 commit.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
diff mbox series

Patch

diff --git a/commit.c b/commit.c
index 2f1b5d505b..2ac141e198 100644
--- a/commit.c
+++ b/commit.c
@@ -136,6 +136,16 @@  static timestamp_t parse_commit_date(const char *buf, const char *tail)
 	/*
 	 * We know there is at least one non-whitespace character, so we'll
 	 * begin parsing there and stop at worst case at eol.
+	 *
+	 * Note that we may feed parse_timestamp() non-digit garbage here if
+	 * the commit is malformed. If we see a totally unexpected token like
+	 * "foo" here, it will return 0, which is our error/sentinel value
+	 * anyway. For something like "123foo456", it will parse as far as it
+	 * can. That might be questionable (versus returning "0"), but it would
+	 * help in a hypothetical case like "123456+0100", where the whitespace
+	 * from the timezone is missing. Since such syntactic errors may be
+	 * baked into history and hard to correct now, let's err on trying to
+	 * make our best guess here, rather than insist on perfect syntax.
 	 */
 	return parse_timestamp(dateptr, NULL, 10);
 }