diff mbox series

[v3,14/22] pickaxe -S: support content with NULs under --pickaxe-regex

Message ID patch-14.22-a86032792b6-20210412T170457Z-avarab@gmail.com (mailing list archive)
State New
Headers show
Series pickaxe: test and refactoring for future PCRE backend | expand

Commit Message

Ævar Arnfjörð Bjarmason April 12, 2021, 5:15 p.m. UTC
Fix a bug in the matching routine powering -S<rx> --pickaxe-regex so
that we won't abort early on content that has NULs in it.

We've had a hard requirement on REG_STARTEND since 2f8952250a8 (regex:
add regexec_buf() that can work on a non NUL-terminated string,
2016-09-21), but this sanity check dates back to d01d8c67828 (Support
for pickaxe matching regular expressions, 2006-03-29).

It wasn't needed anymore, and as the now-passing test shows, actively
getting in our way. Since we always require REG_STARTEND support we do
not need to stop at NULs. If we are dealing with a haystack with NUL
in it. The needle may be behind that NUL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 diffcore-pickaxe.c     | 4 ++--
 t/t4209-log-pickaxe.sh | 8 ++++++++
 2 files changed, 10 insertions(+), 2 deletions(-)
diff mbox series

Patch

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 88b6ca840f6..be0dd683b63 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -78,12 +78,12 @@  static unsigned int contains(mmfile_t *mf, regex_t *regexp, kwset_t kws)
 		regmatch_t regmatch;
 		int flags = 0;
 
-		while (sz && *data &&
+		while (sz &&
 		       !regexec_buf(regexp, data, sz, 1, &regmatch, flags)) {
 			flags |= REG_NOTBOL;
 			data += regmatch.rm_eo;
 			sz -= regmatch.rm_eo;
-			if (sz && *data && regmatch.rm_so == regmatch.rm_eo) {
+			if (sz && regmatch.rm_so == regmatch.rm_eo) {
 				data++;
 				sz--;
 			}
diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
index 3f9aad0fdb0..75795d0b492 100755
--- a/t/t4209-log-pickaxe.sh
+++ b/t/t4209-log-pickaxe.sh
@@ -215,4 +215,12 @@  test_expect_success 'log -S looks into binary files' '
 	test_cmp log full-log
 '
 
+test_expect_success 'log -S --pickaxe-regex looks into binary files' '
+	git -C GS-bin-txt log --pickaxe-regex -Sa >log &&
+	test_cmp log full-log &&
+
+	git -C GS-bin-txt log --pickaxe-regex -S"[a]" >log &&
+	test_cmp log full-log
+'
+
 test_done