Message ID | 20190804031409.32764-1-carenas@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [RFC,v3] grep: treat PCRE2 jit compilation memory error as non fatal |
PROs:
* it works (only for PCRE2) and tested in OpenBSD, NetBSD, macOS, Linux (Debian)
* it applies everywhere (even pu) without conflicts
* it doesn't introduce any regressions in tests (tested in Debian with SELinux in enforcing mode)
* it is simple

CONs:
* HardenedBSD still segfaults (bugfix proposed[1] to sljit/pcre)
* warning is noisy (at least once per thread) and might be even ineffective as it goes to stderr while stdout with most the output goes to a pager
* too conservative (pcre2grep shows all errors from pcre2_jit_compile should be ignored)
* no tests

Known Issues:
* code is ugly (it even triggers a warning if you have the right compiler)
* code is suspiciously similar to one[2] that was rejected, but hopefully commit message is better
* code is incomplete (PCRE1 has too many conflicting changes in flight to attempt a similar fix)
* there are obvious blind spots in the tests that need fixing, and a lot more testing in other platforms/architectures
* git still will sometimes die because the non fast path has UTF-8 issues

I still think the pcre.jit flag knob might be useful to workaround some of the issues detailed in CONs but probably with a different definition:

  unset -> fallback (try JIT but use interpreter if that didn't work)
  false -> don't even try to use JIT
  true -> print warning and maybe even die (if we really think that is useful)

some performance numbers below for the perl tests

with JIT enabled (in non enforcing SELinux)

Test                                          this tree
---------------------------------------------------------------
7820.3: perl grep 'how.to'                    0.56(0.29+0.60)
7820.7: perl grep '^how to'                   0.49(0.29+0.54)
7820.11: perl grep '[how] to'                 0.54(0.39+0.51)
7820.15: perl grep '(e.t[^ ]*|v.ry) rare'     0.60(0.45+0.58)
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'        0.58(0.30+0.61)

with "fallback to interpreter" (in enforcing SELinux)

Test                                          this tree
---------------------------------------------------------------
7820.3: perl grep 'how.to'                    0.64(0.59+0.56)
7820.7: perl grep '^how to'                   1.83(2.91+0.56)
7820.11: perl grep '[how] to'                 2.07(3.33+0.61)
7820.15: perl grep '(e.t[^ ]*|v.ry) rare'     2.89(4.91+0.66)
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'        0.78(0.86+0.55)

[1] https://github.com/zherczeg/sljit/pull/2
[2] https://public-inbox.org/git/20181209230024.43444-3-carenas@gmail.com/
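The three-way definition proposed above for the pcre.jit knob can be sketched as a small decision function. This is only an illustration of the proposed semantics, not code from the patch: the enum names, `parse_jit_policy()` and `decide_jit_action()` are all hypothetical, and treating an unrecognized value like "unset" is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tri-state for the proposed pcre.jit knob. */
enum jit_policy {
	JIT_POLICY_FALLBACK, /* unset: try JIT, use the interpreter if that fails */
	JIT_POLICY_NEVER,    /* false: don't even try to use JIT */
	JIT_POLICY_REQUIRE   /* true: warn (and maybe die) if JIT cannot be used */
};

enum jit_action {
	ACTION_USE_JIT,
	ACTION_USE_INTERPRETER,
	ACTION_WARN_OR_DIE
};

/* Map a config value (NULL when the knob is unset) to a policy. */
enum jit_policy parse_jit_policy(const char *value)
{
	if (!value)
		return JIT_POLICY_FALLBACK;
	if (!strcmp(value, "true"))
		return JIT_POLICY_REQUIRE;
	if (!strcmp(value, "false"))
		return JIT_POLICY_NEVER;
	/* Assumption: anything else behaves like "unset". */
	return JIT_POLICY_FALLBACK;
}

/* jit_compiled_ok is nonzero when JIT compilation succeeded. */
enum jit_action decide_jit_action(enum jit_policy policy, int jit_compiled_ok)
{
	if (policy == JIT_POLICY_NEVER)
		return ACTION_USE_INTERPRETER;
	if (jit_compiled_ok)
		return ACTION_USE_JIT;
	if (policy == JIT_POLICY_REQUIRE)
		return ACTION_WARN_OR_DIE;
	return ACTION_USE_INTERPRETER;
}
```

With the knob unset, a failed JIT compilation yields `ACTION_USE_INTERPRETER`, which corresponds to the "fallback to interpreter" timings in the second table.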
Carlo Arenas <carenas@gmail.com> writes:

> * code is suspiciously similar to one[2] that was rejected, but
>   hopefully commit message is better
> ...
> [2] https://public-inbox.org/git/20181209230024.43444-3-carenas@gmail.com/

I do not recall ever rejecting that one. It did not come with a good proposed log message to be accepted as-is, so I do not find it surprising that I did not pick it up, was waiting for a new iteration and then everybody forgot about it. But that is quite different from getting rejected (with the connotation that "don't attempt this bad idea again, unless the world changes drastically").

In any case, this round looks a lot more reasoned.

I personally do not think the warning() is a good idea. As I said in the old discussion, we by default should treat JIT as a mere optimization, and we should stay out of the way most of the time.

An additional "must have JIT or we will die" [*1*] can be added on top of this change, if somebody really cares.

Thanks.

[Reference]

*1* https://public-inbox.org/git/87pnu9yekk.fsf@evledraar.gmail.com/
diff --git a/grep.c b/grep.c
index f7c3a5803e..593a1cb7a0 100644
--- a/grep.c
+++ b/grep.c
@@ -525,7 +525,13 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	if (p->pcre2_jit_on == 1) {
 		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
 		if (jitret)
-			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+			if (jitret == PCRE2_ERROR_NOMEMORY) {
+				warning("JIT couldn't be used in PCRE2");
+				p->pcre2_jit_on = 0;
+				return;
+			}
+			else
+				die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
 
 		/*
 		 * The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
94da9193a6 (grep: add support for PCRE v2, 2017-06-01) uses the JIT fast path unless JIT support has not been compiled in the linked library.

Starting from 10.23 of PCRE2, pcre2grep ignores any errors from pcre2_jit_compile as a workaround for their bug1749[1] and we should do too, so that the interpreter could be used as a fallback in cases where JIT was not available because of a security policy.

To be conservative, we are restricting initially the error to the known error that would be returned in that case (and to be documented as such in a future release of PCRE) and printing a warning so that corrective action could be taken.

[1] https://bugs.exim.org/show_bug.cgi?id=1749

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 grep.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
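The difference between the conservative handling described in this message and the pcre2grep behaviour (ignore every JIT compilation error) can be sketched as a pure classification of the return code. This is an illustration only and builds without PCRE2: `classify_jit_result()`, the `jit_outcome` enum and `SKETCH_ERROR_NOMEMORY` are hypothetical names, and the numeric value is assumed to match `PCRE2_ERROR_NOMEMORY` in `<pcre2.h>`.

```c
/*
 * Stand-in for PCRE2_ERROR_NOMEMORY so the sketch compiles without
 * <pcre2.h>; the numeric value is an assumption.
 */
#define SKETCH_ERROR_NOMEMORY (-48)

enum jit_outcome {
	JIT_OUTCOME_ENABLED,     /* JIT compiled, fast path can be used */
	JIT_OUTCOME_INTERPRETER, /* JIT unavailable, fall back to the interpreter */
	JIT_OUTCOME_FATAL        /* unexpected error, caller would die() */
};

/*
 * jitret is the value returned by the JIT compile step (0 on success).
 * ignore_all_errors selects the pcre2grep-style behaviour; when it is 0
 * only the "no memory" error is treated as non fatal.
 */
enum jit_outcome classify_jit_result(int jitret, int ignore_all_errors)
{
	if (!jitret) {
		return JIT_OUTCOME_ENABLED;
	}
	if (ignore_all_errors || jitret == SKETCH_ERROR_NOMEMORY) {
		return JIT_OUTCOME_INTERPRETER;
	}
	return JIT_OUTCOME_FATAL;
}
```

Writing the checks as separate braced statements also avoids nesting an `if`/`else` under an unbraced `if`, a shape that some compilers flag as an ambiguous `else`.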