[RFC,1/2] grep: fallback to interpreter if JIT fails with pcre1
diff mbox series

Message ID 20181209230024.43444-2-carenas@gmail.com
State New
Headers show
Series
  • fallback to interpreter if JIT fails with pcre
Related show

Commit Message

Carlo Marcelo Arenas Belón Dec. 9, 2018, 11 p.m. UTC
JIT support was added to 8.20 but the interface we rely on is only
enabled after 8.32 so try to make the message clearer.

in systems where there are restrictions against creating executable
pages programatically (like OpenBSD, NetBSD, macOS or seLinux) JIT
will fail, resulting in a error message to the user.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 Makefile | 12 ++++++------
 grep.c   |  6 ++++++
 2 files changed, 12 insertions(+), 6 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Dec. 9, 2018, 11:51 p.m. UTC | #1
On Sun, Dec 09 2018, Carlo Marcelo Arenas Belón wrote:

[+CC pcre-dev]

> JIT support was added to 8.20 but the interface we rely on is only
> enabled after 8.32 so try to make the message clearer.
>
> in systems where there are restrictions against creating executable
> pages programatically (like OpenBSD, NetBSD, macOS or seLinux) JIT
> will fail, resulting in a error message to the user.
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  Makefile | 12 ++++++------
>  grep.c   |  6 ++++++
>  2 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 1a44c811aa..62b0cb6ee6 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -32,14 +32,14 @@ all::
>  # USE_LIBPCRE is a synonym for USE_LIBPCRE2, define USE_LIBPCRE1
>  # instead if you'd like to use the legacy version 1 of the PCRE
>  # library. Support for version 1 will likely be removed in some future
> -# release of Git, as upstream has all but abandoned it.
> +# release of Git, as upstream is focusing all development for new
> +# features in the newer version instead.

I think whatever we do here it makes sense to split this into its own
patch, since it doesn't have to do with this fallback mechanism.

FWIW I was trying to word this in some way that very briefly described
the v1 v.s. v2 situation. Just saying "new features" doesn't quite
capture it, e.g. some bugs in v1 are closed with some resolution like
"this isn't trivial to fix, use v2 instead".

>  # When using USE_LIBPCRE1, define NO_LIBPCRE1_JIT if the PCRE v1
> -# library is compiled without --enable-jit. We will auto-detect
> -# whether the version of the PCRE v1 library in use has JIT support at
> -# all, but we unfortunately can't auto-detect whether JIT support
> -# hasn't been compiled in in an otherwise JIT-supporting version. If
> -# you have link-time errors about a missing `pcre_jit_exec` define
> +# library is newer than 8.32 but compiled without --enable-jit or
> +# you want to disable JIT
> +#
> +# If you have link-time errors about a missing `pcre_jit_exec` define
>  # this, or recompile PCRE v1 with --enable-jit.
>  #
>  # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are
> diff --git a/grep.c b/grep.c
> index 4db1510d16..5ccc0421a1 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -405,6 +405,12 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
>  		die("%s", error);
>
>  #ifdef GIT_PCRE1_USE_JIT
> +	if (p->pcre1_extra_info &&
> +		!(p->pcre1_extra_info->flags & PCRE_EXTRA_EXECUTABLE_JIT)) {
> +		/* JIT failed so fallback to the interpreter */
> +		p->pcre1_jit_on = 0;
> +		return;
> +	}

Obviously this & what you have in 2/2 needs to be fixed in some way.

Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
the like work on those setup, or not? I.e. is this something upstream
can/is likely to fix eventually?

Are there cases where we can JIT, but fail for some entirely unrelated
reason, and are now hiding the error?

Are we mixing a condition where one some OS's or OS versions this just
won't work at all, and thus maybe should be something turned on in
config.mak.uname, v.s. e.g. SELinux where presumably it'll dynamically
change.

I'm inclined to suggest that we should have another ifdef here for "if
JIT fails I'd like it to die", so that e.g. packages I build (for
internal use) don't silently slow down in the future, only for me to
find some months later that someone enabled an overzealous SELinux
policy and we swept this under the rug.

But maybe that's just dumb for some reason and we always need to do this
dynamically...
brian m. carlson Dec. 10, 2018, 12:42 a.m. UTC | #2
On Mon, Dec 10, 2018 at 12:51:01AM +0100, Ævar Arnfjörð Bjarmason wrote:
> Obviously this & what you have in 2/2 needs to be fixed in some way.
> 
> Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
> the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
> the like work on those setup, or not? I.e. is this something upstream
> can/is likely to fix eventually?

From the cover letter (but without testing), it seems like it would
probably be fine to first map the pages read-write to write the code and
then, once that's done, to map them read-executable. I know JIT
compilation does work on the BSDs, so presumably that's the technique to
make it do so.

Both versions of PCRE map pages both write and executable at the same
time, which is presumably where things go wrong. I assume it can be
fixed, but whether that's easy in the context of PCRE, I wouldn't know.

> Are we mixing a condition where one some OS's or OS versions this just
> won't work at all, and thus maybe should be something turned on in
> config.mak.uname, v.s. e.g. SELinux where presumably it'll dynamically
> change.

Considering that some Linux users use PaX kernels with standard
distributions and that most BSD kernels can be custom-compiled with a
variety of options enabled or disabled, I think this is something we
should detect dynamically.

> I'm inclined to suggest that we should have another ifdef here for "if
> JIT fails I'd like it to die", so that e.g. packages I build (for
> internal use) don't silently slow down in the future, only for me to
> find some months later that someone enabled an overzealous SELinux
> policy and we swept this under the rug.

My view is that JIT is a nice performance optimization, but it's
optional. I honestly don't think it should even be exposed through the
API: if it works, then things are faster, and if it doesn't, then
they're not. I don't see the value in an option for causing things to be
broken if someone improves the security of the system.
Carlo Marcelo Arenas Belón Dec. 10, 2018, 1:25 a.m. UTC | #3
On Sun, Dec 9, 2018 at 4:42 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On Mon, Dec 10, 2018 at 12:51:01AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > Obviously this & what you have in 2/2 needs to be fixed in some way.
> >
> > Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
> > the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
> > the like work on those setup, or not? I.e. is this something upstream
> > can/is likely to fix eventually?
>
> From the cover letter (but without testing), it seems like it would
> probably be fine to first map the pages read-write to write the code and
> then, once that's done, to map them read-executable. I know JIT
> compilation does work on the BSDs, so presumably that's the technique to
> make it do so.

and that has been implemented (sljitProtExecAllocator.c) as part of the work
triggered by the bug I linked about [1], deep inside sljit (which is
what pcre uses for JIT)

the code AS-IS wouldn't compile for the BSD[2] but that is easy to fix
and sure works as expected but I am under the impression that is not
something that can be considered as a solution as explained by the open issues
described with crashes after a fork()

note that changing the map from read-write to executable will be
prevented by the
same policy so you have to create 2 maps (and therefore a backing file) and I
don't think there is a way to solve that in a foolproof way in a
library which is why
I mentioned more work might be needed to define the right interfaces
so it can be
solved by the application, and that is unlikely to happen with the old library.

Carlo

[1] https://bugs.exim.org/show_bug.cgi?id=1749
[2] https://bugs.exim.org/show_bug.cgi?id=2155
Junio C Hamano Dec. 10, 2018, 6:42 a.m. UTC | #4
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> Considering that some Linux users use PaX kernels with standard
> distributions and that most BSD kernels can be custom-compiled with a
> variety of options enabled or disabled, I think this is something we
> should detect dynamically.
> ...
> My view is that JIT is a nice performance optimization, but it's
> optional. I honestly don't think it should even be exposed through the
> API: if it works, then things are faster, and if it doesn't, then
> they're not. I don't see the value in an option for causing things to be
> broken if someone improves the security of the system.

I agree to both of these two points.  Thanks for a thoughtful
discussion, all.
Junio C Hamano Dec. 10, 2018, 6:48 a.m. UTC | #5
Carlo Marcelo Arenas Belón  <carenas@gmail.com> writes:

> JIT support was added to 8.20 but the interface we rely on is only
> enabled after 8.32 so try to make the message clearer.

Could you add some word before 8.20 and 8.32 (e.g. "pcre library
version 8.20", if that is what you are referring to, and if 8.32 is
also taken from the same context, adding such phrase to clarify the
context only to 8.20 is sufficient)?

> in systems where there are restrictions against creating executable
> pages programatically (like OpenBSD, NetBSD, macOS or seLinux) JIT
> will fail, resulting in a error message to the user.
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---

The change to the code looked sensible.
Junio C Hamano Dec. 10, 2018, 6:54 a.m. UTC | #6
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> @@ -32,14 +32,14 @@ all::
>>  # USE_LIBPCRE is a synonym for USE_LIBPCRE2, define USE_LIBPCRE1
>>  # instead if you'd like to use the legacy version 1 of the PCRE
>>  # library. Support for version 1 will likely be removed in some future
>> -# release of Git, as upstream has all but abandoned it.
>> +# release of Git, as upstream is focusing all development for new
>> +# features in the newer version instead.
>
> I think whatever we do here it makes sense to split this into its own
> patch, since it doesn't have to do with this fallback mechanism.
>
> FWIW I was trying to word this in some way that very briefly described
> the v1 v.s. v2 situation. Just saying "new features" doesn't quite
> capture it, e.g. some bugs in v1 are closed with some resolution like
> "this isn't trivial to fix, use v2 instead".

I actually think we should remove the paragraph that says "Support
for version 1 will likely to be removed...", as I do not see how it
helps the users.  If they have both available, on the day they hear
that we are planning to remove pcre1 support and realize that they
need to plan to upgrade to the first version of Git that drops pcre1
support, they can switch to pcre2 just fine, so telling them about
our future that we do not even know definite plan yet does not help
them.  It is quite unlikely, given that "upstream has all but
abandoned it", that there only is pcre1 support without pcre2 for an
obscure platform, and even if there are such platforms, the users on
them won't have much choice.  Discouraging them from using pcre1
would not help them make better choices anyway.
Ævar Arnfjörð Bjarmason Dec. 10, 2018, 8:24 a.m. UTC | #7
On Mon, Dec 10 2018, brian m. carlson wrote:

> On Mon, Dec 10, 2018 at 12:51:01AM +0100, Ævar Arnfjörð Bjarmason wrote:
>> Obviously this & what you have in 2/2 needs to be fixed in some way.
>>
>> Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
>> the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
>> the like work on those setup, or not? I.e. is this something upstream
>> can/is likely to fix eventually?
>
> From the cover letter (but without testing), it seems like it would
> probably be fine to first map the pages read-write to write the code and
> then, once that's done, to map them read-executable. I know JIT
> compilation does work on the BSDs, so presumably that's the technique to
> make it do so.
>
> Both versions of PCRE map pages both write and executable at the same
> time, which is presumably where things go wrong. I assume it can be
> fixed, but whether that's easy in the context of PCRE, I wouldn't know.
>
>> Are we mixing a condition where one some OS's or OS versions this just
>> won't work at all, and thus maybe should be something turned on in
>> config.mak.uname, v.s. e.g. SELinux where presumably it'll dynamically
>> change.
>
> Considering that some Linux users use PaX kernels with standard
> distributions and that most BSD kernels can be custom-compiled with a
> variety of options enabled or disabled, I think this is something we
> should detect dynamically.

Right. I'm asking whether we're mixing up cases where it can always be
detected at compile-time on some systems v.s. cases where it'll
potentially change at runtime.

>> I'm inclined to suggest that we should have another ifdef here for "if
>> JIT fails I'd like it to die", so that e.g. packages I build (for
>> internal use) don't silently slow down in the future, only for me to
>> find some months later that someone enabled an overzealous SELinux
>> policy and we swept this under the rug.
>
> My view is that JIT is a nice performance optimization, but it's
> optional. I honestly don't think it should even be exposed through the
> API: if it works, then things are faster, and if it doesn't, then
> they're not. I don't see the value in an option for causing things to be
> broken if someone improves the security of the system.

For many users that's definitely the case, but for others that's like
saying a RDBMS is still going to be functional if the "ORDER BY"
function degrades to bubblesort. The JIT improves performance my
multi-hundred percents sometimes, so some users (e.g. me) rely on that
not being silently degraded.

So I'm wondering if we can have something like:

    if (!jit)
        if (must_have_jit)
            BUG(...); // Like currently
        else
            fallback(); // new behavior
Carlo Marcelo Arenas Belón Dec. 11, 2018, 8:11 p.m. UTC | #8
On Mon, Dec 10, 2018 at 12:24 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Mon, Dec 10 2018, brian m. carlson wrote:
> > Considering that some Linux users use PaX kernels with standard
> > distributions and that most BSD kernels can be custom-compiled with a
> > variety of options enabled or disabled, I think this is something we
> > should detect dynamically.
>
> Right. I'm asking whether we're mixing up cases where it can always be
> detected at compile-time on some systems v.s. cases where it'll
> potentially change at runtime.

the closer we come to a system specific issues is with macOS where the
compiler (in some newer versions) is allocating the memory using the
MAP_JIT flag, which seems was originally meant to be only used in iOS
and has the strange characteristic of failing the mmap for versions
older than 10.14 if it was called more than once.

IMHO as brian pointed out, this is better done at runtime.

> >> I'm inclined to suggest that we should have another ifdef here for "if
> >> JIT fails I'd like it to die", so that e.g. packages I build (for
> >> internal use) don't silently slow down in the future, only for me to
> >> find some months later that someone enabled an overzealous SELinux
> >> policy and we swept this under the rug.
> >
> > My view is that JIT is a nice performance optimization, but it's
> > optional. I honestly don't think it should even be exposed through the
> > API: if it works, then things are faster, and if it doesn't, then
> > they're not. I don't see the value in an option for causing things to be
> > broken if someone improves the security of the system.
>
> For many users that's definitely the case, but for others that's like
> saying a RDBMS is still going to be functional if the "ORDER BY"
> function degrades to bubblesort. The JIT improves performance my
> multi-hundred percents sometimes, so some users (e.g. me) rely on that
> not being silently degraded.

the opposite is also true, specially considering that some old
versions of pcre result in a segfault instead of an error message and
therefore since there is no way to disable JIT, the only option left
is not to use `git grep -P` (or the equivalent git log call)

> So I'm wondering if we can have something like:
>
>     if (!jit)
>         if (must_have_jit)
>             BUG(...); // Like currently
>         else
>             fallback(); // new behavior

I am wondering if something like a `git doctor` command might be an
interesting alternative to this.

This way we could (for ex: in NetBSD) give the user a hint of what to
do to make their git grep -P faster when we detect we are running the
fallback, and might be useful as well to provide hints for
optimizations that could be used in other cases (probably even
depending on the size of the git repository)

For your use case, you just need to add a crontab that will trigger an
alarm if this command ever mentions PCRE

Carlo
Ævar Arnfjörð Bjarmason Dec. 11, 2018, 8:51 p.m. UTC | #9
On Tue, Dec 11 2018, Carlo Arenas wrote:

> On Mon, Dec 10, 2018 at 12:24 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> On Mon, Dec 10 2018, brian m. carlson wrote:
>> > Considering that some Linux users use PaX kernels with standard
>> > distributions and that most BSD kernels can be custom-compiled with a
>> > variety of options enabled or disabled, I think this is something we
>> > should detect dynamically.
>>
>> Right. I'm asking whether we're mixing up cases where it can always be
>> detected at compile-time on some systems v.s. cases where it'll
>> potentially change at runtime.
>
> the closer we come to a system specific issues is with macOS where the
> compiler (in some newer versions) is allocating the memory using the
> MAP_JIT flag, which seems was originally meant to be only used in iOS
> and has the strange characteristic of failing the mmap for versions
> older than 10.14 if it was called more than once.
>
> IMHO as brian pointed out, this is better done at runtime.

Sure. Just something I was wondering since it wasn't clear from the
patch. Makes sense, if it's always runtime (or not worth the effort to
divide the two) let's do that.

>> >> I'm inclined to suggest that we should have another ifdef here for "if
>> >> JIT fails I'd like it to die", so that e.g. packages I build (for
>> >> internal use) don't silently slow down in the future, only for me to
>> >> find some months later that someone enabled an overzealous SELinux
>> >> policy and we swept this under the rug.
>> >
>> > My view is that JIT is a nice performance optimization, but it's
>> > optional. I honestly don't think it should even be exposed through the
>> > API: if it works, then things are faster, and if it doesn't, then
>> > they're not. I don't see the value in an option for causing things to be
>> > broken if someone improves the security of the system.
>>
>> For many users that's definitely the case, but for others that's like
>> saying a RDBMS is still going to be functional if the "ORDER BY"
>> function degrades to bubblesort. The JIT improves performance my
>> multi-hundred percents sometimes, so some users (e.g. me) rely on that
>> not being silently degraded.
>
> the opposite is also true, specially considering that some old
> versions of pcre result in a segfault instead of an error message and
> therefore since there is no way to disable JIT, the only option left
> is not to use `git grep -P` (or the equivalent git log call)

Right, of course it segfaulting is a bug...

>> So I'm wondering if we can have something like:
>>
>>     if (!jit)
>>         if (must_have_jit)
>>             BUG(...); // Like currently
>>         else
>>             fallback(); // new behavior
>
> I am wondering if something like a `git doctor` command might be an
> interesting alternative to this.
>
> This way we could (for ex: in NetBSD) give the user a hint of what to
> do to make their git grep -P faster when we detect we are running the
> fallback, and might be useful as well to provide hints for
> optimizations that could be used in other cases (probably even
> depending on the size of the git repository)

Such a command has been discussed before on-list. I think it's a good
idea for the more fuzzy things like optimization suggests, but for the
case of expecting something at compile-time where not having that at
runtime is a boolean state it's nicer to just die with a BUG(...).

> For your use case, you just need to add a crontab that will trigger an
> alarm if this command ever mentions PCRE

...The reason I'd like it to die is because it neatly and naturally
integrates with all existing test infrastructure I have. I.e. build
package, run stress tests on all sorts of machines, see that it passes,
and SELinux isn't ruining it or whatever.

I know this works now (I always get PCRE v2 JIT) because it doesn't die
or segfault. I'd like not to have it regress to having worse
performance.

Having a cronjob to test for "does PCRE v2 JIT still work?" is not as
easy & isn't a drop-in solution.

Patch
diff mbox series

diff --git a/Makefile b/Makefile
index 1a44c811aa..62b0cb6ee6 100644
--- a/Makefile
+++ b/Makefile
@@ -32,14 +32,14 @@  all::
 # USE_LIBPCRE is a synonym for USE_LIBPCRE2, define USE_LIBPCRE1
 # instead if you'd like to use the legacy version 1 of the PCRE
 # library. Support for version 1 will likely be removed in some future
-# release of Git, as upstream has all but abandoned it.
+# release of Git, as upstream is focusing all development for new
+# features in the newer version instead.
 #
 # When using USE_LIBPCRE1, define NO_LIBPCRE1_JIT if the PCRE v1
-# library is compiled without --enable-jit. We will auto-detect
-# whether the version of the PCRE v1 library in use has JIT support at
-# all, but we unfortunately can't auto-detect whether JIT support
-# hasn't been compiled in in an otherwise JIT-supporting version. If
-# you have link-time errors about a missing `pcre_jit_exec` define
+# library is newer than 8.32 but compiled without --enable-jit or
+# you want to disable JIT
+#
+# If you have link-time errors about a missing `pcre_jit_exec` define
 # this, or recompile PCRE v1 with --enable-jit.
 #
 # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are
diff --git a/grep.c b/grep.c
index 4db1510d16..5ccc0421a1 100644
--- a/grep.c
+++ b/grep.c
@@ -405,6 +405,12 @@  static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 		die("%s", error);
 
 #ifdef GIT_PCRE1_USE_JIT
+	if (p->pcre1_extra_info &&
+		!(p->pcre1_extra_info->flags & PCRE_EXTRA_EXECUTABLE_JIT)) {
+		/* JIT failed so fallback to the interpreter */
+		p->pcre1_jit_on = 0;
+		return;
+	}
 	pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
 	if (p->pcre1_jit_on == 1) {
 		p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024);