Message ID | 20221219085336.391225-1-inseob@google.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Jason Zaman |
Headers | show |
Series | libselinux: Workaround for heap overhead of pcre | expand |
On Mon, 19 Dec 2022 at 09:53, Inseob Kim <inseob@google.com> wrote: > > pcre's behavior is changed so that pcre2_match always allocates heap for > match_data, rather than stack, regardless of size. The heap isn't freed > until explicitly calling pcre2_match_data_free. This new behavior may > result in heap overhead, which may increase the peak memory usage about > a few megabytes. It's because regex_match is first called for regex_data > objects, and then regex_data objects are freed at once. This approach trades peak memory usage for temporary allocations, which effects runtime performance. On modern systems memory is most of the time not a scarce resource. Some examples: # selabel_lookup -b file -k /etc/shadow -t file [heaptrack] ## current total runtime: 0.07s. calls to allocation functions: 28420 (406000/s) temporary memory allocations: 16 (228/s) peak heap memory consumption: 10.09M peak RSS (including heaptrack overhead): 21.27M total memory leaked: 1.02K ## proposed total runtime: 0.06s. calls to allocation functions: 23430 (366093/s) temporary memory allocations: 675 (10546/s) peak heap memory consumption: 9.48M peak RSS (including heaptrack overhead): 18.59M total memory leaked: 1.02K # restorecon -vRn /etc [heaptrack] ## current total runtime: 0.14s. calls to allocation functions: 33873 (236874/s) temporary memory allocations: 1877 (13125/s) peak heap memory consumption: 10.09M peak RSS (including heaptrack overhead): 21.58M total memory leaked: 1.90K ## proposed total runtime: 0.27s. calls to allocation functions: 378762 (1423917/s) temporary memory allocations: 351487 (1321379/s) peak heap memory consumption: 9.48M peak RSS (including heaptrack overhead): 20.99M total memory leaked: 1.90K # restorecon -vRn /usr [hyperfine] ## current restorecon -vRn /usr Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr Time (mean ± σ): 24.419 s ± 0.661 s [User: 23.480 s, System: 0.922 s] Range (min … max): 23.399 s … 25.495 s 10 runs ## proposed restorecon -vRn /usr Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr Time (mean ± σ): 28.628 s ± 0.968 s [User: 27.688 s, System: 0.927 s] Range (min … max): 27.674 s … 30.798 s 10 runs So I would argue the performance impact for applications (like setfiles, restorecon) or daemon (like systemd, udev) is more critical than the 500K per application. > To workaround it, free and reallocate match_data whenever we call > regex_match. It's fine because libselinux currently doesn't use > match_data, but use only the return value. > > Signed-off-by: Inseob Kim <inseob@google.com> > --- > libselinux/src/regex.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/libselinux/src/regex.c b/libselinux/src/regex.c > index 149a7973..2df282f1 100644 > --- a/libselinux/src/regex.c > +++ b/libselinux/src/regex.c > @@ -213,10 +213,20 @@ void regex_data_free(struct regex_data *regex) > int regex_match(struct regex_data *regex, char const *subject, int partial) > { > int rc; > + pcre2_match_data *new_match_data; > __pthread_mutex_lock(®ex->match_mutex); > + new_match_data = pcre2_match_data_create_from_pattern( > + regex->regex, NULL); Should be checked for failure (cause pcre2_match() expects a non-NULL match_data, which would be passed the second time). Also with this change the member match_data of the struct regex_data becomes obsolete and should be removed. > rc = pcre2_match( > regex->regex, (PCRE2_SPTR)subject, PCRE2_ZERO_TERMINATED, 0, > partial ? PCRE2_PARTIAL_SOFT : 0, regex->match_data, NULL); > + // pcre2_match allocates heap and it won't be freed until > + // pcre2_match_data_free, resulting in heap overhead. > + // Reallocate match_data to prevent such overhead, whenever possible. > + if (new_match_data) { > + pcre2_match_data_free(regex->match_data); > + regex->match_data = new_match_data; > + } > __pthread_mutex_unlock(®ex->match_mutex); > if (rc > 0) > return REGEX_MATCH; > -- > 2.39.0.314.g84b9a713c41-goog >
On Wed, Dec 21, 2022 at 12:02 AM Christian Göttsche <cgzones@googlemail.com> wrote: > > On Mon, 19 Dec 2022 at 09:53, Inseob Kim <inseob@google.com> wrote: > > > > pcre's behavior is changed so that pcre2_match always allocates heap for > > match_data, rather than stack, regardless of size. The heap isn't freed > > until explicitly calling pcre2_match_data_free. This new behavior may > > result in heap overhead, which may increase the peak memory usage about > > a few megabytes. It's because regex_match is first called for regex_data > > objects, and then regex_data objects are freed at once. > > This approach trades peak memory usage for temporary allocations, > which effects runtime performance. On modern systems memory is most > of the time not a scarce resource. > > Some examples: > > # selabel_lookup -b file -k /etc/shadow -t file [heaptrack] > > ## current > > total runtime: 0.07s. > calls to allocation functions: 28420 (406000/s) > temporary memory allocations: 16 (228/s) > peak heap memory consumption: 10.09M > peak RSS (including heaptrack overhead): 21.27M > total memory leaked: 1.02K > > ## proposed > > total runtime: 0.06s. > calls to allocation functions: 23430 (366093/s) > temporary memory allocations: 675 (10546/s) > peak heap memory consumption: 9.48M > peak RSS (including heaptrack overhead): 18.59M > total memory leaked: 1.02K > > # restorecon -vRn /etc [heaptrack] > > ## current > > total runtime: 0.14s. > calls to allocation functions: 33873 (236874/s) > temporary memory allocations: 1877 (13125/s) > peak heap memory consumption: 10.09M > peak RSS (including heaptrack overhead): 21.58M > total memory leaked: 1.90K > > ## proposed > > total runtime: 0.27s. > calls to allocation functions: 378762 (1423917/s) > temporary memory allocations: 351487 (1321379/s) > peak heap memory consumption: 9.48M > peak RSS (including heaptrack overhead): 20.99M > total memory leaked: 1.90K > > > # restorecon -vRn /usr [hyperfine] > > ## current > > restorecon -vRn /usr > Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr > Time (mean ± σ): 24.419 s ± 0.661 s [User: 23.480 s, System: 0.922 s] > Range (min … max): 23.399 s … 25.495 s 10 runs > > ## proposed > > restorecon -vRn /usr > Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr > Time (mean ± σ): 28.628 s ± 0.968 s [User: 27.688 s, System: 0.927 s] > Range (min … max): 27.674 s … 30.798 s 10 runs > > > So I would argue the performance impact for applications (like > setfiles, restorecon) or daemon (like systemd, udev) is more critical > than the 500K per application. I observed about 3~4MB increase on Android device. Which pcre2 version are you using? Does it include https://github.com/PCRE2Project/pcre2/commit/d90fb238#diff-15ec3f4ed916f52c810daf305702985dda6d8d45e7ce22e2f309c95bd6ef32b7R74 ? And if this is difficult to apply, how about adding a new flag e.g. AGGRESSIVE_FREE_AFTER_REGEX_MATCH ? > > > To workaround it, free and reallocate match_data whenever we call > > regex_match. It's fine because libselinux currently doesn't use > > match_data, but use only the return value. > > > > Signed-off-by: Inseob Kim <inseob@google.com> > > --- > > libselinux/src/regex.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/libselinux/src/regex.c b/libselinux/src/regex.c > > index 149a7973..2df282f1 100644 > > --- a/libselinux/src/regex.c > > +++ b/libselinux/src/regex.c > > @@ -213,10 +213,20 @@ void regex_data_free(struct regex_data *regex) > > int regex_match(struct regex_data *regex, char const *subject, int partial) > > { > > int rc; > > + pcre2_match_data *new_match_data; > > __pthread_mutex_lock(®ex->match_mutex); > > + new_match_data = pcre2_match_data_create_from_pattern( > > + regex->regex, NULL); > > Should be checked for failure (cause pcre2_match() expects a non-NULL > match_data, which would be passed the second time). > > Also with this change the member match_data of the struct regex_data > becomes obsolete and should be removed. Thanks, this makes sense. > > > rc = pcre2_match( > > regex->regex, (PCRE2_SPTR)subject, PCRE2_ZERO_TERMINATED, 0, > > partial ? PCRE2_PARTIAL_SOFT : 0, regex->match_data, NULL); > > + // pcre2_match allocates heap and it won't be freed until > > + // pcre2_match_data_free, resulting in heap overhead. > > + // Reallocate match_data to prevent such overhead, whenever possible. > > + if (new_match_data) { > > + pcre2_match_data_free(regex->match_data); > > + regex->match_data = new_match_data; > > + } > > __pthread_mutex_unlock(®ex->match_mutex); > > if (rc > 0) > > return REGEX_MATCH; > > -- > > 2.39.0.314.g84b9a713c41-goog > >
On Wed, 21 Dec 2022 at 03:53, Inseob Kim <inseob@google.com> wrote: > > On Wed, Dec 21, 2022 at 12:02 AM Christian Göttsche > <cgzones@googlemail.com> wrote: > > > > On Mon, 19 Dec 2022 at 09:53, Inseob Kim <inseob@google.com> wrote: > > > > > > pcre's behavior is changed so that pcre2_match always allocates heap for > > > match_data, rather than stack, regardless of size. The heap isn't freed > > > until explicitly calling pcre2_match_data_free. This new behavior may > > > result in heap overhead, which may increase the peak memory usage about > > > a few megabytes. It's because regex_match is first called for regex_data > > > objects, and then regex_data objects are freed at once. > > > > This approach trades peak memory usage for temporary allocations, > > which effects runtime performance. On modern systems memory is most > > of the time not a scarce resource. > > > > Some examples: > > > > # selabel_lookup -b file -k /etc/shadow -t file [heaptrack] > > > > ## current > > > > total runtime: 0.07s. > > calls to allocation functions: 28420 (406000/s) > > temporary memory allocations: 16 (228/s) > > peak heap memory consumption: 10.09M > > peak RSS (including heaptrack overhead): 21.27M > > total memory leaked: 1.02K > > > > ## proposed > > > > total runtime: 0.06s. > > calls to allocation functions: 23430 (366093/s) > > temporary memory allocations: 675 (10546/s) > > peak heap memory consumption: 9.48M > > peak RSS (including heaptrack overhead): 18.59M > > total memory leaked: 1.02K > > > > # restorecon -vRn /etc [heaptrack] > > > > ## current > > > > total runtime: 0.14s. > > calls to allocation functions: 33873 (236874/s) > > temporary memory allocations: 1877 (13125/s) > > peak heap memory consumption: 10.09M > > peak RSS (including heaptrack overhead): 21.58M > > total memory leaked: 1.90K > > > > ## proposed > > > > total runtime: 0.27s. > > calls to allocation functions: 378762 (1423917/s) > > temporary memory allocations: 351487 (1321379/s) > > peak heap memory consumption: 9.48M > > peak RSS (including heaptrack overhead): 20.99M > > total memory leaked: 1.90K > > > > > > # restorecon -vRn /usr [hyperfine] > > > > ## current > > > > restorecon -vRn /usr > > Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr > > Time (mean ± σ): 24.419 s ± 0.661 s [User: 23.480 s, System: 0.922 s] > > Range (min … max): 23.399 s … 25.495 s 10 runs > > > > ## proposed > > > > restorecon -vRn /usr > > Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr > > Time (mean ± σ): 28.628 s ± 0.968 s [User: 27.688 s, System: 0.927 s] > > Range (min … max): 27.674 s … 30.798 s 10 runs > > > > > > So I would argue the performance impact for applications (like > > setfiles, restorecon) or daemon (like systemd, udev) is more critical > > than the 500K per application. > > I observed about 3~4MB increase on Android device. Which pcre2 version > are you using? Does it include > https://github.com/PCRE2Project/pcre2/commit/d90fb238#diff-15ec3f4ed916f52c810daf305702985dda6d8d45e7ce22e2f309c95bd6ef32b7R74 > ? Those measurements were done against pcre2 version 10.40. Indeed with version 10.42 the peak memory usage of the current implementation has grown and is reduced by this patch: # selabel_lookup -b file -k /etc/shadow -t file [heaptrack] ## current total runtime: 0.07s. calls to allocation functions: 29080 (421449/s) temporary memory allocations: 15 (217/s) peak heap memory consumption: 22.72M peak RSS (including heaptrack overhead): 18.68M total memory leaked: 1.02K ## proposed total runtime: 0.06s. calls to allocation functions: 24090 (376406/s) temporary memory allocations: 675 (10546/s) peak heap memory consumption: 9.48M peak RSS (including heaptrack overhead): 22.15M total memory leaked: 1.02K # restorecon -vRn /etc [heaptrack] ## current total runtime: 0.13s. calls to allocation functions: 34720 (259104/s) temporary memory allocations: 1872 (13970/s) peak heap memory consumption: 26.66M peak RSS (including heaptrack overhead): 25.59M total memory leaked: 1.90K ## proposed total runtime: 0.43s. calls to allocation functions: 729301 (1692113/s) temporary memory allocations: 351486 (815512/s) peak heap memory consumption: 9.48M peak RSS (including heaptrack overhead): 20.98M total memory leaked: 1.90K # restorecon -vRn /usr [hyperfine] ## current Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr Time (mean ± σ): 31.053 s ± 1.491 s [User: 29.926 s, System: 1.105 s] Range (min … max): 28.817 s … 33.679 s 10 runs ## proposed Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr Time (mean ± σ): 37.829 s ± 0.416 s [User: 36.935 s, System: 0.875 s] Range (min … max): 37.205 s … 38.764 s 10 runs > > And if this is difficult to apply, how about adding a new flag e.g. > AGGRESSIVE_FREE_AFTER_REGEX_MATCH ? > > > > > > To workaround it, free and reallocate match_data whenever we call > > > regex_match. It's fine because libselinux currently doesn't use > > > match_data, but use only the return value. > > > > > > Signed-off-by: Inseob Kim <inseob@google.com> > > > --- > > > libselinux/src/regex.c | 10 ++++++++++ > > > 1 file changed, 10 insertions(+) > > > > > > diff --git a/libselinux/src/regex.c b/libselinux/src/regex.c > > > index 149a7973..2df282f1 100644 > > > --- a/libselinux/src/regex.c > > > +++ b/libselinux/src/regex.c > > > @@ -213,10 +213,20 @@ void regex_data_free(struct regex_data *regex) > > > int regex_match(struct regex_data *regex, char const *subject, int partial) > > > { > > > int rc; > > > + pcre2_match_data *new_match_data; > > > __pthread_mutex_lock(®ex->match_mutex); > > > + new_match_data = pcre2_match_data_create_from_pattern( > > > + regex->regex, NULL); > > > > Should be checked for failure (cause pcre2_match() expects a non-NULL > > match_data, which would be passed the second time). > > > > Also with this change the member match_data of the struct regex_data > > becomes obsolete and should be removed. > > Thanks, this makes sense. > > > > > > rc = pcre2_match( > > > regex->regex, (PCRE2_SPTR)subject, PCRE2_ZERO_TERMINATED, 0, > > > partial ? PCRE2_PARTIAL_SOFT : 0, regex->match_data, NULL); > > > + // pcre2_match allocates heap and it won't be freed until > > > + // pcre2_match_data_free, resulting in heap overhead. > > > + // Reallocate match_data to prevent such overhead, whenever possible. > > > + if (new_match_data) { > > > + pcre2_match_data_free(regex->match_data); > > > + regex->match_data = new_match_data; > > > + } > > > __pthread_mutex_unlock(®ex->match_mutex); > > > if (rc > 0) > > > return REGEX_MATCH; > > > -- > > > 2.39.0.314.g84b9a713c41-goog > > > > > > > -- > Inseob Kim | Software Engineer | inseob@google.com
diff --git a/libselinux/src/regex.c b/libselinux/src/regex.c index 149a7973..2df282f1 100644 --- a/libselinux/src/regex.c +++ b/libselinux/src/regex.c @@ -213,10 +213,20 @@ void regex_data_free(struct regex_data *regex) int regex_match(struct regex_data *regex, char const *subject, int partial) { int rc; + pcre2_match_data *new_match_data; __pthread_mutex_lock(®ex->match_mutex); + new_match_data = pcre2_match_data_create_from_pattern( + regex->regex, NULL); rc = pcre2_match( regex->regex, (PCRE2_SPTR)subject, PCRE2_ZERO_TERMINATED, 0, partial ? PCRE2_PARTIAL_SOFT : 0, regex->match_data, NULL); + // pcre2_match allocates heap and it won't be freed until + // pcre2_match_data_free, resulting in heap overhead. + // Reallocate match_data to prevent such overhead, whenever possible. + if (new_match_data) { + pcre2_match_data_free(regex->match_data); + regex->match_data = new_match_data; + } __pthread_mutex_unlock(®ex->match_mutex); if (rc > 0) return REGEX_MATCH;
pcre's behavior is changed so that pcre2_match always allocates heap for match_data, rather than stack, regardless of size. The heap isn't freed until explicitly calling pcre2_match_data_free. This new behavior may result in heap overhead, which may increase the peak memory usage about a few megabytes. It's because regex_match is first called for regex_data objects, and then regex_data objects are freed at once. To workaround it, free and reallocate match_data whenever we call regex_match. It's fine because libselinux currently doesn't use match_data, but use only the return value. Signed-off-by: Inseob Kim <inseob@google.com> --- libselinux/src/regex.c | 10 ++++++++++ 1 file changed, 10 insertions(+)