diff mbox series

[userspace] sepolicy: generate man pages in parallel

Message ID 20191014080647.19602-1-omosnace@redhat.com (mailing list archive)
State Superseded

Commit Message

Ondrej Mosnacek Oct. 14, 2019, 8:06 a.m. UTC
Generating man pages takes a lot of time. Do it in parallel to speed up
the process.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 python/sepolicy/sepolicy.py | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Stephen Smalley Oct. 17, 2019, 5:14 p.m. UTC | #1
On 10/14/19 4:06 AM, Ondrej Mosnacek wrote:
> Generating man pages takes a lot of time. Do it in parallel to speed up
> the process.
> 
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>

> ---
>   python/sepolicy/sepolicy.py | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/python/sepolicy/sepolicy.py b/python/sepolicy/sepolicy.py
> index 1934cd86..02094013 100755
> --- a/python/sepolicy/sepolicy.py
> +++ b/python/sepolicy/sepolicy.py
> @@ -25,6 +25,7 @@ import os
>   import sys
>   import selinux
>   import sepolicy
> +from concurrent.futures import ProcessPoolExecutor
>   from sepolicy import get_os_version, get_conditionals, get_conditionals_format_text
>   import argparse
>   PROGNAME = "policycoreutils"
> @@ -326,8 +327,13 @@ def gen_gui_args(parser):
>       gui.set_defaults(func=gui_run)
>   
>   
> +def manpage_work(domain, path, root, source_files, web):
> +    from sepolicy.manpage import ManPage
> +    m = ManPage(domain, path, root, source_files, web)
> +    print(m.get_man_page_path())
> +
>   def manpage(args):
> -    from sepolicy.manpage import ManPage, HTMLManPages, manpage_domains, manpage_roles, gen_domains
> +    from sepolicy.manpage import HTMLManPages, manpage_domains, manpage_roles, gen_domains
>   
>       path = args.path
>       if not args.policy and args.root != "/":
> @@ -340,9 +346,9 @@ def manpage(args):
>       else:
>           test_domains = args.domain
>   
> -    for domain in test_domains:
> -        m = ManPage(domain, path, args.root, args.source_files, args.web)
> -        print(m.get_man_page_path())
> +    with ProcessPoolExecutor() as e:
> +        for domain in test_domains:
> +            e.submit(manpage_work, domain, path, args.root, args.source_files, args.web)
>   
>       if args.web:
>           HTMLManPages(manpage_roles, manpage_domains, path, args.os)
>

Ondrej Mosnacek Oct. 18, 2019, 7:44 a.m. UTC | #2
On Thu, Oct 17, 2019 at 7:15 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On 10/14/19 4:06 AM, Ondrej Mosnacek wrote:
> > Generating man pages takes a lot of time. Do it in parallel to speed up
> > the process.
> >
> > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
>
> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>

Thank you for the ack. However, I discovered that after this change it
becomes more difficult to end the program via KeyboardInterrupt
(SIGINT): the first interrupt only stops the main process, and you need
to send several more to take down the background processes as well.
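
A minimal sketch of one way to limit the effect described above, not
taken from the patch or from v2: keep the futures returned by submit()
and cancel the ones that have not started when the main process is
interrupted. The names render_page and render_all are hypothetical
stand-ins for the ManPage work, and the "fork" start method is an
assumption that holds on Linux.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def render_page(domain):
    # Hypothetical stand-in for building one man page; returns its path.
    return "/tmp/%s_selinux.8" % domain


def render_all(domains):
    # Assumption: the "fork" start method (Linux), so workers inherit
    # everything already defined in the parent process.
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(mp_context=ctx) as executor:
        futures = [executor.submit(render_page, d) for d in domains]
        try:
            # result() blocks and re-raises any exception from the worker.
            return [f.result() for f in futures]
        except KeyboardInterrupt:
            # Drop work that has not started yet. Items that are already
            # running still finish while the "with" block waits on exit.
            for f in futures:
                f.cancel()
            raise


if __name__ == "__main__":
    for page in render_all(["httpd", "sshd"]):
        print(page)
```

Collecting the futures also means a failure in a worker is reported in
the main process instead of being silently dropped.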

I found a different way (multiprocessing.Pool) to do the same, which
ends the processing gracefully on interrupt, but that one behaves even
worse under Python 2 (each interrupt only cancels one work item and
the processing happily continues...). Since there are plans to support
only Python 3 in 3.0+ this may not be an issue, but I could also add a
few lines to fall back to sequential execution under Python 2 for the
sake of compatibility. Would that be OK or should I not bother?

Either way I'd like to send a v2 that uses multiprocessing instead of
concurrent.futures, so please don't merge this yet :)

FYI, here is a preliminary diff for a switch to multiprocessing.Pool:
https://github.com/WOnder93/selinux/commit/a33acec8c298c112f5412b8b61b5b09058a267ee

...and here is what the Python 2 fallback would look like:
https://github.com/WOnder93/selinux/commit/b39a12120656b50eb0a1ee01227646ba3cd63f15
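
The linked commits are not reproduced here. As a rough illustration of
the general multiprocessing.Pool pattern only, assuming a hypothetical
worker render_page in place of the ManPage call and the Linux "fork"
start method:

```python
import multiprocessing


def render_page(domain):
    # Hypothetical stand-in for building one man page; returns its path.
    return "/tmp/%s_selinux.8" % domain


def render_all_pool(domains):
    # Assumption: the "fork" start method (Linux).
    ctx = multiprocessing.get_context("fork")
    # Leaving the "with" block calls pool.terminate(), also when an
    # exception such as KeyboardInterrupt propagates out of map().
    with ctx.Pool() as pool:
        # map() returns results in the same order as the input.
        return pool.map(render_page, domains)


if __name__ == "__main__":
    for page in render_all_pool(["httpd", "sshd"]):
        print(page)
```

Using the pool as a context manager requires Python 3, so this form
would not cover the Python 2 case discussed above.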

>
> > ---
> >   python/sepolicy/sepolicy.py | 14 ++++++++++----
> >   1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/python/sepolicy/sepolicy.py b/python/sepolicy/sepolicy.py
> > index 1934cd86..02094013 100755
> > --- a/python/sepolicy/sepolicy.py
> > +++ b/python/sepolicy/sepolicy.py
> > @@ -25,6 +25,7 @@ import os
> >   import sys
> >   import selinux
> >   import sepolicy
> > +from concurrent.futures import ProcessPoolExecutor
> >   from sepolicy import get_os_version, get_conditionals, get_conditionals_format_text
> >   import argparse
> >   PROGNAME = "policycoreutils"
> > @@ -326,8 +327,13 @@ def gen_gui_args(parser):
> >       gui.set_defaults(func=gui_run)
> >
> >
> > +def manpage_work(domain, path, root, source_files, web):
> > +    from sepolicy.manpage import ManPage
> > +    m = ManPage(domain, path, root, source_files, web)
> > +    print(m.get_man_page_path())
> > +
> >   def manpage(args):
> > -    from sepolicy.manpage import ManPage, HTMLManPages, manpage_domains, manpage_roles, gen_domains
> > +    from sepolicy.manpage import HTMLManPages, manpage_domains, manpage_roles, gen_domains
> >
> >       path = args.path
> >       if not args.policy and args.root != "/":
> > @@ -340,9 +346,9 @@ def manpage(args):
> >       else:
> >           test_domains = args.domain
> >
> > -    for domain in test_domains:
> > -        m = ManPage(domain, path, args.root, args.source_files, args.web)
> > -        print(m.get_man_page_path())
> > +    with ProcessPoolExecutor() as e:
> > +        for domain in test_domains:
> > +            e.submit(manpage_work, domain, path, args.root, args.source_files, args.web)
> >
> >       if args.web:
> >           HTMLManPages(manpage_roles, manpage_domains, path, args.os)
> >
>

Chris PeBenito Oct. 18, 2019, 9 a.m. UTC | #3
On 10/18/19 3:44 AM, Ondrej Mosnacek wrote:
> Since there are plans to support
> only Python 3 in 3.0+ this may not be an issue, but I could also add a
> few lines to fall back to sequential execution under Python 2 for the
> sake of compatibility. Would that be OK or should I not bother?

Python 2 end of life is in less than 2 months.  Please don't add new 
code only for Python 2 compatibility.

Chris PeBenito Oct. 18, 2019, 9:01 a.m. UTC | #4
On 10/18/19 5:00 AM, Chris PeBenito wrote:
> On 10/18/19 3:44 AM, Ondrej Mosnacek wrote:
>> Since there are plans to support
>> only Python 3 in 3.0+ this may not be an issue, but I could also add a
>> few lines to fall back to sequential execution under Python 2 for the
>> sake of compatibility. Would that be OK or should I not bother?
> 
> Python 2 end of life is in less than 2 months.  Please don't add new 
> code only for Python 2 compatibility.

I can't count.  It's a little over 2 months.  The point still stands :)

Ondrej Mosnacek Oct. 18, 2019, 9:22 a.m. UTC | #5
On Fri, Oct 18, 2019 at 11:01 AM Chris PeBenito <pebenito@ieee.org> wrote:
> On 10/18/19 5:00 AM, Chris PeBenito wrote:
> > On 10/18/19 3:44 AM, Ondrej Mosnacek wrote:
> >> Since there are plans to support
> >> only Python 3 in 3.0+ this may not be an issue, but I could also add a
> >> few lines to fall back to sequential execution under Python 2 for the
> >> sake of compatibility. Would that be OK or should I not bother?
> >
> > Python 2 end of life is in less than 2 months.  Please don't add new
> > code only for Python 2 compatibility.
>
> I can't count.  It's a little over 2 months.  The point still stands :)

OK, I posted a v2 without the fallback.

Patch

diff --git a/python/sepolicy/sepolicy.py b/python/sepolicy/sepolicy.py
index 1934cd86..02094013 100755
--- a/python/sepolicy/sepolicy.py
+++ b/python/sepolicy/sepolicy.py
@@ -25,6 +25,7 @@  import os
 import sys
 import selinux
 import sepolicy
+from concurrent.futures import ProcessPoolExecutor
 from sepolicy import get_os_version, get_conditionals, get_conditionals_format_text
 import argparse
 PROGNAME = "policycoreutils"
@@ -326,8 +327,13 @@  def gen_gui_args(parser):
     gui.set_defaults(func=gui_run)
 
 
+def manpage_work(domain, path, root, source_files, web):
+    from sepolicy.manpage import ManPage
+    m = ManPage(domain, path, root, source_files, web)
+    print(m.get_man_page_path())
+
 def manpage(args):
-    from sepolicy.manpage import ManPage, HTMLManPages, manpage_domains, manpage_roles, gen_domains
+    from sepolicy.manpage import HTMLManPages, manpage_domains, manpage_roles, gen_domains
 
     path = args.path
     if not args.policy and args.root != "/":
@@ -340,9 +346,9 @@  def manpage(args):
     else:
         test_domains = args.domain
 
-    for domain in test_domains:
-        m = ManPage(domain, path, args.root, args.source_files, args.web)
-        print(m.get_man_page_path())
+    with ProcessPoolExecutor() as e:
+        for domain in test_domains:
+            e.submit(manpage_work, domain, path, args.root, args.source_files, args.web)
 
     if args.web:
         HTMLManPages(manpage_roles, manpage_domains, path, args.os)