diff mbox series

[v3,7/8] golang/xenlight: Notify xenlight of SIGCHLD

Message ID 20200117155734.1067550-7-george.dunlap@citrix.com (mailing list archive)
State New, archived
Headers show
Series [v3,1/8] golang/xenlight: Don't try to marshall zero-length arrays in fromC | expand

Commit Message

George Dunlap Jan. 17, 2020, 3:57 p.m. UTC
libxl forks external processes and waits for them to complete; it
therefore needs to be notified when children exit.

In absence of instructions to the contrary, libxl sets up its own
SIGCHLD handlers.

Golang always unmasks and handles SIGCHLD itself.  libxl thankfully
notices this and throws an assert() rather than clobbering SIGCHLD
handlers.

Tell libxl that we'll be responsible for getting SIGCHLD notifications
to it.  Arrange for a channel in the context to receive notifications
on SIGCHLD, and set up a goroutine that will pass these on to libxl.

NB that every libxl context needs a notification; so multiple contexts
will each spin up their own goroutine when opening a context, and shut
it down on close.

libxl also wants to hold on to a const pointer to
xenlight_childproc_hooks rather than do a copy; so make a global
structure in C space and initialize it once on library creation.

While here, add a few comments to make the context set-up a bit easier
to follow.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v2:
- Fix unsafe libxl_childproc_hooks pointer behavior
- Close down the SIGCHLD handler first, and make sure it's exited
  before closing the context
- Explicitly decide to have a separate goroutine per ctx

NB that due to a bug in libxl, this will hang without Ian's "[PATCH v2
00/10] libxl: event: Fix hang for some applications" series.

CC: Nick Rosbrook <rosbrookn@ainfosec.com>
CC: Ian Jackson <ian.jackson@citrix.com>
---
 tools/golang/xenlight/xenlight.go | 72 ++++++++++++++++++++++++++++++-
 1 file changed, 70 insertions(+), 2 deletions(-)

Comments

Ian Jackson Jan. 17, 2020, 4:52 p.m. UTC | #1
George Dunlap writes ("[PATCH v3 7/8] golang/xenlight: Notify xenlight of SIGCHLD"):
> libxl forks external processes and waits for them to complete; it
> therefore needs to be notified when children exit.
> 
> In absence of instructions to the contrary, libxl sets up its own
> SIGCHLD handlers.
> 
> Golang always unmasks and handles SIGCHLD itself.  libxl thankfully
> notices this and throws an assert() rather than clobbering SIGCHLD
> handlers.
> 
> Tell libxl that we'll be responsible for getting SIGCHLD notifications
> to it.  Arrange for a channel in the context to receive notifications
> on SIGCHLD, and set up a goroutine that will pass these on to libxl.
> 
> NB that every libxl context needs a notification; so multiple contexts
> will each spin up their own goroutine when opening a context, and shut
> it down on close.
> 
> libxl also wants to hold on to a const pointer to
> xenlight_childproc_hooks rather than do a copy; so make a global
> structure in C space and initialize it once on library creation.
> 
> While here, add a few comments to make the context set-up a bit easier
> to follow.

For what it's worth,

Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>

However, I don't think I understand golang (and particularly the
threading model etc.) well enough for that to mean I'm confident that
this correct.

> +func init() {
> +	// libxl for some reason wants to:
> +	// 1. Retain a copy to this pointer as long as the context is open, and
> +	// 2. Not free it when it's done

I found this comment a bit rude.  This is not an unusual approach for
a pointer in a C API.

In Rust this would be called an `immutable borrow': the ctx borrows
the contents of the pointer, promises not to modify it (and the caller
ought to promise not to modify it either); but the caller retains
ownership so when the ctx is done the borrow ends.

Normally in C the struct would be `static const', so there is no need
to allocate it or free it.

I think that ...

> +	// Rather than alloc and free multiple copies, just keep a single
> +	// static copy in the C space (since C code isn't allowed to retain pointers
> +	// to go code), and initialize it once.
> +	C.xenlight_childproc_hooks.chldowner = C.libxl_sigchld_owner_mainloop

... this is what this is doing ?

> +// This should "play nicely" with other users of SIGCHLD as long as
> +// they don't reap libxl's processes.

I assume that nothing in golang will do this.  If something calls a
non-process-specific variant of wait* then you would need to somehow
capture the results and feed them to libxl_childproc_exited.

> +// The alternate would be to register a fork callback, such that the
> +// xenlight package can make a single list of all children, and only
> +// notify the specific libxl context(s) that have children woken.  But
> +// it's not clear to me this will be much more work than having the
> +// xenlight go library do the same thing; doing it in separate go
> +// threads has the potential to do it in parallel.  Leave that as an
> +// optimization for later if it turns out to be a bottleneck.

I think this is fine.

Thanks,
Ian.
George Dunlap Jan. 17, 2020, 5:33 p.m. UTC | #2
On 1/17/20 4:52 PM, Ian Jackson wrote:
> George Dunlap writes ("[PATCH v3 7/8] golang/xenlight: Notify xenlight of SIGCHLD"):
>> libxl forks external processes and waits for them to complete; it
>> therefore needs to be notified when children exit.
>>
>> In absence of instructions to the contrary, libxl sets up its own
>> SIGCHLD handlers.
>>
>> Golang always unmasks and handles SIGCHLD itself.  libxl thankfully
>> notices this and throws an assert() rather than clobbering SIGCHLD
>> handlers.
>>
>> Tell libxl that we'll be responsible for getting SIGCHLD notifications
>> to it.  Arrange for a channel in the context to receive notifications
>> on SIGCHLD, and set up a goroutine that will pass these on to libxl.
>>
>> NB that every libxl context needs a notification; so multiple contexts
>> will each spin up their own goroutine when opening a context, and shut
>> it down on close.
>>
>> libxl also wants to hold on to a const pointer to
>> xenlight_childproc_hooks rather than do a copy; so make a global
>> structure in C space and initialize it once on library creation.
>>
>> While here, add a few comments to make the context set-up a bit easier
>> to follow.
> 
> For what it's worth,
> 
> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> However, I don't think I understand golang (and particularly the
> threading model etc.) well enough for that to mean I'm confident that
> this correct.

Thanks -- I mainly just wanted to give you the opportunity to spot any
obvious things I was doing wrong wrt libxl.  (For instance, an earlier
version of this patch had me destroying the libxl context before
shutting down the sigchld helper, which is obviously wrong.)

>> +func init() {
>> +	// libxl for some reason wants to:
>> +	// 1. Retain a copy to this pointer as long as the context is open, and
>> +	// 2. Not free it when it's done
> 
> I found this comment a bit rude.  This is not an unusual approach for
> a pointer in a C API.>
> In Rust this would be called an `immutable borrow': the ctx borrows
> the contents of the pointer, promises not to modify it (and the caller
> ought to promise not to modify it either); but the caller retains
> ownership so when the ctx is done the borrow ends.

I'm sorry to be rude; I've deleted the comment.  But none of what you
said is obvious from the documentation; on the contrary:

---
...is equivalent to &{ libxl_sigchld_owner_libxl, 0, 0 }
---

...seems to imply that you can pass it a pointer to the stack.  And,
from an interface perspective, that seems obviously better to me --
rather than make the caller promise not to change the contents (to
who-knows-what result if they forget), it's much easier to just take a
local copy and update it with locks next time the function is called.

I was slightly more annoyed because Go's rule about C functions not
retaining pointers to Go memory meant I had to do some un-Golike things
to make this work; but that's nothing to do with libxl.

> Normally in C the struct would be `static const', so there is no need
> to allocate it or free it.
> 
> I think that ...
> 
>> +	// Rather than alloc and free multiple copies, just keep a single
>> +	// static copy in the C space (since C code isn't allowed to retain pointers
>> +	// to go code), and initialize it once.
>> +	C.xenlight_childproc_hooks.chldowner = C.libxl_sigchld_owner_mainloop
> 
> ... this is what this is doing ?

In fact, there's a global C variable declared here:

---
 #include <libxl.h>
+
+libxl_childproc_hooks xenlight_childproc_hooks;
 */
 import "C"
---

...and the line above just initialized it.  But on reflection I've
decided to do this:

---
/*

#cgo LDFLAGS: -lxenlight -lyajl -lxentoollog
#include <stdlib.h>
#include <libxl.h>

static const libxl_childproc_hooks childproc_hooks = { .chldowner =
libxl_sigchld_owner_mainloop };

void xenlight_set_chldproc(libxl_ctx *ctx) {
	libxl_childproc_setmode(ctx, &childproc_hooks, NULL);
}

*/
import "C"
---

That declares childproc_hooks as static const in the C space; and then
defines a C function which takes a libxl_ctx* and calls
libxl_childproc_setmode appropriately.  That way childproc_hooks can
enjoy the safety of static const.

>> +// The alternate would be to register a fork callback, such that the
>> +// xenlight package can make a single list of all children, and only
>> +// notify the specific libxl context(s) that have children woken.  But
>> +// it's not clear to me this will be much more work than having the
>> +// xenlight go library do the same thing; doing it in separate go
>> +// threads has the potential to do it in parallel.  Leave that as an
>> +// optimization for later if it turns out to be a bottleneck.
> 
> I think this is fine.

Thanks.

 -George
Nick Rosbrook Jan. 17, 2020, 6:13 p.m. UTC | #3
>  // Context represents a libxl_ctx.
>  type Context struct {
> -       ctx    *C.libxl_ctx
> -       logger *C.xentoollog_logger_stdiostream
> +       ctx         *C.libxl_ctx
> +       logger      *C.xentoollog_logger_stdiostream
> +       sigchld     chan os.Signal
> +       sigchldDone chan bool

It's preferred to use `chan struct{}` for this pattern; it makes it
clear that the data sent over the channel has no meaning, and is only
intended to be used for synchronization.

> +       // ...and arrange to keep that promise.
> +       ctx.sigchld = make(chan os.Signal, 2)
> +       ctx.sigchldDone = make(chan bool, 1)
> +       signal.Notify(ctx.sigchld, syscall.SIGCHLD)
> +
> +       go sigchldHandler(ctx)

It could be useful to add a comment here that explains the lifetime of
this goroutine, i.e. it returns when the context is Close()'d.

>  // Close closes the Context.
>  func (ctx *Context) Close() error {
> +       // Tell our SIGCHLD notifier to shut down, and wait for it to exit
> +       // before we free the context.
> +       if ctx.sigchld == nil {

Shouldn't this be `if ctx.sigchld != nil`?

-NR
George Dunlap Jan. 17, 2020, 6:28 p.m. UTC | #4
On 1/17/20 6:13 PM, Nick Rosbrook wrote:
>>  // Context represents a libxl_ctx.
>>  type Context struct {
>> -       ctx    *C.libxl_ctx
>> -       logger *C.xentoollog_logger_stdiostream
>> +       ctx         *C.libxl_ctx
>> +       logger      *C.xentoollog_logger_stdiostream
>> +       sigchld     chan os.Signal
>> +       sigchldDone chan bool
> 
> It's preferred to use `chan struct{}` for this pattern; it makes it
> clear that the data sent over the channel has no meaning, and is only
> intended to be used for synchronization.

OK.  I think it looks ugly, but there's certainly a signalling value to
having it really be empty.

> 
>> +       // ...and arrange to keep that promise.
>> +       ctx.sigchld = make(chan os.Signal, 2)
>> +       ctx.sigchldDone = make(chan bool, 1)
>> +       signal.Notify(ctx.sigchld, syscall.SIGCHLD)
>> +
>> +       go sigchldHandler(ctx)
> 
> It could be useful to add a comment here that explains the lifetime of
> this goroutine, i.e. it returns when the context is Close()'d.

Ack.

>>  // Close closes the Context.
>>  func (ctx *Context) Close() error {
>> +       // Tell our SIGCHLD notifier to shut down, and wait for it to exit
>> +       // before we free the context.
>> +       if ctx.sigchld == nil {
> 
> Shouldn't this be `if ctx.sigchld != nil`?

Er, yes, indeed it should.  This has gone through too many iterations...

 -George
diff mbox series

Patch

diff --git a/tools/golang/xenlight/xenlight.go b/tools/golang/xenlight/xenlight.go
index 662b266250..c462e4bb42 100644
--- a/tools/golang/xenlight/xenlight.go
+++ b/tools/golang/xenlight/xenlight.go
@@ -20,6 +20,8 @@  package xenlight
 #cgo LDFLAGS: -lxenlight -lyajl -lxentoollog
 #include <stdlib.h>
 #include <libxl.h>
+
+libxl_childproc_hooks xenlight_childproc_hooks;
 */
 import "C"
 
@@ -33,6 +35,9 @@  import "C"
 
 import (
 	"fmt"
+	"os"
+	"os/signal"
+	"syscall"
 	"unsafe"
 )
 
@@ -72,10 +77,49 @@  func (e Error) Error() string {
 	return fmt.Sprintf("libxl error: %d", e)
 }
 
+func init() {
+	// libxl for some reason wants to:
+	// 1. Retain a copy to this pointer as long as the context is open, and
+	// 2. Not free it when it's done
+	//
+	// Rather than alloc and free multiple copies, just keep a single
+	// static copy in the C space (since C code isn't allowed to retain pointers
+	// to go code), and initialize it once.
+	C.xenlight_childproc_hooks.chldowner = C.libxl_sigchld_owner_mainloop
+}
+
 // Context represents a libxl_ctx.
 type Context struct {
-	ctx    *C.libxl_ctx
-	logger *C.xentoollog_logger_stdiostream
+	ctx         *C.libxl_ctx
+	logger      *C.xentoollog_logger_stdiostream
+	sigchld     chan os.Signal
+	sigchldDone chan bool
+}
+
+// Golang always unmasks SIGCHLD, and internally has ways of
+// distributing SIGCHLD to multiple recipients.  libxl has provision
+// for this model: just tell it when a SIGCHLD happened, and it will
+// look after its own processes.
+//
+// This should "play nicely" with other users of SIGCHLD as long as
+// they don't reap libxl's processes.
+//
+// Every context needs to be notified on each SIGCHLD; so spin up a
+// new goroutine for each context.  If there are a large number of contexts,
+// this means each context will be woken up looking through its own list of children.
+//
+// The alternate would be to register a fork callback, such that the
+// xenlight package can make a single list of all children, and only
+// notify the specific libxl context(s) that have children woken.  But
+// it's not clear to me this will be much more work than having the
+// xenlight go library do the same thing; doing it in separate go
+// threads has the potential to do it in parallel.  Leave that as an
+// optimization for later if it turns out to be a bottleneck.
+func sigchldHandler(ctx *Context) {
+	for _ = range ctx.sigchld {
+		go C.libxl_childproc_sigchld_occurred(ctx.ctx)
+	}
+	close(ctx.sigchldDone)
 }
 
 // NewContext returns a new Context.
@@ -89,19 +133,43 @@  func NewContext() (ctx *Context, err error) {
 		}
 	}()
 
+	// Create a logger
 	ctx.logger = C.xtl_createlogger_stdiostream(C.stderr, C.XTL_DEBUG, 0)
 
+	// Allocate a context
 	ret := C.libxl_ctx_alloc(&ctx.ctx, C.LIBXL_VERSION, 0,
 		(*C.xentoollog_logger)(unsafe.Pointer(ctx.logger)))
 	if ret != 0 {
 		return ctx, Error(ret)
 	}
 
+	// Tell libxl that we'll be dealing with SIGCHLD...
+	C.libxl_childproc_setmode(ctx.ctx, &C.xenlight_childproc_hooks, nil)
+
+	// ...and arrange to keep that promise.
+	ctx.sigchld = make(chan os.Signal, 2)
+	ctx.sigchldDone = make(chan bool, 1)
+	signal.Notify(ctx.sigchld, syscall.SIGCHLD)
+
+	go sigchldHandler(ctx)
+
 	return ctx, nil
 }
 
 // Close closes the Context.
 func (ctx *Context) Close() error {
+	// Tell our SIGCHLD notifier to shut down, and wait for it to exit
+	// before we free the context.
+	if ctx.sigchld == nil {
+		signal.Stop(ctx.sigchld)
+		close(ctx.sigchld)
+
+		<-ctx.sigchldDone
+
+		ctx.sigchld = nil
+		ctx.sigchldDone = nil
+	}
+
 	if ctx.ctx != nil {
 		ret := C.libxl_ctx_free(ctx.ctx)
 		if ret != 0 {