Message ID | 20200430154606.6421-1-imammedo@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | hostmem: don't use mbind() if host-nodes is epmty |
Typo "empty" in patch subject.

On 4/30/20 5:46 PM, Igor Mammedov wrote:
> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
> The backend however calls mbind() which is typically NOP
> in case of default policy/absent host-nodes bitmap.
> However when running in container with black-listed mbind()
> syscall, QEMU fails to start with error
>  "cannot bind memory to host NUMA nodes: Operation not permitted"
> even when user hasn't provided host-nodes to pin to explicitly
> (which is the case with -m option)
>
> To fix issue, call mbind() only in case when user has provided
> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
> That should allow to run QEMU in containers with black-listed
> mbind() without memory pinning. If QEMU provided memory-pinning
> is required user still has to white-list mbind() in container
> configuration.
>
> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> CC: berrange@redhat.com
> CC: ehabkost@redhat.com
> CC: pbonzini@redhat.com
> CC: mhohmann@physnet.uni-hamburg.de
> CC: qemu-stable@nongnu.org
> ---
>  backends/hostmem.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 327f9eebc3..0efd7b7bd6 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -383,8 +383,10 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>          assert(sizeof(backend->host_nodes) >=
>                 BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
>          assert(maxnode <= MAX_NODES);
> -        if (mbind(ptr, sz, backend->policy,
> -                  maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
> +
> +        if (maxnode &&
> +            mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 1,
> +                  flags)) {
>              if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
>                  error_setg_errno(errp, errno,
>                                   "cannot bind memory to host NUMA nodes");
>
Thanks! I applied the patch, and it now also works inside the Docker
container, for all architectures (i386, x86_64, arm, aarch64) for which
I have test cases at hand. Since the container is configured by a
public cloud service, there is no possibility to change any security
settings. Disabling mbind() unless explicitly requested seems to be the
best way to go here.

On 30.04.20 19:42, Philippe Mathieu-Daudé wrote:
> Typo "empty" in patch subject.
>
> On 4/30/20 5:46 PM, Igor Mammedov wrote:
>> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
>> The backend however calls mbind() which is typically NOP
>> in case of default policy/absent host-nodes bitmap.
>> However when running in container with black-listed mbind()
>> syscall, QEMU fails to start with error
>>  "cannot bind memory to host NUMA nodes: Operation not permitted"
>> even when user hasn't provided host-nodes to pin to explicitly
>> (which is the case with -m option)
>>
>> To fix issue, call mbind() only in case when user has provided
>> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
>> That should allow to run QEMU in containers with black-listed
>> mbind() without memory pinning. If QEMU provided memory-pinning
>> is required user still has to white-list mbind() in container
>> configuration.
>>
>> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>> ---
>> CC: berrange@redhat.com
>> CC: ehabkost@redhat.com
>> CC: pbonzini@redhat.com
>> CC: mhohmann@physnet.uni-hamburg.de
>> CC: qemu-stable@nongnu.org
>> ---
>>  backends/hostmem.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/backends/hostmem.c b/backends/hostmem.c
>> index 327f9eebc3..0efd7b7bd6 100644
>> --- a/backends/hostmem.c
>> +++ b/backends/hostmem.c
>> @@ -383,8 +383,10 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>>          assert(sizeof(backend->host_nodes) >=
>>                 BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
>>          assert(maxnode <= MAX_NODES);
>> -        if (mbind(ptr, sz, backend->policy,
>> -                  maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
>> +
>> +        if (maxnode &&
>> +            mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 1,
>> +                  flags)) {
>>              if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
>>                  error_setg_errno(errp, errno,
>>                                   "cannot bind memory to host NUMA nodes");
>>
>
On Thu, Apr 30, 2020 at 11:46:06AM -0400, Igor Mammedov wrote:
> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
> The backend however calls mbind() which is typically NOP
> in case of default policy/absent host-nodes bitmap.
> However when running in container with black-listed mbind()
> syscall, QEMU fails to start with error
>  "cannot bind memory to host NUMA nodes: Operation not permitted"
> even when user hasn't provided host-nodes to pin to explicitly
> (which is the case with -m option)
>
> To fix issue, call mbind() only in case when user has provided
> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
> That should allow to run QEMU in containers with black-listed
> mbind() without memory pinning. If QEMU provided memory-pinning
> is required user still has to white-list mbind() in container
> configuration.
>
> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> CC: berrange@redhat.com
> CC: ehabkost@redhat.com
> CC: pbonzini@redhat.com
> CC: mhohmann@physnet.uni-hamburg.de
> CC: qemu-stable@nongnu.org
> ---
>  backends/hostmem.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 327f9eebc3..0efd7b7bd6 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -383,8 +383,10 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>          assert(sizeof(backend->host_nodes) >=
>                 BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
>          assert(maxnode <= MAX_NODES);
> -        if (mbind(ptr, sz, backend->policy,
> -                  maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
> +
> +        if (maxnode &&
> +            mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 1,
> +                  flags)) {
>              if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
>                  error_setg_errno(errp, errno,
>                                   "cannot bind memory to host NUMA nodes");

Personally I would have found this code clearer if the
check had been "if (backend->policy != MPOL_DEFAULT && ..."
as I had to read quite a few lines to understand that
'maxnode' is zero if-and-only-if policy == MPOL_DEFAULT.

Regardless though, this is functionally correct so

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel
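The equivalence Daniel describes can be checked mechanically. The following is a rough sketch only, not code from `backends/hostmem.c`: it assumes that an earlier validation step rejects any configuration where the policy and the host-nodes bitmap disagree, and every name in it (`config_is_valid`, `guard_as_posted`, `guard_as_suggested`, `check_guards_agree`, the `SKETCH_` constants) is hypothetical. Under that assumption, the posted guard (`maxnode`) and the suggested one (`policy != MPOL_DEFAULT`) select the same configurations.

```c
#include <assert.h>
#include <stdbool.h>

/* Numeric values follow the Linux memory-policy constants; the SKETCH_
 * prefix marks them as local stand-ins rather than the real headers. */
enum { SKETCH_MPOL_DEFAULT = 0, SKETCH_MPOL_BIND = 2 };

/* Assumed invariant: host-nodes is empty (maxnode == 0) exactly when the
 * policy is the default one. */
static bool config_is_valid(int policy, unsigned long maxnode)
{
    return (policy == SKETCH_MPOL_DEFAULT) == (maxnode == 0);
}

/* Guard as posted in the patch: bind only if some host node is set. */
static bool guard_as_posted(unsigned long maxnode)
{
    return maxnode != 0;
}

/* Guard as suggested in review: bind only for a non-default policy. */
static bool guard_as_suggested(int policy)
{
    return policy != SKETCH_MPOL_DEFAULT;
}

/* Walks a small grid of configurations and asserts that both guards agree
 * on every valid one. Returns how many valid configurations were checked. */
static int check_guards_agree(void)
{
    int checked = 0;

    for (int policy = 0; policy <= 3; policy++) {
        for (unsigned long maxnode = 0; maxnode <= 4; maxnode++) {
            if (!config_is_valid(policy, maxnode)) {
                continue;
            }
            assert(guard_as_posted(maxnode) == guard_as_suggested(policy));
            checked++;
        }
    }
    return checked;
}
```

The two guards differ only on configurations that the assumed validation would already have rejected, which is why the choice between them is a readability question rather than a functional one.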
On 5/1/20 10:57 AM, Daniel P. Berrangé wrote:
> On Thu, Apr 30, 2020 at 11:46:06AM -0400, Igor Mammedov wrote:
>> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
>> The backend however calls mbind() which is typically NOP
>> in case of default policy/absent host-nodes bitmap.
>> However when running in container with black-listed mbind()
>> syscall, QEMU fails to start with error
>>  "cannot bind memory to host NUMA nodes: Operation not permitted"
>> even when user hasn't provided host-nodes to pin to explicitly
>> (which is the case with -m option)
>>
>> To fix issue, call mbind() only in case when user has provided
>> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
>> That should allow to run QEMU in containers with black-listed
>> mbind() without memory pinning. If QEMU provided memory-pinning
>> is required user still has to white-list mbind() in container
>> configuration.
>>
>> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>> ---
>> CC: berrange@redhat.com
>> CC: ehabkost@redhat.com
>> CC: pbonzini@redhat.com
>> CC: mhohmann@physnet.uni-hamburg.de
>> CC: qemu-stable@nongnu.org
>> ---
>>  backends/hostmem.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/backends/hostmem.c b/backends/hostmem.c
>> index 327f9eebc3..0efd7b7bd6 100644
>> --- a/backends/hostmem.c
>> +++ b/backends/hostmem.c
>> @@ -383,8 +383,10 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>>          assert(sizeof(backend->host_nodes) >=
>>                 BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
>>          assert(maxnode <= MAX_NODES);
>> -        if (mbind(ptr, sz, backend->policy,
>> -                  maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
>> +
>> +        if (maxnode &&
>> +            mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 1,
>> +                  flags)) {
>>              if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
>>                  error_setg_errno(errp, errno,
>>                                   "cannot bind memory to host NUMA nodes");
>
> Personally I would have found this code clearer if the
> check had been "if (backend->policy != MPOL_DEFAULT && ..."
> as I had to read quite a few lines to understand that
> 'maxnode' is zero if-and-only-if policy == MPOL_DEFAULT.
>
> Regardless though, this is functionally correct so
>
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

I could reproduce the failure by running 'make check-qtest-hppa' on the
qemu:fedora image:

  TEST    check-qtest-hppa: tests/qtest/boot-serial-test
  qemu-system-hppa: cannot bind memory to host NUMA nodes: Operation not permitted
  Broken pipe
  tests/qtest/libqtest.c:166: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
  ERROR - too few tests run (expected 1, got 0)
  make: *** [tests/Makefile.include:637: check-qtest-hppa] Error 1

Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>
> Regards,
> Daniel
>
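The "Operation not permitted" failure in this log is consistent with the error check visible in the quoted hunk: a failed mbind() is tolerated only when the policy is the default one and errno is ENOSYS, so an EPERM from a blocked syscall is always treated as fatal. A minimal sketch of that condition follows. The helper name `mbind_failure_is_fatal` and the `SKETCH_` constants are hypothetical and exist only for illustration; the numeric values are assumed to match the Linux memory-policy constants.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for the Linux policy constants (assumed values). */
enum { SKETCH_MPOL_DEFAULT = 0, SKETCH_MPOL_BIND = 2 };

/* Mirrors the condition from the hunk:
 *     backend->policy != MPOL_DEFAULT || errno != ENOSYS
 * Returns true when a failed bind has to be reported as an error. */
static bool mbind_failure_is_fatal(int policy, int err)
{
    assert(err != 0); /* only meaningful after a call that actually failed */
    return policy != SKETCH_MPOL_DEFAULT || err != ENOSYS;
}
```

With this check, `mbind_failure_is_fatal(SKETCH_MPOL_DEFAULT, ENOSYS)` is false (a kernel without mbind() is tolerated), while `mbind_failure_is_fatal(SKETCH_MPOL_DEFAULT, EPERM)` is true, which matches the start-up failure reported for containers that block the syscall.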
On Thu, Apr 30, 2020 at 11:46:06AM -0400, Igor Mammedov wrote:
> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
> The backend however calls mbind() which is typically NOP
> in case of default policy/absent host-nodes bitmap.
> However when running in container with black-listed mbind()
> syscall, QEMU fails to start with error
>  "cannot bind memory to host NUMA nodes: Operation not permitted"
> even when user hasn't provided host-nodes to pin to explicitly
> (which is the case with -m option)
>
> To fix issue, call mbind() only in case when user has provided
> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
> That should allow to run QEMU in containers with black-listed
> mbind() without memory pinning. If QEMU provided memory-pinning
> is required user still has to white-list mbind() in container
> configuration.
>
> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>

Queued on machine-next, thanks!
Hi Eduardo,

On 5/4/20 5:44 PM, Eduardo Habkost wrote:
> On Thu, Apr 30, 2020 at 11:46:06AM -0400, Igor Mammedov wrote:
>> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
>> The backend however calls mbind() which is typically NOP
>> in case of default policy/absent host-nodes bitmap.
>> However when running in container with black-listed mbind()
>> syscall, QEMU fails to start with error
>>  "cannot bind memory to host NUMA nodes: Operation not permitted"
>> even when user hasn't provided host-nodes to pin to explicitly
>> (which is the case with -m option)
>>
>> To fix issue, call mbind() only in case when user has provided
>> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
>> That should allow to run QEMU in containers with black-listed
>> mbind() without memory pinning. If QEMU provided memory-pinning
>> is required user still has to white-list mbind() in container
>> configuration.
>>
>> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>
> Queued on machine-next, thanks!

I've been debugging this issue again today and figured out that the fix
was not merged yet. If possible, can you add the
"Cc: qemu-stable@nongnu.org" tag before sending your pull request?

Thanks,

Phil.
On Mon, 11 May 2020 18:00:01 +0200
Philippe Mathieu-Daudé <philmd@redhat.com> wrote:

> Hi Eduardo,
>
> On 5/4/20 5:44 PM, Eduardo Habkost wrote:
> > On Thu, Apr 30, 2020 at 11:46:06AM -0400, Igor Mammedov wrote:
> >> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
> >> The backend however calls mbind() which is typically NOP
> >> in case of default policy/absent host-nodes bitmap.
> >> However when running in container with black-listed mbind()
> >> syscall, QEMU fails to start with error
> >>  "cannot bind memory to host NUMA nodes: Operation not permitted"
> >> even when user hasn't provided host-nodes to pin to explicitly
> >> (which is the case with -m option)
> >>
> >> To fix issue, call mbind() only in case when user has provided
> >> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
> >> That should allow to run QEMU in containers with black-listed
> >> mbind() without memory pinning. If QEMU provided memory-pinning
> >> is required user still has to white-list mbind() in container
> >> configuration.
> >>
> >> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
> >> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> >
> > Queued on machine-next, thanks!
>
> I've been debugging this issue again today and figured it was not
> merged, if possible can you add the "Cc: qemu-stable@nongnu.org" tag
> before sending your pull request?

It's CCed already, so my impression was that it would be picked up
once it was reviewed.

>
> Thanks,
>
> Phil.
>
On 5/11/20 9:24 PM, Igor Mammedov wrote:
> On Mon, 11 May 2020 18:00:01 +0200
> Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
>> Hi Eduardo,
>>
>> On 5/4/20 5:44 PM, Eduardo Habkost wrote:
>>> On Thu, Apr 30, 2020 at 11:46:06AM -0400, Igor Mammedov wrote:
>>>> Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
>>>> The backend however calls mbind() which is typically NOP
>>>> in case of default policy/absent host-nodes bitmap.
>>>> However when running in container with black-listed mbind()
>>>> syscall, QEMU fails to start with error
>>>>  "cannot bind memory to host NUMA nodes: Operation not permitted"
>>>> even when user hasn't provided host-nodes to pin to explicitly
>>>> (which is the case with -m option)
>>>>
>>>> To fix issue, call mbind() only in case when user has provided
>>>> host-nodes explicitly (i.e. host_nodes bitmap is not empty).
>>>> That should allow to run QEMU in containers with black-listed
>>>> mbind() without memory pinning. If QEMU provided memory-pinning
>>>> is required user still has to white-list mbind() in container
>>>> configuration.
>>>>
>>>> Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
>>>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>>>
>>> Queued on machine-next, thanks!
>>
>> I've been debugging this issue again today and figured it was not
>> merged, if possible can you add the "Cc: qemu-stable@nongnu.org" tag
>> before sending your pull request?
> It's CCed already, so my impression was that it would be picked up
> once it was reviewed.

Correct, however some distributions find it easier to grep for the
'Cc: qemu-stable@nongnu.org' tag in merged commits before qemu-stable
is released.

>
>>
>> Thanks,
>>
>> Phil.
>>
>
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 327f9eebc3..0efd7b7bd6 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -383,8 +383,10 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
         assert(sizeof(backend->host_nodes) >=
                BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
         assert(maxnode <= MAX_NODES);
-        if (mbind(ptr, sz, backend->policy,
-                  maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
+
+        if (maxnode &&
+            mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 1,
+                  flags)) {
             if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
                 error_setg_errno(errp, errno,
                                  "cannot bind memory to host NUMA nodes");
Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
The backend however calls mbind() which is typically NOP
in case of default policy/absent host-nodes bitmap.
However when running in container with black-listed mbind()
syscall, QEMU fails to start with error
 "cannot bind memory to host NUMA nodes: Operation not permitted"
even when user hasn't provided host-nodes to pin to explicitly
(which is the case with -m option)

To fix issue, call mbind() only in case when user has provided
host-nodes explicitly (i.e. host_nodes bitmap is not empty).
That should allow to run QEMU in containers with black-listed
mbind() without memory pinning. If QEMU provided memory-pinning
is required user still has to white-list mbind() in container
configuration.

Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
CC: berrange@redhat.com
CC: ehabkost@redhat.com
CC: pbonzini@redhat.com
CC: mhohmann@physnet.uni-hamburg.de
CC: qemu-stable@nongnu.org
---
 backends/hostmem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
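
The behaviour described in the commit message can be illustrated with a small standalone sketch. This is not the code from `backends/hostmem.c`: the real mbind() call is replaced by a function pointer so the example runs without NUMA headers, `MAX_NODES` is set to an assumed value of 128, and `bind_fn`, `deny_bind`, `highest_node_plus_one` and `bind_if_requested` are hypothetical names introduced only for illustration. `deny_bind` simulates a container in which the syscall is blocked.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define MAX_NODES 128 /* assumed limit for this sketch */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
/* Number of words needed to hold MAX_NODES + 1 bits. */
#define NODE_WORDS ((MAX_NODES + BITS_PER_LONG) / BITS_PER_LONG)

/* Hypothetical stand-in for the mbind() syscall. */
typedef int (*bind_fn)(const unsigned long *nodemask, unsigned long maxnode);

static int denied_calls; /* how often the blocked "syscall" was attempted */

/* Simulates a blocked mbind(): always fails with EPERM. */
static int deny_bind(const unsigned long *nodemask, unsigned long maxnode)
{
    (void)nodemask;
    (void)maxnode;
    denied_calls++;
    errno = EPERM;
    return -1;
}

/* Returns the index of the highest set bit plus one, or 0 if the bitmap
 * is empty (no host nodes were requested). */
static unsigned long highest_node_plus_one(const unsigned long *nodes)
{
    for (size_t bit = MAX_NODES; bit-- > 0; ) {
        if (nodes[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG))) {
            return bit + 1;
        }
    }
    return 0;
}

/* Attempts the bind only when the user supplied host nodes; with an empty
 * bitmap the syscall is skipped entirely, so a blocked mbind() is harmless. */
static int bind_if_requested(const unsigned long *nodes, bind_fn do_bind)
{
    unsigned long maxnode = highest_node_plus_one(nodes);

    assert(do_bind != NULL);
    if (maxnode && do_bind(nodes, maxnode + 1)) {
        return -1; /* binding was requested and failed */
    }
    return 0; /* nothing to bind, or binding succeeded */
}
```

With an all-zero bitmap, `bind_if_requested(nodes, deny_bind)` returns 0 without ever invoking `deny_bind`, which corresponds to starting with plain `-m`. Once a bit is set, the call is attempted and the EPERM failure surfaces, matching the note that explicit pinning still requires mbind() to be permitted in the container configuration.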