Message ID | 20190327095904.18595-2-otubo@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [PULL,1/2] seccomp: don't kill process for resource control syscalls | expand |
On Wed, Mar 27, 2019 at 10:59:03AM +0100, Eduardo Otubo wrote: > From: Daniel P. Berrangé <berrange@redhat.com> > > The Mesa library tries to set process affinity on some of its threads in > order to optimize its performance. Currently this results in QEMU being > immediately terminated when seccomp is enabled. > > Mesa doesn't consider failure of the process affinity settings to be > fatal to its operation, but our seccomp policy gives it no choice in > gracefully handling this denial. > > It is reasonable to consider that malicious code using the resource > control syscalls to be a less serious attack than if they were trying > to spawn processes or change UIDs and other such things. Generally > speaking changing the resource control setting will "merely" affect > quality of service of processes on the host. With this in mind, rather > than kill the process, we can relax the policy for these syscalls to > return the EPERM errno value. This allows callers to detect that QEMU > does not want them to change resource allocations, and apply some > reasonable fallback logic. > > The main downside to this is for code which uses these syscalls but does > not check the return value, blindly assuming they will always > succeeed. Returning an errno could result in sub-optimal behaviour. > Arguably though such code is already broken & needs fixing regardless. > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> > Acked-by: Eduardo Otubo <otubo@redhat.com> Normally the person sending the pull request should be adding a Signed-off-by line, not an Acked-by line, as Acked-by doesn't have any meaning wrt to the DCO. IIUC, we don't really use Acked-by in QEMU. Only case I think it would be used is where a maintainer is giving their approval for a patch to be sent someone else's tree. eg if a seccomp patch had to merge via a block maintainer tree for some reason, then you could give an Acked-by to indicate you are ok with that going via the different tree. Regards, Daniel
diff --git a/qemu-seccomp.c b/qemu-seccomp.c index 36d5829831..cf520883c7 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -121,20 +121,37 @@ qemu_seccomp(unsigned int operation, unsigned int flags, void *args) #endif } -static uint32_t qemu_seccomp_get_kill_action(void) +static uint32_t qemu_seccomp_get_action(int set) { + switch (set) { + case QEMU_SECCOMP_SET_DEFAULT: + case QEMU_SECCOMP_SET_OBSOLETE: + case QEMU_SECCOMP_SET_PRIVILEGED: + case QEMU_SECCOMP_SET_SPAWN: { #if defined(SECCOMP_GET_ACTION_AVAIL) && defined(SCMP_ACT_KILL_PROCESS) && \ defined(SECCOMP_RET_KILL_PROCESS) - { - uint32_t action = SECCOMP_RET_KILL_PROCESS; + static int kill_process = -1; + if (kill_process == -1) { + uint32_t action = SECCOMP_RET_KILL_PROCESS; - if (qemu_seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &action) == 0) { + if (qemu_seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &action) == 0) { + kill_process = 1; + } + kill_process = 0; + } + if (kill_process == 1) { return SCMP_ACT_KILL_PROCESS; } - } #endif + return SCMP_ACT_TRAP; + } + + case QEMU_SECCOMP_SET_RESOURCECTL: + return SCMP_ACT_ERRNO(EPERM); - return SCMP_ACT_TRAP; + default: + g_assert_not_reached(); + } } @@ -143,7 +160,6 @@ static int seccomp_start(uint32_t seccomp_opts) int rc = 0; unsigned int i = 0; scmp_filter_ctx ctx; - uint32_t action = qemu_seccomp_get_kill_action(); ctx = seccomp_init(SCMP_ACT_ALLOW); if (ctx == NULL) { @@ -157,10 +173,12 @@ static int seccomp_start(uint32_t seccomp_opts) } for (i = 0; i < ARRAY_SIZE(blacklist); i++) { + uint32_t action; if (!(seccomp_opts & blacklist[i].set)) { continue; } + action = qemu_seccomp_get_action(blacklist[i].set); rc = seccomp_rule_add_array(ctx, action, blacklist[i].num, blacklist[i].narg, blacklist[i].arg_cmp); if (rc < 0) {