Message ID | 20181230132856.24095-3-jlee@suse.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | sysfs: Add hook for checking the file capability of opener | expand |
On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote: > The wake lock/unlock sysfs interfaces check that the writer must has > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed > by opening sysfs file within an unprivileged process and then writing > the file within a privileged process. The tricking way has been exposed > by Andy Lutomirski in CVE-2013-1959. Don't you mean "open by privileged and then written by unprivileged?" Or if not, exactly how is this a problem? You check the capabilities when you do the write and if that is not allowed then, well And you are checking the namespace of the person trying to do the write when the write happens, which is correct here, right? If you really want to mess with wake locks in a namespaced environment, then put it in a real namespaced environment, which is {HUGE HINT} not sysfs. So no, this patch isn't ok... thanks, greg k-h
Hi Greg, On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote: > On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote: > > The wake lock/unlock sysfs interfaces check that the writer must has > > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed > > by opening sysfs file within an unprivileged process and then writing > > the file within a privileged process. The tricking way has been exposed > > by Andy Lutomirski in CVE-2013-1959. > > Don't you mean "open by privileged and then written by unprivileged?" > Or if not, exactly how is this a problem? You check the capabilities > when you do the write and if that is not allowed then, well > Sorry for I didn't provide clear explanation. The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission has already relaxed for non-root user. Then the expected behavior is that non-root process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs. But, the CAP_BLOCK_SUSPEND restrict can be bypassed: int main(int argc, char* argv[]) { int fd, ret = 0; fd = open("/sys/power/wake_lock", O_RDWR); if (fd < 0) err(1, "open wake_lock"); if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock err(1, "dup2"); sleep(1); execl("./string", "string"); //string has capability return ret; } This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened wake_lock sysfs and overwrited stdout. Then it executes the "string" program that has CAP_BLOCK_SUSPEND. The string program writes to stdout, which means that it writes to wake_lock. So an unpriviledged opener can trick an priviledged writer for writing sysfs. > And you are checking the namespace of the person trying to do the write > when the write happens, which is correct here, right? > > If you really want to mess with wake locks in a namespaced environment, > then put it in a real namespaced environment, which is {HUGE HINT} not > sysfs. > I don't want to mess with wake locks in namespace. > So no, this patch isn't ok... > Thanks a lot! Joey Lee
On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote: > Hi Greg, > > On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote: > > On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote: > > > The wake lock/unlock sysfs interfaces check that the writer must has > > > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed > > > by opening sysfs file within an unprivileged process and then writing > > > the file within a privileged process. The tricking way has been exposed > > > by Andy Lutomirski in CVE-2013-1959. > > > > Don't you mean "open by privileged and then written by unprivileged?" > > Or if not, exactly how is this a problem? You check the capabilities > > when you do the write and if that is not allowed then, well > > > > Sorry for I didn't provide clear explanation. > > The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission > has already relaxed for non-root user. Then the expected behavior is that non-root > process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs. > > But, the CAP_BLOCK_SUSPEND restrict can be bypassed: > > int main(int argc, char* argv[]) > { > int fd, ret = 0; > > fd = open("/sys/power/wake_lock", O_RDWR); > if (fd < 0) > err(1, "open wake_lock"); > > if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock > err(1, "dup2"); > sleep(1); > execl("./string", "string"); //string has capability > > return ret; > } > > This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened > wake_lock sysfs and overwrited stdout. Then it executes the "string" program > that has CAP_BLOCK_SUSPEND. That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to "string". If any user can run that program, there's nothing the kernel can do about this, right? Just don't allow that program on the system :) > The string program writes to stdout, which means that it writes to > wake_lock. So an unpriviledged opener can trick an priviledged writer > for writing sysfs. That sounds like a userspace program that was somehow given incorrect rights by the admin, and a user is taking advantage of it. That's not the kernel's fault. > > And you are checking the namespace of the person trying to do the write > > when the write happens, which is correct here, right? > > > > If you really want to mess with wake locks in a namespaced environment, > > then put it in a real namespaced environment, which is {HUGE HINT} not > > sysfs. > > > > I don't want to mess with wake locks in namespace. Neither do I :) so all should be fine, don't allow crazy executables with odd permissions to be run by any user and you should be fine. That's always been the case, right? thanks, greg k-h
On Mon, Dec 31, 2018 at 11:41 AM Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote: > > Hi Greg, > > > > On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote: > > > On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote: > > > > The wake lock/unlock sysfs interfaces check that the writer must has > > > > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed > > > > by opening sysfs file within an unprivileged process and then writing > > > > the file within a privileged process. The tricking way has been exposed > > > > by Andy Lutomirski in CVE-2013-1959. > > > > > > Don't you mean "open by privileged and then written by unprivileged?" > > > Or if not, exactly how is this a problem? You check the capabilities > > > when you do the write and if that is not allowed then, well > > > > > > > Sorry for I didn't provide clear explanation. > > > > The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission > > has already relaxed for non-root user. Then the expected behavior is that non-root > > process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs. > > > > But, the CAP_BLOCK_SUSPEND restrict can be bypassed: > > > > int main(int argc, char* argv[]) > > { > > int fd, ret = 0; > > > > fd = open("/sys/power/wake_lock", O_RDWR); > > if (fd < 0) > > err(1, "open wake_lock"); > > > > if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock > > err(1, "dup2"); > > sleep(1); > > execl("./string", "string"); //string has capability > > > > return ret; > > } > > > > This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened > > wake_lock sysfs and overwrited stdout. Then it executes the "string" program > > that has CAP_BLOCK_SUSPEND. > > That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to > "string". If any user can run that program, there's nothing the kernel > can do about this, right? Just don't allow that program on the system :) > > > The string program writes to stdout, which means that it writes to > > wake_lock. So an unpriviledged opener can trick an priviledged writer > > for writing sysfs. > > That sounds like a userspace program that was somehow given incorrect > rights by the admin, and a user is taking advantage of it. That's not > the kernel's fault. Isn't it? Pretty much any setuid program will write to stdout or stderr; even the glibc linker code does so if you set LD_DEBUG. (Normally the output isn't entirely attacker-controlled, but it is in the case of stuff like "procmail", which I think Debian still ships as setuid root.) setuid programs should always be able to safely call read() and write() on caller-provided file descriptors. Also, you're supposed to be able to receive file descriptors over unix domain sockets and then write to them without trusting the sender. Basically, the ->read and ->write VFS handlers should never look at the caller's credentials, only the opener's (with the exception of LSMs, which tend to do weird things to the system's security model).
On Mon, Dec 31, 2018 at 01:02:35PM +0100, Jann Horn wrote: > On Mon, Dec 31, 2018 at 11:41 AM Greg Kroah-Hartman > <gregkh@linuxfoundation.org> wrote: > > > > On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote: > > > Hi Greg, > > > > > > On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote: > > > > On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote: > > > > > The wake lock/unlock sysfs interfaces check that the writer must has > > > > > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed > > > > > by opening sysfs file within an unprivileged process and then writing > > > > > the file within a privileged process. The tricking way has been exposed > > > > > by Andy Lutomirski in CVE-2013-1959. > > > > > > > > Don't you mean "open by privileged and then written by unprivileged?" > > > > Or if not, exactly how is this a problem? You check the capabilities > > > > when you do the write and if that is not allowed then, well > > > > > > > > > > Sorry for I didn't provide clear explanation. > > > > > > The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission > > > has already relaxed for non-root user. Then the expected behavior is that non-root > > > process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs. > > > > > > But, the CAP_BLOCK_SUSPEND restrict can be bypassed: > > > > > > int main(int argc, char* argv[]) > > > { > > > int fd, ret = 0; > > > > > > fd = open("/sys/power/wake_lock", O_RDWR); > > > if (fd < 0) > > > err(1, "open wake_lock"); > > > > > > if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock > > > err(1, "dup2"); > > > sleep(1); > > > execl("./string", "string"); //string has capability > > > > > > return ret; > > > } > > > > > > This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened > > > wake_lock sysfs and overwrited stdout. Then it executes the "string" program > > > that has CAP_BLOCK_SUSPEND. > > > > That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to > > "string". If any user can run that program, there's nothing the kernel > > can do about this, right? Just don't allow that program on the system :) > > > > > The string program writes to stdout, which means that it writes to > > > wake_lock. So an unpriviledged opener can trick an priviledged writer > > > for writing sysfs. > > > > That sounds like a userspace program that was somehow given incorrect > > rights by the admin, and a user is taking advantage of it. That's not > > the kernel's fault. > > Isn't it? Pretty much any setuid program will write to stdout or > stderr; even the glibc linker code does so if you set LD_DEBUG. > (Normally the output isn't entirely attacker-controlled, but it is in > the case of stuff like "procmail", which I think Debian still ships as > setuid root.) setuid programs should always be able to safely call > read() and write() on caller-provided file descriptors. Also, you're > supposed to be able to receive file descriptors over unix domain > sockets and then write to them without trusting the sender. Basically, > the ->read and ->write VFS handlers should never look at the caller's > credentials, only the opener's (with the exception of LSMs, which tend > to do weird things to the system's security model). So a root program gets the file handle to the sysfs file and then passes it off to a setuid program and the kernel should somehow protect from this? I think that any sysfs file that is relying on the capable() check should just set their permissions properly first, and then it should be ok. thanks, greg k-h
> On Dec 31, 2018, at 5:33 AM, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > >> On Mon, Dec 31, 2018 at 01:02:35PM +0100, Jann Horn wrote: >> On Mon, Dec 31, 2018 at 11:41 AM Greg Kroah-Hartman >> <gregkh@linuxfoundation.org> wrote: >>> >>>> On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote: >>>> Hi Greg, >>>> >>>>> On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote: >>>>>> On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote: >>>>>> The wake lock/unlock sysfs interfaces check that the writer must has >>>>>> CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed >>>>>> by opening sysfs file within an unprivileged process and then writing >>>>>> the file within a privileged process. The tricking way has been exposed >>>>>> by Andy Lutomirski in CVE-2013-1959. >>>>> >>>>> Don't you mean "open by privileged and then written by unprivileged?" >>>>> Or if not, exactly how is this a problem? You check the capabilities >>>>> when you do the write and if that is not allowed then, well >>>>> >>>> >>>> Sorry for I didn't provide clear explanation. >>>> >>>> The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission >>>> has already relaxed for non-root user. Then the expected behavior is that non-root >>>> process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs. >>>> >>>> But, the CAP_BLOCK_SUSPEND restrict can be bypassed: >>>> >>>> int main(int argc, char* argv[]) >>>> { >>>> int fd, ret = 0; >>>> >>>> fd = open("/sys/power/wake_lock", O_RDWR); >>>> if (fd < 0) >>>> err(1, "open wake_lock"); >>>> >>>> if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock >>>> err(1, "dup2"); >>>> sleep(1); >>>> execl("./string", "string"); //string has capability >>>> >>>> return ret; >>>> } >>>> >>>> This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened >>>> wake_lock sysfs and overwrited stdout. Then it executes the "string" program >>>> that has CAP_BLOCK_SUSPEND. >>> >>> That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to >>> "string". If any user can run that program, there's nothing the kernel >>> can do about this, right? Just don't allow that program on the system :) >>> >>>> The string program writes to stdout, which means that it writes to >>>> wake_lock. So an unpriviledged opener can trick an priviledged writer >>>> for writing sysfs. >>> >>> That sounds like a userspace program that was somehow given incorrect >>> rights by the admin, and a user is taking advantage of it. That's not >>> the kernel's fault. >> >> Isn't it? Pretty much any setuid program will write to stdout or >> stderr; even the glibc linker code does so if you set LD_DEBUG. >> (Normally the output isn't entirely attacker-controlled, but it is in >> the case of stuff like "procmail", which I think Debian still ships as >> setuid root.) setuid programs should always be able to safely call >> read() and write() on caller-provided file descriptors. Also, you're >> supposed to be able to receive file descriptors over unix domain >> sockets and then write to them without trusting the sender. Basically, >> the ->read and ->write VFS handlers should never look at the caller's >> credentials, only the opener's (with the exception of LSMs, which tend >> to do weird things to the system's security model). > > So a root program gets the file handle to the sysfs file and then passes > it off to a setuid program and the kernel should somehow protect from > this? Yes, the kernel should. If the kernel wants to check caps, it should do it right. Calling capable() from a .write handler is wrong, even in sysfs. > > I think that any sysfs file that is relying on the capable() check > should just set their permissions properly first, and then it should be > ok. > Probably true. > thanks, > > greg k-h
diff --git a/kernel/power/main.c b/kernel/power/main.c index 98e76cad128b..265199efedc1 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -661,6 +661,11 @@ static ssize_t wake_lock_store(struct kobject *kobj, return error ? error : n; } +static bool wake_lock_store_file_capable(const struct file *file) +{ + return file_ns_capable(file, &init_user_ns, CAP_BLOCK_SUSPEND); +} + power_attr(wake_lock); static ssize_t wake_unlock_show(struct kobject *kobj, @@ -678,6 +683,11 @@ static ssize_t wake_unlock_store(struct kobject *kobj, return error ? error : n; } +static bool wake_unlock_store_file_capable(const struct file *file) +{ + return file_ns_capable(file, &init_user_ns, CAP_BLOCK_SUSPEND); +} + power_attr(wake_unlock); #endif /* CONFIG_PM_WAKELOCKS */ @@ -803,6 +813,10 @@ static int __init pm_init(void) power_kobj = kobject_create_and_add("power", NULL); if (!power_kobj) return -ENOMEM; +#ifdef CONFIG_PM_WAKELOCKS + wake_lock_attr.store_file_capable = wake_lock_store_file_capable; + wake_unlock_attr.store_file_capable = wake_unlock_store_file_capable; +#endif error = sysfs_create_group(power_kobj, &attr_group); if (error) return error; diff --git a/kernel/power/wakelock.c b/kernel/power/wakelock.c index 4210152e56f0..52a4cfe55eb5 100644 --- a/kernel/power/wakelock.c +++ b/kernel/power/wakelock.c @@ -205,9 +205,6 @@ int pm_wake_lock(const char *buf) size_t len; int ret = 0; - if (!capable(CAP_BLOCK_SUSPEND)) - return -EPERM; - while (*str && !isspace(*str)) str++; @@ -251,9 +248,6 @@ int pm_wake_unlock(const char *buf) size_t len; int ret = 0; - if (!capable(CAP_BLOCK_SUSPEND)) - return -EPERM; - len = strlen(buf); if (!len) return -EINVAL;
The wake lock/unlock sysfs interfaces check that the writer must has CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed by opening sysfs file within an unprivileged process and then writing the file within a privileged process. The tricking way has been exposed by Andy Lutomirski in CVE-2013-1959. Here is an example that it's a simplified version from CVE-2013-1959 to bypass the capability checking of wake_lock sysfs: int main(int argc, char* argv[]) { int fd, ret = 0; fd = open("/sys/power/wake_lock", O_RDWR); if (fd < 0) err(1, "open wake_lock"); if (dup2(fd, 1) != 1) err(1, "dup2"); sleep(1); execl("./string", "string"); return ret; } The string is a privileged program that it can be used to write string to wake_lock interface. The main unprivileged process opens the sysfs interface and overwrites stdout. So the privileged process will write to wake_lock. This patch implemented the callback of file capable checking hook in kobject attribute level. It will check the opener's capability. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Chen Yu <yu.c.chen@intel.com> Cc: Giovanni Gherdovich <ggherdovich@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <len.brown@intel.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Joe Perches <joe@perches.com> Cc: Bart Van Assche <bvanassche@acm.org> Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com> --- kernel/power/main.c | 14 ++++++++++++++ kernel/power/wakelock.c | 6 ------ 2 files changed, 14 insertions(+), 6 deletions(-)