[RFC] A method to prevent privilege escalation

Message ID D6DDBEC5E2802849A9493717C9A419C3E5343C1C@GSjpTK1DCembx17.service.hitachi.net (mailing list archive)
State New, archived

Commit Message

中村雄一 / NAKAMURA,YUUICHI Sept. 22, 2017, 2:49 a.m. UTC
Hi.

As we said at Linux Security Summit 2017,
we would like to post a patch to prevent privilege escalation attacks.

The concept is here:
http://events.linuxfoundation.org/sites/events/files/slides/nakamura_20170831_1.pdf

This is still a work in progress, and feedback is welcome.
The patch below works with linux-4.4.0.
To see that it works (try it in a safe place!):
 * Build a vulnerable kernel on Ubuntu 16.04.1
   source: https://launchpad.net/ubuntu/+source/linux/4.4.0-62.83
   Please enable "CONFIG_AKO" in the kernel config.
 * Run PoC code for a kernel vulnerability
   https://github.com/xairy/kernel-exploits/blob/master/CVE-2017-6074/poc.c
 * Look at the kernel log; it should show that the attack was detected, like below:
 AKO: detected unauthorized change of UID. syscall=45 original: uid=1000, euid=1000, fsuid=1000, suid=1000 attempt: uid=0, euid=0, fsuid=0, suid=0
 AKO: detected unauthorized change of gid. syscall=45 original: gid=1000, egid=1000, fsgid=1000, sgid=1000 attempt: gid=0, egid=0, fsgid=0, sgid=0

Regards,
Yuichi Nakamura, Hitachi, Ltd.
Toshihiro Yamauchi, Okayama University

Comments

Eric Biggers Sept. 22, 2017, 7:30 a.m. UTC | #1
On Fri, Sep 22, 2017 at 02:49:59AM +0000, 中村雄一 / NAKAMURA,YUUICHI wrote:
> Hi.
> 
> As we said in Linux Security Summit 2017, 
> we would like to post a patch to prevent privilege escalation attack.
> 
> The concept is here:
> http://events.linuxfoundation.org/sites/events/files/slides/nakamura_20170831_1.pdf
> 
> This work is still work in progress and feedback is welcomed.
> Below patch works for linux-4.4.0, 
> To see that it works (try it in a safe place!), 
>  * build vulnerable kernel on Ubuntu 16.04.1 
>    source: https://launchpad.net/ubuntu/+source/linux/4.4.0-62.83
>    please enable "CONFIG_AKO" in kernel config.
>  * try a poc code for kernel vulnerability
>    https://github.com/xairy/kernel-exploits/blob/master/CVE-2017-6074/poc.c
>  * look at kernel log, you can see a log that it detected attack like below:
>  AKO: detected unauthorized change of UID. syscall=45 original: uid=1000, euid=1000, fsuid=1000, suid=1000 attempt: uid=0, euid=0, fsuid=0, suid=0
>  AKO: detected unauthorized change of gid. syscall=45 original: gid=1000, egid=1000, fsgid=1000, sgid=1000 attempt: gid=0, egid=0, fsgid=0, sgid=0
> 

You mean a method to *allow* privilege escalation?  This makes stack overflows
exploitable again, and trivially so.  Not to mention that adding one line of
code to the POC to set ->ako_sysnum circumvents this entire "security" feature.

The only way this would be even somewhat sane is if the copy of the credentials
were mapped read-only.  But even then, it wouldn't protect against exploits that
already gain arbitrary code execution (such as your example exploit) --- it
would only protect against exploits that somehow modify their cred struct
directly.  And in that case keeping the cred structs mapped readonly would
probably make more sense...

Jann Horn Sept. 22, 2017, 7:57 a.m. UTC | #2
On Fri, Sep 22, 2017 at 4:49 AM, 中村雄一 / NAKAMURA,YUUICHI
<yuichi.nakamura.fe@hitachi.com> wrote:
> Hi.
>
> As we said in Linux Security Summit 2017,
> we would like to post a patch to prevent privilege escalation attack.
>
> The concept is here:
> http://events.linuxfoundation.org/sites/events/files/slides/nakamura_20170831_1.pdf

I believe that the basic concept behind this patch is flawed for the
following reasons:

You are only protecting a tiny subset of the pieces of data in the
kernel that can be used to gain heightened privileges. The syscall
return frame of a setuid task, userspace code in the directmap area,
the uid_map in the user namespace, the credentials structure of
another task, the owner and mode of pretty much any inode and so on
are all interesting targets for an overwrite.

And yes, "commit_creds(prepare_kernel_cred(0))" is a nice and easy
trick if the attacker already has arbitrary code execution in ring 0,
but at that point, you've already lost anyway.
Look at the exploit you linked to: At the point where your mitigation
tries to stop the attack, the attacker has already turned off SMAP and
SMEP and has executed arbitrary code in ring 0. Anything you try to do
after that point is completely useless.

> This work is still work in progress and feedback is welcomed.
> Below patch works for linux-4.4.0,
> To see that it works (try it in a safe place!),
>  * build vulnerable kernel on Ubuntu 16.04.1
>    source: https://launchpad.net/ubuntu/+source/linux/4.4.0-62.83
>    please enable "CONFIG_AKO" in kernel config.
>  * try a poc code for kernel vulnerability
>    https://github.com/xairy/kernel-exploits/blob/master/CVE-2017-6074/poc.c
>  * look at kernel log, you can see a log that it detected attack like below:
>  AKO: detected unauthorized change of UID. syscall=45 original: uid=1000, euid=1000, fsuid=1000, suid=1000 attempt: uid=0, euid=0, fsuid=0, suid=0
>  AKO: detected unauthorized change of gid. syscall=45 original: gid=1000, egid=1000, fsgid=1000, sgid=1000 attempt: gid=0, egid=0, fsgid=0, sgid=0

Showing that a mitigation stops an exploit does not demonstrate that
it is a good mitigation. After all, an attacker who is attacking a
system with the mitigation applied would probably write the exploit
differently, designed to bypass the mitigation.


Some comments on details of the patch are inline.


> --- linux-4.4.0-62-83.orig/kernel/ako.c 1970-01-01 09:00:00.000000000 +0900
> +++ linux-4.4.0/kernel/ako.c    2017-07-03 23:06:54.068000000 +0900
[...]
> +void AKO_save_creds(struct ako_struct * ako_cred, int ako_sysnum)
> +{
> +
> +        /*Save credential information to be observed */
> +        /*UID and GID*/
> +        ako_cred->ako_uid = current->cred->uid.val;
> +        ako_cred->ako_euid = current->cred->euid.val;
> +        ako_cred->ako_fsuid = current->cred->fsuid.val;
> +        ako_cred->ako_suid = current->cred->suid.val;
> +        ako_cred->ako_gid = current->cred->gid.val;
> +        ako_cred->ako_egid = current->cred->egid.val;
> +        ako_cred->ako_fsgid = current->cred->fsgid.val;
> +        ako_cred->ako_sgid = current->cred->sgid.val;
> +        /*Capability*/
> +        ako_cred->ako_inheritable[0] = current->cred->cap_inheritable.cap[0];
> +        ako_cred->ako_inheritable[1] = current->cred->cap_inheritable.cap[1];
> +        ako_cred->ako_permitted[0] = current->cred->cap_permitted.cap[0];
> +        ako_cred->ako_permitted[1] = current->cred->cap_permitted.cap[1];
> +        ako_cred->ako_effective[0] = current->cred->cap_effective.cap[0];
> +        ako_cred->ako_effective[1] = current->cred->cap_effective.cap[1];
> +        ako_cred->ako_bset[0] = current->cred->cap_bset.cap[0];
> +        ako_cred->ako_bset[1] = current->cred->cap_bset.cap[1];
> +
> +        return;
> +}
> +
> +/*copy from sys.c*/
> +static int set_user(struct cred *new)
> +{
> +       struct user_struct *new_user;
> +
> +       new_user = alloc_uid(new->uid);
> +       if (!new_user)
> +               return -EAGAIN;
> +
> +       /*
> +        * We don't fail in case of NPROC limit excess here because too many
> +        * poorly written programs don't check set*uid() return code, assuming
> +        * it never fails if called by root.  We may still enforce NPROC limit
> +        * for programs doing set*uid()+execve() by harmlessly deferring the
> +        * failure to the execve() stage.
> +        */
> +       if (atomic_read(&new_user->processes) >= rlimit(RLIMIT_NPROC) &&
> +                       new_user != INIT_USER)
> +               current->flags |= PF_NPROC_EXCEEDED;
> +       else
> +               current->flags &= ~PF_NPROC_EXCEEDED;
> +
> +       free_uid(new->user);
> +       new->user = new_user;
> +       return 0;
> +}

Can you describe why exactly you need this here?

> +static int AKO_restore_uids(struct ako_struct * ako_cred)
> +{
> +        struct cred *new;
> +        struct user_namespace *ns = current_user_ns();
> +        kuid_t uid;
> +        kuid_t suid;
> +        kuid_t euid;
> +        kuid_t fsuid;
> +       kernel_cap_t effective, permitted;
> +
> +       new = prepare_creds();
> +        if (!new)
> +                return -ENOMEM;
> +
> +        uid = make_kuid(ns, ako_cred->ako_uid);
> +        if (!uid_valid(uid))
> +                return -EINVAL;
> +        suid = make_kuid(ns, ako_cred->ako_suid);
> +        if (!uid_valid(suid))
> +                return -EINVAL;
> +        euid = make_kuid(ns, ako_cred->ako_euid);
> +        if (!uid_valid(euid))
> +                return -EINVAL;
> +        fsuid = make_kuid(ns, ako_cred->ako_fsuid);
> +        if (!uid_valid(fsuid))
> +                return -EINVAL;
> +        new->uid = uid;
> +        new->suid = suid;
> +        new->euid = euid;
> +        new->fsuid = fsuid;

This is wrong. AKO_save_creds copies raw kernel UIDs into ako_cred,
but this code passes those raw kernel UIDs to make_kuid(), which
assumes that the input is a namespaced UID as seen in userspace.


> --- linux-4.4.0-62-83.orig/arch/x86/entry/entry_64.S    2017-06-18 14:34:04.008000000 +0900
> +++ linux-4.4.0/arch/x86/entry/entry_64.S       2017-07-01 23:07:43.824000000 +0900
> @@ -182,9 +182,43 @@
>  #endif
>         ja      1f                              /* return -ENOSYS (already in pt_regs->ax) */
>         movq    %r10, %rcx
> +/*
> + * Additional Kernel Observer (AKO)
> + * Copyright (c) 2017 Okayama-University
> + *     Yohei Akao, Yamauchi Laboratory, Okayama University
> + */
> +       subq    $6144,%rsp /*Allocate area in stack to save credential information*/
> +       ALLOC_PT_GPREGS_ON_STACK
> +       SAVE_C_REGS
> +       SAVE_EXTRA_REGS
> +       leaq    15*8(%rsp), %rdi /* size of SAVE_C_REGS and size of SAVE_EXTRA_REGS is added to rsp, and start address of allocated area is saved in %rdi*/
> +       movq    %rax, %rsi /* Syscall number(%rax) is saved in %rsi */
> +       call AKO_before /*credential information is saved*/
> +       RESTORE_EXTRA_REGS
> +       RESTORE_C_REGS
> +       REMOVE_PT_GPREGS_FROM_STACK
> +       addq    $6144,%rsp /*Allocate area in stack to save credential information*/
> +       /*end of AKO*/
>         call    *sys_call_table(, %rax, 8)
> +/*
> + * Additional Kernel Observer (AKO)
> + * Copyright (c) 2017 Okayama-University
> + *     Yohei Akao, Yamauchi Laboratory, Okayama University
> + */
> +       /*Start of AKO*/
> +       subq    $6144,%rsp

What is going on here? You're allocating a stack area that is then
passed to AKO_after(), which reads from it?
Are you trying to store information at the bottom of the stack in
AKO_before() and then read it back in AKO_after()?

> +       ALLOC_PT_GPREGS_ON_STACK
> +       SAVE_C_REGS
> +       SAVE_EXTRA_REGS
> +       leaq    15*8(%rsp), %rdi
> +       call AKO_after
> +       RESTORE_EXTRA_REGS
> +       RESTORE_C_REGS
> +       REMOVE_PT_GPREGS_FROM_STACK
> +       /*Free area to store credential infomation*/
> +       addq    $6144,%rsp
> +       /*End of AKO*/
>         movq    %rax, RAX(%rsp)
> -1:
>  /*
>   * Syscall return path ending with SYSRET (fast path).
>   * Has incompletely filled pt_regs.

This is the 64-bit syscall entry fastpath. As far as I can tell, this
is the only place where you're calling AKO_before() and AKO_after().
This fastpath is not used if either compat (32-bit) syscalls are used
or the slowpath has to be used, e.g. because a seccomp filter is
active.
So a simple trick to get past this check is probably to do this before
the main attack:

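#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* unused prctl() arguments must be zero for PR_SET_NO_NEW_PRIVS */
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
struct sock_filter filter[] = {
  BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW)
};
struct sock_fprog prog = {
  .len = (unsigned short)(sizeof(filter)/sizeof(filter[0])),
  .filter = filter
};
/* seccomp(operation, flags, args); no glibc wrapper, so use syscall(2) */
syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER, 0, &prog);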

Solar Designer Sept. 30, 2017, 10 p.m. UTC | #3
Hi,

On Fri, Sep 22, 2017 at 02:49:59AM +0000, 中村雄一 / NAKAMURA,YUUICHI wrote:
> As we said in Linux Security Summit 2017, 
> we would like to post a patch to prevent privilege escalation attack.
> 
> The concept is here:
> http://events.linuxfoundation.org/sites/events/files/slides/nakamura_20170831_1.pdf

We were not aware of your work, but FWIW we happen to have similar
functionality, also in early development/testing stage, in LKRG by Adam
Zabrocki (who is CC'ed on this reply, so he might chime in):

http://openwall.info/wiki/p_lkrg/Main#Exploit-Detection

LKRG is not publicly available yet, but it probably will be soon.

Unlike your implementation, LKRG is a kernel module, not a patch.
LKRG can be built for and loaded on top of a wide range of mainline and
distros' kernels, without needing to patch those.  There are other
implementation differences as well - indeed, hopefully we are not
actually introducing vulnerabilities with our code, although the risk is
always there, so it would need to be justified.  And this brings us to:

Frankly, we're skeptical about the value that this approach provides.
Much of the value could be in diversity - if most systems do not have
this sort of runtime checks, then canned exploits and even some capable
human attackers would not care or know to try and bypass the checks.
However, if this functionality becomes standard, so will the bypasses.
Now, are there cases (such as specific kernel vulnerabilities) where a
bypass is not possible, not practical, not reliable, or would complicate
the exploit so much it becomes less generic and/or less reliable?
Perhaps yes, and this would mean there's some value in this approach
even if it becomes standard.  Another option may be to have a lighter
standard version and a heavier and more extensive limited distribution
version (which could also be a way to receive funding for the project).

Speaking of addressing a sensible threat model, I suggested to Adam that
we add validation of credentials right after the kernel's original
permissions check on the module loading syscalls.  (Or we might
duplicate those permission checks first, since that's easier to do when
we're also a kernel module rather than a patch.)  Then we'll achieve
sensible overall functionality: can't use those unauthorized credentials
for much time, and can't easily backdoor the running kernel through
having exploited a typical kernel vulnerability (due to LKRG's integrity
checking of the rest of the kernel and, with this extra check, not being
able to perform an unauthorized kernel module load by official means,
which LKRG's integrity checking wouldn't otherwise catch).

The credentials checks would need to be extended to cover cap_*, etc. -
there's already a (partial?) list of those other credential-like data
items on the wiki page above.  And they will need to be performed in
more places - I think in particular on open(), as otherwise even a very
brief exposure of the unauthorized credentials would let the attacker
retain an fd e.g. to a block device corresponding to the root filesystem
and then install persistent backdoors in there without hurry (e.g.,
pattern-match a critical system binary and introduce own code in there).

Of course, the action on detected integrity violation (of the kernel
itself or of its critical data) should not be limited to logging a
warning, as we do for testing now.  What exact action is best might be a
non-trivial question.  Panic the kernel?  Lock the kernel from any
further syscalls, log what happened, sync the disks, and then panic?

Attempt to undo the unauthorized change whenever possible - e.g.,
restore the credentials from our shadow copy that couldn't be tampered
with as easily?  But then an alternative would appear to be to protect
the main credentials to the same extent, e.g. keeping them read-only
most of the time or/and encrypted with a per-boot key (trickier to do
when we're a kernel module, though).  This undoing the damage is
probably bad as it'd allow the attack to repeat until all races are won,
unless we combine it e.g. with blacklisting of attack sources such as by
uid when a threshold is reached.  It's also bad in that it requires
exploits to replace just one set of credentials - the more trusted
(and better protected?) one - rather than both, as would be the case if
we choose panic.  For some vulnerabilities, an exploit would probably
only be able to perform one write in a while (e.g., per syscall), so
needing to replace both sets of credentials improves the chances of
detection.

We welcome any other thoughts - be it criticism, feedback, or anything.

Thanks,

Alexander

Toshihiro Yamauchi Oct. 27, 2017, 8:58 a.m. UTC | #4
Dear all

Thank you for your valuable comments. As you pointed out, we found that
it is difficult to prevent privilege escalation with our approach
alone. We are now considering another approach and will post it for
discussion once it is ready.

Best regards,
Toshihiro Yamauchi, Okayama University
Yuichi Nakamura, Hitachi, Ltd.

Patch

--- linux-4.4.0-62-83.orig/arch/x86/kernel/ako.c	1970-01-01 09:00:00.000000000 +0900
+++ linux-4.4.0/arch/x86/kernel/ako.c	2017-07-02 17:43:15.780000000 +0900
@@ -0,0 +1,55 @@ 
+/*
+ * Additional Kernel Observer (AKO)
+ *   CPU specific part
+ * Copyright 2017 Okayama-University
+ *     Yohei Akao, Yamauchi Laboratory
+ * Copyright 2017 Hitachi,Ltd.
+ *     Yuichi Nakamura
+ */
+
+#include <linux/cred.h>
+#include <linux/printk.h>
+#include <linux/ako.h>
+
+asmlinkage void AKO_before(struct ako_struct * ako_cred, unsigned long long ako_sysnum)
+{
+        /* Called at the entry of system calls,
+           credential data are saved to detect privilege escalation attacks
+           that exploit vulnerabilities of system calls.
+        */
+
+       /*save system call number*/
+        ako_cred->ako_sysnum = ako_sysnum;
+
+	/*System calls that change credential are skipped*/	
+	if(!AKO_syscall_checked(ako_sysnum))
+		return;
+
+	/*credential data are saved*/
+        AKO_save_creds(ako_cred,ako_sysnum);
+	/*addr_limit is saved*/
+	ako_cred->ako_addr_limit = current_thread_info()->addr_limit.seg;
+
+}
+
+
+asmlinkage void AKO_after(struct ako_struct * ako_cred)
+{
+	/*System calls that change credential are skipped*/	
+	if(!AKO_syscall_checked(ako_cred->ako_sysnum))
+		return;
+
+	/*check addr_limit, restore if changed*/
+/*        if(current_thread_info()->addr_limit.seg != ako_cred->ako_addr_limit){
+		audit_AKO_rlimit(ako_cred, current_thread_info()->addr_limit.seg); 
+                current_thread_info()->addr_limit.seg = ako_cred->ako_addr_limit;
+		audit_AKO_restore(ako_cred, "addr_limit");
+                return ;
+        } */
+
+	/*check credentials, and restore if changed*/
+        AKO_check_creds(ako_cred);
+
+        return;
+}
+
--- linux-4.4.0-62-83.orig/kernel/ako.c	1970-01-01 09:00:00.000000000 +0900
+++ linux-4.4.0/kernel/ako.c	2017-07-03 23:06:54.068000000 +0900
@@ -0,0 +1,389 @@ 
+/*
+ * Additional Kernel Observer (AKO)
+ *  Common features
+ * Copyright 2017 Okayama-University
+ *     Yohei Akao, Yamauchi Laboratory
+ * Copyright 2017 Hitachi,Ltd.
+ *     Yuichi Nakamura
+ */
+
+#include <linux/printk.h>
+#include <linux/cred.h>
+#include <linux/syscalls.h>
+#include <linux/ako.h>
+
+#include <linux/audit.h>
+#include <uapi/linux/audit.h>
+#include "audit.h"
+
+void audit_AKO_rlimit(struct ako_struct * ako_cred, unsigned long current_addr_limit) {
+	/*This is called from arch/xx/ako.c*/
+	struct audit_buffer *ab;
+
+	printk(KERN_INFO "AKO: detected unauthorized change of addr_limit: syscall=%u original: 0x%lx, attempt: 0x%lx", ako_cred->ako_sysnum, ako_cred->ako_addr_limit,current_addr_limit);
+        if(!audit_enabled) {
+                return;
+        }
+	ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_SYSCALL);
+	if (unlikely(!ab))
+		return;
+
+	audit_log_format(ab, "AKO: detected unauthorized change of addr_limit: syscall=%u original: 0x%lx, attempt: 0x%lx", ako_cred->ako_sysnum, ako_cred->ako_addr_limit,current_addr_limit);
+        audit_log_d_path_exe(ab, current->mm);
+        audit_log_end(ab);
+	return; 
+}
+
+void audit_AKO_restore(struct ako_struct * ako_cred, const char * cred_type)
+{
+	struct audit_buffer *ab;
+	
+	printk(KERN_INFO "AKO: restored credential:%s syscall=%u", cred_type, ako_cred->ako_sysnum);
+	if(!audit_enabled) {
+		return;
+	}
+	ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_SYSCALL);
+        if (unlikely(!ab))
+                return;
+
+	audit_log_format(ab, "AKO: restored credential:%s syscall=%u", cred_type,
+	ako_cred->ako_sysnum);
+	
+	audit_log_d_path_exe(ab, current->mm);
+ 	audit_log_end(ab);
+}
+
+static void audit_AKO_uid(struct ako_struct * ako_cred)
+{
+        struct audit_buffer *ab;
+        const struct cred *ccred;
+        ccred = current->cred;
+
+	printk(KERN_INFO "AKO: detected unauthorized change of UID. syscall=%u original: uid=%u, euid=%u, fsuid=%u, suid=%u attempt: uid=%u, euid=%u, fsuid=%u, suid=%u", ako_cred->ako_sysnum, 
+	ako_cred->ako_uid, ako_cred->ako_euid, ako_cred->ako_fsuid, ako_cred->ako_suid, 
+	ccred->uid.val, ccred->euid.val, current_fsuid().val, ccred->suid.val);
+
+	if(!audit_enabled) {
+ 		return;
+	}
+
+	ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_SYSCALL);
+	if (unlikely(!ab))
+		return;
+	audit_log_format(ab, "AKO: detected unauthorized change of UID. syscall=%u original: uid=%u, euid=%u, fsuid=%u, suid=%u attempt: uid=%u, euid=%u, fsuid=%u, suid=%u", ako_cred->ako_sysnum,
+	ako_cred->ako_uid, ako_cred->ako_euid, ako_cred->ako_fsuid, ako_cred->ako_suid,
+	ccred->uid.val, ccred->euid.val, ccred->fsuid.val, ccred->suid.val);
+
+        audit_log_d_path_exe(ab, current->mm);
+        audit_log_end(ab);
+
+        return;
+}
+
+static void audit_AKO_gid(struct ako_struct * ako_cred)
+{
+        struct audit_buffer *ab;
+        const struct cred *ccred;
+        ccred = current->cred;
+
+        printk(KERN_INFO "AKO: detected unauthorized change of gid. syscall=%u original: gid=%u, egid=%u, fsgid=%u, sgid=%u attempt: gid=%u, egid=%u, fsgid=%u, sgid=%u", ako_cred->ako_sysnum,
+        ako_cred->ako_gid, ako_cred->ako_egid, ako_cred->ako_fsgid, ako_cred->ako_sgid,
+        ccred->gid.val, ccred->egid.val, ccred->fsgid.val, ccred->sgid.val);
+
+        if(!audit_enabled) {
+                return;
+        }
+
+        ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_SYSCALL);
+        if (unlikely(!ab))
+                return;
+        audit_log_format(ab, "AKO: detected unauthorized change of gid. syscall=%u original: gid=%u, egid=%u, fsgid=%u, sgid=%u attempt: gid=%u, egid=%u, fsgid=%u, sgid=%u", ako_cred->ako_sysnum,
+        ako_cred->ako_gid, ako_cred->ako_egid, ako_cred->ako_fsgid, ako_cred->ako_sgid,
+        ccred->gid.val, ccred->egid.val, ccred->fsgid.val, ccred->sgid.val);
+
+        audit_log_d_path_exe(ab, current->mm);
+        audit_log_end(ab);
+
+        return;
+}
+
+static void audit_AKO_cap(struct ako_struct * ako_cred)
+{
+	struct audit_buffer *ab;
+	const struct cred *ccred;
+	ccred = current->cred;
+
+	 printk(KERN_INFO "AKO: detected unauthorized change of capability. syscall=%u original: inh[0]=%u inh[1]=%u per[0]=%u per[1]=%u eff[0]=%u eff[1]=%u bset[0]=%u bset[1]=%u attempt: inh[0]=%u inh[1]=%u per[0]=%u per[1]=%u eff[0]=%u eff[1]=%u bset[0]=%u bset[1]=%u",
+	ako_cred->ako_sysnum,
+	ako_cred->ako_inheritable[0], ako_cred->ako_inheritable[1],
+	ako_cred->ako_permitted[0], ako_cred->ako_permitted[1],
+	ako_cred->ako_effective[0], ako_cred->ako_effective[1],
+       	ako_cred->ako_bset[0], ako_cred->ako_bset[1],
+	ccred->cap_inheritable.cap[0], ccred->cap_inheritable.cap[1],
+	ccred->cap_permitted.cap[0], ccred->cap_permitted.cap[1],
+	ccred->cap_effective.cap[0], ccred->cap_effective.cap[1],
+	ccred->cap_bset.cap[0], ccred->cap_bset.cap[1]);
+
+	if (!audit_enabled)
+		return;
+	ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_SYSCALL);
+	if (unlikely(!ab))
+		return;
+
+	audit_log_format(ab, "AKO: detected unauthorized change of capability. syscall=%u original: inh[0]=%u inh[1]=%u per[0]=%u per[1]=%u eff[0]=%u eff[1]=%u bset[0]=%u bset[1]=%u attempt: inh[0]=%u inh[1]=%u per[0]=%u per[1]=%u eff[0]=%u eff[1]=%u bset[0]=%u bset[1]=%u",
+	ako_cred->ako_sysnum, 
+	ako_cred->ako_inheritable[0], ako_cred->ako_inheritable[1],
+	ako_cred->ako_permitted[0], ako_cred->ako_permitted[1], 
+ 	ako_cred->ako_effective[0], ako_cred->ako_effective[1],
+ 	ako_cred->ako_bset[0], ako_cred->ako_bset[1],
+	ccred->cap_inheritable.cap[0], ccred->cap_inheritable.cap[1],
+	ccred->cap_permitted.cap[0], ccred->cap_permitted.cap[1],
+	ccred->cap_effective.cap[0], ccred->cap_effective.cap[1],
+	ccred->cap_bset.cap[0], ccred->cap_bset.cap[1]);
+
+	audit_log_d_path_exe(ab, current->mm);
+	audit_log_end(ab);
+
+	return;
+}
+
+
+
+void AKO_save_creds(struct ako_struct * ako_cred, int ako_sysnum)
+{
+
+        /*Save credential information to be observed */
+        /*UID and GID*/
+        ako_cred->ako_uid = current->cred->uid.val;
+        ako_cred->ako_euid = current->cred->euid.val;
+        ako_cred->ako_fsuid = current->cred->fsuid.val;
+        ako_cred->ako_suid = current->cred->suid.val;
+        ako_cred->ako_gid = current->cred->gid.val;
+        ako_cred->ako_egid = current->cred->egid.val;
+        ako_cred->ako_fsgid = current->cred->fsgid.val;
+        ako_cred->ako_sgid = current->cred->sgid.val;
+        /*Capability*/
+        ako_cred->ako_inheritable[0] = current->cred->cap_inheritable.cap[0];
+        ako_cred->ako_inheritable[1] = current->cred->cap_inheritable.cap[1];
+        ako_cred->ako_permitted[0] = current->cred->cap_permitted.cap[0];
+        ako_cred->ako_permitted[1] = current->cred->cap_permitted.cap[1];
+        ako_cred->ako_effective[0] = current->cred->cap_effective.cap[0];
+        ako_cred->ako_effective[1] = current->cred->cap_effective.cap[1];
+        ako_cred->ako_bset[0] = current->cred->cap_bset.cap[0];
+        ako_cred->ako_bset[1] = current->cred->cap_bset.cap[1];
+
+        return;
+}
+
+/*copy from sys.c*/
+static int set_user(struct cred *new)
+{
+	struct user_struct *new_user;
+
+	new_user = alloc_uid(new->uid);
+	if (!new_user)
+		return -EAGAIN;
+
+	/*
+	 * We don't fail in case of NPROC limit excess here because too many
+	 * poorly written programs don't check set*uid() return code, assuming
+	 * it never fails if called by root.  We may still enforce NPROC limit
+	 * for programs doing set*uid()+execve() by harmlessly deferring the
+	 * failure to the execve() stage.
+	 */
+	if (atomic_read(&new_user->processes) >= rlimit(RLIMIT_NPROC) &&
+			new_user != INIT_USER)
+		current->flags |= PF_NPROC_EXCEEDED;
+	else
+		current->flags &= ~PF_NPROC_EXCEEDED;
+
+	free_uid(new->user);
+	new->user = new_user;
+	return 0;
+}
+
+
+static int AKO_restore_uids(struct ako_struct * ako_cred)
+{
+        struct cred *new;
+        struct user_namespace *ns = current_user_ns();
+        kuid_t uid;
+        kuid_t suid;
+        kuid_t euid;
+        kuid_t fsuid;
+	kernel_cap_t effective, permitted;
+        
+	new = prepare_creds();
+        if (!new)
+                return -ENOMEM;
+
+        /* Map the saved raw IDs into the current user namespace. */
+        uid = make_kuid(ns, ako_cred->ako_uid);
+        suid = make_kuid(ns, ako_cred->ako_suid);
+        euid = make_kuid(ns, ako_cred->ako_euid);
+        fsuid = make_kuid(ns, ako_cred->ako_fsuid);
+        if (!uid_valid(uid) || !uid_valid(suid) ||
+            !uid_valid(euid) || !uid_valid(fsuid)) {
+                /* Release the unused copy from prepare_creds();
+                   returning without this would leak it. */
+                abort_creds(new);
+                return -EINVAL;
+        }
+        new->uid = uid;
+        new->suid = suid;
+        new->euid = euid;
+        new->fsuid = fsuid;
+
+	/*clear capabilities*/
+	effective.cap[0] = effective.cap[1] = 0;
+	permitted.cap[0] = permitted.cap[1] = 0;
+	new->cap_effective  = effective;
+	new->cap_permitted  = permitted;
+	if (set_user(new) < 0) {	/* alloc_uid() failed: do not commit */
+		abort_creds(new);
+		return -EAGAIN;
+	}
+        commit_creds(new);
+        return 0;
+}
+
+static int AKO_restore_gids(struct ako_struct * ako_cred)
+{
+        struct cred *new;
+        struct user_namespace *ns = current_user_ns();
+        kgid_t gid;
+        kgid_t sgid;
+        kgid_t egid;
+        kgid_t fsgid;
+	kernel_cap_t effective, permitted;
+
+        new = prepare_creds();
+        if (!new)
+                return -ENOMEM;
+
+        /* Map the saved raw IDs into the current user namespace. */
+        gid = make_kgid(ns, ako_cred->ako_gid);
+        sgid = make_kgid(ns, ako_cred->ako_sgid);
+        egid = make_kgid(ns, ako_cred->ako_egid);
+        fsgid = make_kgid(ns, ako_cred->ako_fsgid);
+        if (!gid_valid(gid) || !gid_valid(sgid) ||
+            !gid_valid(egid) || !gid_valid(fsgid)) {
+                /* Release the unused copy from prepare_creds();
+                   returning without this would leak it. */
+                abort_creds(new);
+                return -EINVAL;
+        }
+        new->gid = gid;
+        new->sgid = sgid;
+        new->egid = egid;
+        new->fsgid = fsgid;
+
+        /*clear capabilities*/
+	effective.cap[0] = 0;
+	effective.cap[1] = 0;
+	permitted.cap[0] = 0;
+	permitted.cap[1] = 0;
+	new->cap_effective  = effective;
+	new->cap_permitted  = permitted;
+
+        commit_creds(new);
+
+        return 0;
+}
+static int AKO_restore_caps(struct ako_struct * ako_cred)
+{
+        struct cred *new;
+        kernel_cap_t inheritable, permitted, effective, bset;
+        unsigned i;
+        new = prepare_creds();
+        if (!new)
+                return -ENOMEM;
+
+        for (i = 0; i<2; i++) {
+                inheritable.cap[i] = ako_cred->ako_inheritable[i];
+                permitted.cap[i] = ako_cred->ako_permitted[i];
+                effective.cap[i] = ako_cred->ako_effective[i];
+                bset.cap[i] = ako_cred->ako_bset[i];
+        }
+
+        new->cap_effective   = effective;
+        new->cap_inheritable = inheritable;
+        new->cap_permitted   = permitted;
+        new->cap_bset   = bset;
+
+        commit_creds(new);
+
+        return 0;
+}
+
+inline int AKO_syscall_checked(int sysnum)
+{
+	/* The following system calls legitimately change credentials,
+	   so AKO does not check them (return 0). */
+
+        if((sysnum == __NR_execve)   || (sysnum == __NR_setuid)    || (sysnum == __NR_setgid)    || (sysnum == __NR_setreuid) ||
+           (sysnum == __NR_setregid) || (sysnum == __NR_setresuid) || (sysnum == __NR_setresgid) || (sysnum == __NR_setfsuid) ||
+           (sysnum == __NR_setfsgid) || (sysnum == __NR_capset)    || (sysnum == __NR_prctl) || (sysnum == __NR_unshare)  ){
+                return 0;
+        }
+	return 1;
+
+}
+
+void AKO_check_creds(struct ako_struct * ako_cred)
+{
+        /* Called at system call exit. */
+        /* Compare the current credentials with those saved before the
+           system call; if they changed, log the change and restore them. */
+	int uid_modified = 0;
+	int gid_modified = 0;
+ 	int cap_modified = 0;	
+
+        /*check uids*/
+        if(ako_cred->ako_uid != current->cred->uid.val || ako_cred->ako_euid != current->cred->euid.val || ako_cred->ako_fsuid != current->cred->fsuid.val ||
+           ako_cred->ako_suid != current->cred->suid.val){
+		audit_AKO_uid(ako_cred);
+		uid_modified = 1;
+        }
+
+	/*Check gids*/
+        if(ako_cred->ako_gid != current->cred->gid.val || ako_cred->ako_egid != current->cred->egid.val || ako_cred->ako_fsgid != current->cred->fsgid.val ||
+           ako_cred->ako_sgid != current->cred->sgid.val){
+		audit_AKO_gid(ako_cred);
+		gid_modified = 1;		
+        }
+	/*Check capabilities*/
+        if(ako_cred->ako_inheritable[0] != current->cred->cap_inheritable.cap[0] || ako_cred->ako_inheritable[1] != current->cred->cap_inheritable.cap[1] ||
+           ako_cred->ako_permitted[0]   != current->cred->cap_permitted.cap[0]   || ako_cred->ako_permitted[1]   != current->cred->cap_permitted.cap[1]   ||
+           ako_cred->ako_effective[0]   != current->cred->cap_effective.cap[0]   || ako_cred->ako_effective[1]   != current->cred->cap_effective.cap[1]   ||
+           ako_cred->ako_bset[0]        != current->cred->cap_bset.cap[0]        || ako_cred->ako_bset[1]        != current->cred->cap_bset.cap[1] ){
+		audit_AKO_cap(ako_cred);
+		cap_modified = 1;
+        }
+
+	/*restore creds if modified*/
+	if (uid_modified) {
+		/*Restore uids*/
+               if(!AKO_restore_uids(ako_cred))
+                        audit_AKO_restore(ako_cred,"uids");
+                else
+                        do_exit(SIGKILL);
+
+	}
+	if (gid_modified) {
+                /*Restore gids*/
+                if(!AKO_restore_gids(ako_cred))
+                        audit_AKO_restore(ako_cred,"gids");
+                else
+                        do_exit(SIGKILL);
+	}
+	if (cap_modified) {
+		/*restore capabilities*/
+		if(!AKO_restore_caps(ako_cred))
+			audit_AKO_restore(ako_cred,"capabilities");
+		else
+			do_exit(SIGKILL);
+	}
+
+        return;
+}
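As a rough illustration of the save/compare/restore flow in kernel/ako.c above, here is a minimal user-space sketch. It is not the patch's code: struct cred_snapshot, snapshot_diff(), check_and_restore() and the CHANGED_* flags are hypothetical names made up for this example, and plain integers stand in for kuid_t, kgid_t and kernel_cap_t. It mirrors two details of the patch: exempt system calls are not checked, and restoring UIDs or GIDs also clears the effective and permitted capability sets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for the fields AKO snapshots. */
struct cred_snapshot {
    uint32_t uid, euid, fsuid, suid;
    uint32_t gid, egid, fsgid, sgid;
    uint32_t cap_effective[2];
    uint32_t cap_permitted[2];
};

/* Hypothetical flags naming which credential group changed. */
enum { CHANGED_UID = 1, CHANGED_GID = 2, CHANGED_CAP = 4 };

/* Return a bitmask of the credential groups that differ from the snapshot. */
static unsigned snapshot_diff(const struct cred_snapshot *saved,
                              const struct cred_snapshot *now)
{
    unsigned mask = 0;

    if (saved->uid != now->uid || saved->euid != now->euid ||
        saved->fsuid != now->fsuid || saved->suid != now->suid)
        mask |= CHANGED_UID;
    if (saved->gid != now->gid || saved->egid != now->egid ||
        saved->fsgid != now->fsgid || saved->sgid != now->sgid)
        mask |= CHANGED_GID;
    if (memcmp(saved->cap_effective, now->cap_effective,
               sizeof(saved->cap_effective)) != 0 ||
        memcmp(saved->cap_permitted, now->cap_permitted,
               sizeof(saved->cap_permitted)) != 0)
        mask |= CHANGED_CAP;
    return mask;
}

/* Zero the effective and permitted sets, as the restore paths do. */
static void clear_caps(struct cred_snapshot *c)
{
    memset(c->cap_effective, 0, sizeof(c->cap_effective));
    memset(c->cap_permitted, 0, sizeof(c->cap_permitted));
}

/*
 * Model of the syscall-exit hook: compare `live` with `saved`, roll back
 * whatever changed, and report what was rolled back. `exempt` models a
 * system call (setuid, execve, ...) that is allowed to change credentials.
 */
static unsigned check_and_restore(const struct cred_snapshot *saved,
                                  struct cred_snapshot *live, int exempt)
{
    unsigned mask;

    assert(saved && live);
    if (exempt)
        return 0;

    mask = snapshot_diff(saved, live);
    if (mask & CHANGED_UID) {
        live->uid = saved->uid;
        live->euid = saved->euid;
        live->fsuid = saved->fsuid;
        live->suid = saved->suid;
        clear_caps(live);
    }
    if (mask & CHANGED_GID) {
        live->gid = saved->gid;
        live->egid = saved->egid;
        live->fsgid = saved->fsgid;
        live->sgid = saved->sgid;
        clear_caps(live);
    }
    if (mask & CHANGED_CAP) {
        memcpy(live->cap_effective, saved->cap_effective,
               sizeof(live->cap_effective));
        memcpy(live->cap_permitted, saved->cap_permitted,
               sizeof(live->cap_permitted));
    }
    return mask;
}
```

In this model the difference mask is computed once, before any rollback, as in AKO_check_creds(). One consequence is that if only the UIDs changed, the capability sets end up cleared rather than restored to their saved values.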
--- linux-4.4.0-62-83.orig/include/linux/ako.h	1970-01-01 09:00:00.000000000 +0900
+++ linux-4.4.0/include/linux/ako.h	2017-07-04 22:33:45.188000000 +0900
@@ -0,0 +1,35 @@ 
+#ifndef __LINUX__AKO_H
+#define __LINUX__AKO_H
+/*
+ * Additional Kernel Observer (AKO)
+ * Copyright (c) 2017 Okayama-University
+ *     Yohei Akao, Yamauchi Laboratory, Okayama University
+ * Copyright (c) 2017 Hitachi,Ltd.
+ *	Yuichi Nakamura
+ */
+
+struct ako_struct{
+int ako_sysnum;
+/*Credential information to be observed*/
+unsigned long ako_addr_limit;
+uid_t  ako_uid;
+uid_t  ako_euid;
+uid_t  ako_fsuid;
+uid_t  ako_suid;
+gid_t  ako_gid;
+gid_t  ako_egid;
+gid_t  ako_fsgid;
+gid_t  ako_sgid;
+__u32  ako_inheritable[2];
+__u32  ako_permitted[2];
+__u32  ako_effective[2];
+__u32  ako_bset[2];
+};
+extern void AKO_save_creds(struct ako_struct * ako_cred, int ako_sysnum);
+extern void AKO_check_creds(struct ako_struct * ako_cred);
+extern void AKO_save_addr_limit(struct ako_struct * ako_cred);
+extern int AKO_syscall_checked(int sysnum);
+extern void audit_AKO_rlimit(struct ako_struct * ako_cred, unsigned long current_addr_limit);
+extern void audit_AKO_restore(struct ako_struct * ako_cred, const char * cred_type);
+
+#endif
--- linux-4.4.0-62-83.orig/kernel/Makefile	2017-06-18 14:33:57.240000000 +0900
+++ linux-4.4.0/kernel/Makefile	2017-07-04 22:29:00.004000000 +0900
@@ -13,6 +13,8 @@ 
 
 obj-$(CONFIG_MULTIUSER) += groups.o
 
+obj-$(CONFIG_AKO) += ako.o
+
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace debug files and internal ftrace files
 CFLAGS_REMOVE_cgroup-debug.o = $(CC_FLAGS_FTRACE)
--- linux-4.4.0-62-83.orig/arch/x86/kernel/Makefile	2017-06-18 14:34:04.180000000 +0900
+++ linux-4.4.0/arch/x86/kernel/Makefile	2017-07-04 22:16:11.372000000 +0900
@@ -23,6 +23,7 @@ 
 CFLAGS_irq.o := -I$(src)/../include/asm/trace
 
 obj-y			:= process_$(BITS).o signal.o
+obj-$(CONFIG_AKO)	+= ako.o
 obj-$(CONFIG_COMPAT)	+= signal_compat.o
 obj-y			+= traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
 obj-y			+= time.o ioport.o dumpstack.o nmi.o
--- linux-4.4.0-62-83.orig/init/Kconfig	2017-06-18 14:33:57.236000000 +0900
+++ linux-4.4.0/init/Kconfig	2017-07-04 22:14:43.080000000 +0900
@@ -333,6 +333,17 @@ 
 	depends on AUDITSYSCALL
 	select FSNOTIFY
 
+config HAVE_ARCH_AKO
+        bool
+
+config AKO
+        bool "Enable Additional Kernel Observer (AKO)" 
+        depends on AUDIT && HAVE_ARCH_AKO
+        help
+          AKO detects and prevents privilege escalation attacks
+          that exploit kernel vulnerabilities.
+
+
 source "kernel/irq/Kconfig"
 source "kernel/time/Kconfig"
 
--- linux-4.4.0-62-83.orig/arch/x86/Kconfig	2017-06-18 14:34:03.924000000 +0900
+++ linux-4.4.0/arch/x86/Kconfig	2017-07-06 12:27:26.692000000 +0900
@@ -76,6 +76,7 @@ 
 	select HAVE_ACPI_APEI_NMI		if ACPI
 	select HAVE_ALIGNED_STRUCT_PAGE		if SLUB
 	select HAVE_AOUT			if X86_32
+	select HAVE_ARCH_AKO			if X86_64
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_HUGE_VMAP		if X86_64 || X86_PAE
 	select HAVE_ARCH_JUMP_LABEL
--- linux-4.4.0-62-83.orig/arch/x86/entry/entry_64.S	2017-06-18 14:34:04.008000000 +0900
+++ linux-4.4.0/arch/x86/entry/entry_64.S	2017-07-01 23:07:43.824000000 +0900
@@ -182,9 +182,43 @@ 
 #endif
 	ja	1f				/* return -ENOSYS (already in pt_regs->ax) */
 	movq	%r10, %rcx
+/*
+ * Additional Kernel Observer (AKO)
+ * Copyright (c) 2017 Okayama-University
+ *     Yohei Akao, Yamauchi Laboratory, Okayama University
+ */
+	subq	$6144,%rsp /*Allocate area in stack to save credential information*/
+	ALLOC_PT_GPREGS_ON_STACK
+	SAVE_C_REGS
+	SAVE_EXTRA_REGS
+	leaq	15*8(%rsp), %rdi /* skip the 15 registers pushed by SAVE_C_REGS and SAVE_EXTRA_REGS so that %rdi holds the start address of the reserved area */
+	movq	%rax, %rsi /* Syscall number(%rax) is saved in %rsi */
+	call AKO_before /*credential information is saved*/
+	RESTORE_EXTRA_REGS
+	RESTORE_C_REGS
+	REMOVE_PT_GPREGS_FROM_STACK
+	addq	$6144,%rsp /*Free the stack area reserved above*/
+	/*end of AKO*/
 	call	*sys_call_table(, %rax, 8)
+/*
+ * Additional Kernel Observer (AKO)
+ * Copyright (c) 2017 Okayama-University
+ *     Yohei Akao, Yamauchi Laboratory, Okayama University
+ */
+	/*Start of AKO*/
+	subq	$6144,%rsp
+	ALLOC_PT_GPREGS_ON_STACK
+	SAVE_C_REGS
+	SAVE_EXTRA_REGS
+	leaq	15*8(%rsp), %rdi
+	call AKO_after
+	RESTORE_EXTRA_REGS
+	RESTORE_C_REGS
+	REMOVE_PT_GPREGS_FROM_STACK
+	/*Free area to store credential information*/
+	addq	$6144,%rsp
+	/*End of AKO*/
 	movq	%rax, RAX(%rsp)
-1:
 /*
  * Syscall return path ending with SYSRET (fast path).
  * Has incompletely filled pt_regs.