Message ID | 20200910202107.3799376-7-keescook@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Series | Fork brute force attack mitigation (fbfam) |
On Thu, Sep 10, 2020 at 10:22 PM Kees Cook <keescook@chromium.org> wrote:
> In order to mitigate a fork brute force attack it is necessary to kill
> all the offending tasks. This tasks are all the ones that share the
> statistical data with the current task (the task that has crashed).
>
> Since the attack detection is done in the function fbfam_handle_attack()
> that is called every time a core dump is triggered, only is needed to
> kill the others tasks that share the same statistical data, not the
> current one as this is in the path to be killed.
>
> When the SIGKILL signal is sent to the offending tasks from the function
> fbfam_kill_tasks(), this one will be called again during the core dump
> due to the shared statistical data shows a quickly crashing rate. So, to
> avoid kill again the same tasks due to a recursive call of this
> function, it is necessary to disable the attack detection.
>
> To disable this attack detection, add a condition in the function
> fbfam_handle_attack() to not compute the crashing rate when the jiffies
> stored in the statistical data are set to zero.
[...]
>  /**
> - * fbfam_handle_attack() - Fork brute force attack detection.
> + * fbfam_kill_tasks() - Kill the offending tasks
> + *
> + * When a fork brute force attack is detected it is necessary to kill all the
> + * offending tasks. Since this function is called from fbfam_handle_attack(),
> + * and so, every time a core dump is triggered, only is needed to kill the
> + * others tasks that share the same statistical data, not the current one as
> + * this is in the path to be killed.
> + *
> + * When the SIGKILL signal is sent to the offending tasks, this function will be
> + * called again during the core dump due to the shared statistical data shows a
> + * quickly crashing rate. So, to avoid kill again the same tasks due to a
> + * recursive call of this function, it is necessary to disable the attack
> + * detection setting the jiffies to zero.
> + *
> + * To improve the for_each_process loop it is possible to end it when all the
> + * tasks that shared the same statistics are found.

This is not a fastpath, there's no need to be clever and optimize
things here, please get rid of that optimization. Especially since
that fastpath looks racy against concurrent execve().

> + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> + *         otherwise.
> + */
> +static int fbfam_kill_tasks(void)
> +{
> +	struct fbfam_stats *stats = current->fbfam_stats;
> +	struct task_struct *p;
> +	unsigned int to_kill, killed = 0;
> +
> +	if (!stats)
> +		return -EFAULT;
> +
> +	to_kill = refcount_read(&stats->refc) - 1;
> +	if (!to_kill)
> +		return 0;
> +
> +	/* Disable the attack detection */
> +	stats->jiffies = 0;
> +	rcu_read_lock();
> +
> +	for_each_process(p) {
> +		if (p == current || p->fbfam_stats != stats)

p->fbfam_stats could change concurrently, you should at least use
READ_ONCE() here.

Also, if this codepath is hit by a non-leader thread, "p == current"
will always be false, and you'll end up killing the caller, too. You
may want to compare with current->group_leader instead.

> +			continue;
> +
> +		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> +		pr_warn("fbfam: Offending process with PID %d killed\n",
> +			p->pid);

Normally pr_*() messages about tasks mention not just the pid, but
also the ->comm name of the task.

> +		killed += 1;
> +		if (killed >= to_kill)
> +			break;
> +	}
> +
> +	rcu_read_unlock();
> +	return 0;
> +}
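To make the non-leader thread point above concrete, here is a minimal userspace sketch (not kernel code). It assumes, as the review comment implies, that the loop only ever visits thread-group leaders. `struct task_model` and both helper names are hypothetical stand-ins invented for this illustration; they are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct task_struct. */
struct task_model {
	int pid;
	struct task_model *group_leader; /* points to itself for a leader */
	const void *stats;               /* stands in for ->fbfam_stats */
};

/* Check as posted: skip p only when it is the calling task itself. */
static bool should_kill_as_posted(const struct task_model *p,
				  const struct task_model *cur,
				  const void *stats)
{
	return !(p == cur || p->stats != stats);
}

/* Suggested check: skip the leader of the caller's thread group. */
static bool should_kill_leader_check(const struct task_model *p,
				     const struct task_model *cur,
				     const void *stats)
{
	return !(p == cur->group_leader || p->stats != stats);
}
```

When the caller is a non-leader thread, the posted check selects the caller's own leader for SIGKILL, while the group_leader comparison skips it and still selects unrelated processes that share the statistics.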
On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> From: John Wood <john.wood@gmx.com>
>
> In order to mitigate a fork brute force attack it is necessary to kill
> all the offending tasks. This tasks are all the ones that share the
> statistical data with the current task (the task that has crashed).
>
> Since the attack detection is done in the function fbfam_handle_attack()
> that is called every time a core dump is triggered, only is needed to
> kill the others tasks that share the same statistical data, not the
> current one as this is in the path to be killed.
>
> When the SIGKILL signal is sent to the offending tasks from the function
> fbfam_kill_tasks(), this one will be called again during the core dump
> due to the shared statistical data shows a quickly crashing rate. So, to
> avoid kill again the same tasks due to a recursive call of this
> function, it is necessary to disable the attack detection.
>
> To disable this attack detection, add a condition in the function
> fbfam_handle_attack() to not compute the crashing rate when the jiffies
> stored in the statistical data are set to zero.
>
> Signed-off-by: John Wood <john.wood@gmx.com>
> ---
>  security/fbfam/fbfam.c | 76 +++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 71 insertions(+), 5 deletions(-)
>
> diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
> index 3aa669e4ea51..173a6122390f 100644
> --- a/security/fbfam/fbfam.c
> +++ b/security/fbfam/fbfam.c
> @@ -4,8 +4,11 @@
>  #include <linux/errno.h>
>  #include <linux/gfp.h>
>  #include <linux/jiffies.h>
> +#include <linux/pid.h>
>  #include <linux/printk.h>
> +#include <linux/rcupdate.h>
>  #include <linux/refcount.h>
> +#include <linux/sched/signal.h>
>  #include <linux/signal.h>
>  #include <linux/slab.h>
>
> @@ -24,7 +27,8 @@ unsigned long sysctl_crashing_rate_threshold = 30000;
>   * struct fbfam_stats - Fork brute force attack mitigation statistics.
>   * @refc: Reference counter.
>   * @faults: Number of crashes since jiffies.
> - * @jiffies: First fork or execve timestamp.
> + * @jiffies: First fork or execve timestamp. If zero, the attack detection is
> + *           disabled.
>   *
>   * The purpose of this structure is to manage all the necessary information to
>   * compute the crashing rate of an application. So, it holds a first fork or
> @@ -175,13 +179,69 @@ int fbfam_exit(void)
>  }
>
>  /**
> - * fbfam_handle_attack() - Fork brute force attack detection.
> + * fbfam_kill_tasks() - Kill the offending tasks
> + *
> + * When a fork brute force attack is detected it is necessary to kill all the
> + * offending tasks. Since this function is called from fbfam_handle_attack(),
> + * and so, every time a core dump is triggered, only is needed to kill the
> + * others tasks that share the same statistical data, not the current one as
> + * this is in the path to be killed.
> + *
> + * When the SIGKILL signal is sent to the offending tasks, this function will be
> + * called again during the core dump due to the shared statistical data shows a
> + * quickly crashing rate. So, to avoid kill again the same tasks due to a
> + * recursive call of this function, it is necessary to disable the attack
> + * detection setting the jiffies to zero.
> + *
> + * To improve the for_each_process loop it is possible to end it when all the
> + * tasks that shared the same statistics are found.
> + *
> + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> + *         otherwise.
> + */
> +static int fbfam_kill_tasks(void)
> +{
> +	struct fbfam_stats *stats = current->fbfam_stats;
> +	struct task_struct *p;
> +	unsigned int to_kill, killed = 0;
> +
> +	if (!stats)
> +		return -EFAULT;
> +
> +	to_kill = refcount_read(&stats->refc) - 1;
> +	if (!to_kill)
> +		return 0;
> +
> +	/* Disable the attack detection */
> +	stats->jiffies = 0;
> +	rcu_read_lock();
> +
> +	for_each_process(p) {
> +		if (p == current || p->fbfam_stats != stats)
> +			continue;
> +
> +		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> +		pr_warn("fbfam: Offending process with PID %d killed\n",
> +			p->pid);

I'd make this ratelimited (along with Jann's suggestions).

Also, instead of the explicit "fbfam:" prefix, use the regular
prefixing method:

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

> +
> +		killed += 1;
> +		if (killed >= to_kill)
> +			break;
> +	}
> +
> +	rcu_read_unlock();

Can't newly created processes escape this RCU read lock? I think this
need alternate locking, or something in the task_alloc hook that will
block any new process from being created within the stats group.

> +	return 0;
> +}
> +
> +/**
> + * fbfam_handle_attack() - Fork brute force attack detection and mitigation.
>   * @signal: Signal number that causes the core dump.
>   *
>   * The crashing rate of an application is computed in milliseconds per fault in
>   * each crash. So, if this rate goes under a certain threshold there is a clear
>   * signal that the application is crashing quickly. At this moment, a fork brute
> - * force attack is happening.
> + * force attack is happening. Under this scenario it is necessary to kill all
> + * the offending tasks in order to mitigate the attack.
>   *
>   * Return: -EFAULT if the current task doesn't have statistical data. Zero
>   *         otherwise.
> @@ -195,6 +255,10 @@ int fbfam_handle_attack(int signal)
>  	if (!stats)
>  		return -EFAULT;
>
> +	/* The attack detection is disabled */
> +	if (!stats->jiffies)
> +		return 0;
> +
>  	if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
>  	      signal == SIGSEGV || signal == SIGSYS))
>  		return 0;
> @@ -205,9 +269,11 @@ int fbfam_handle_attack(int signal)
>  	delta_time = jiffies64_to_msecs(delta_jiffies);
>  	crashing_rate = delta_time / (u64)stats->faults;
>
> -	if (crashing_rate < (u64)sysctl_crashing_rate_threshold)
> -		pr_warn("fbfam: Fork brute force attack detected\n");
> +	if (crashing_rate >= (u64)sysctl_crashing_rate_threshold)
> +		return 0;
>
> +	pr_warn("fbfam: Fork brute force attack detected\n");
> +	fbfam_kill_tasks();
>  	return 0;
>  }
>
> --
> 2.25.1
>
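The detection rule in the quoted fbfam_handle_attack() hunks (milliseconds per fault compared against a threshold, with a zero timestamp meaning "detection disabled") can be sketched in userspace as follows. This is only a model under stated assumptions: `struct stats_model` and `attack_detected()` are hypothetical names, time is passed in as plain milliseconds instead of jiffies, and the guard against a zero fault count is an addition of this sketch that the quoted hunks do not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the fields used by the rate check. */
struct stats_model {
	uint64_t start_ms; /* first fork/execve timestamp; 0 = detection disabled */
	uint64_t faults;   /* number of crashes since start_ms */
};

/*
 * Returns true when the crashing rate (milliseconds per fault) drops
 * below threshold_ms, which the patch treats as an ongoing attack.
 */
static bool attack_detected(const struct stats_model *s, uint64_t now_ms,
			    uint64_t threshold_ms)
{
	uint64_t rate;

	if (s->start_ms == 0)	/* detection disabled by the kill path */
		return false;
	if (s->faults == 0)	/* assumption of this sketch: avoid dividing by zero */
		return false;

	rate = (now_ms - s->start_ms) / s->faults;
	return rate < threshold_ms;
}
```

With the threshold of 30000 that the patch uses as the sysctl default, ten faults within 60 seconds give a rate of 6000 ms per fault and count as an attack, while a single fault in the same period (60000 ms per fault) does not.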
On Fri, Sep 11, 2020 at 1:56 AM Kees Cook <keescook@chromium.org> wrote:
> On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> > From: John Wood <john.wood@gmx.com>
> >
> > In order to mitigate a fork brute force attack it is necessary to kill
> > all the offending tasks. This tasks are all the ones that share the
> > statistical data with the current task (the task that has crashed).
> >
> > Since the attack detection is done in the function fbfam_handle_attack()
> > that is called every time a core dump is triggered, only is needed to
> > kill the others tasks that share the same statistical data, not the
> > current one as this is in the path to be killed.
[...]
> > +	for_each_process(p) {
> > +		if (p == current || p->fbfam_stats != stats)
> > +			continue;
> > +
> > +		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> > +		pr_warn("fbfam: Offending process with PID %d killed\n",
> > +			p->pid);
[...]
> > +
> > +		killed += 1;
> > +		if (killed >= to_kill)
> > +			break;
> > +	}
> > +
> > +	rcu_read_unlock();
>
> Can't newly created processes escape this RCU read lock? I think this
> need alternate locking, or something in the task_alloc hook that will
> block any new process from being created within the stats group.

Good point; the proper way to deal with this would probably be to take
the tasklist_lock in read mode around this loop (with
read_lock(&tasklist_lock) / read_unlock(&tasklist_lock)), which pairs
with the write_lock_irq(&tasklist_lock) in copy_process(). Thanks to
the fatal_signal_pending() check while holding the lock in
copy_process(), that would be race-free - any fork() that has not yet
inserted the new task into the global task list would wait for us to
drop the tasklist_lock, then bail out at the fatal_signal_pending()
check.
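The ordering argument above can be illustrated with a small sequential userspace model. It is a sketch only: each function body stands for a critical section that the real code would run under tasklist_lock (write side for the fork model, read side for the kill loop), so the two can never interleave. All names here (`struct task`, `model_fork()`, `model_kill_group()`, `task_list`) are hypothetical, and unlike the patch the kill loop does not skip the calling task.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

/* Hypothetical, simplified stand-in for struct task_struct. */
struct task {
	int group;            /* stands in for the shared stats pointer */
	bool sigkill_pending; /* stands in for fatal_signal_pending() */
};

static struct task *task_list[MAX_TASKS];
static size_t nr_tasks;

/* Models the tail of copy_process(): runs as if under the write lock. */
static bool model_fork(struct task *parent, struct task *child)
{
	/* A parent with a pending fatal signal never publishes its child. */
	if (parent->sigkill_pending || nr_tasks == MAX_TASKS)
		return false;

	child->group = parent->group;
	child->sigkill_pending = false;
	task_list[nr_tasks++] = child;
	return true;
}

/* Models the kill loop: runs as if under the read lock. */
static unsigned int model_kill_group(int group)
{
	unsigned int killed = 0;
	size_t i;

	for (i = 0; i < nr_tasks; i++) {
		if (task_list[i]->group != group)
			continue;
		task_list[i]->sigkill_pending = true;
		killed++;
	}
	return killed;
}
```

Either the fork finishes first and the child is on the list when the kill loop runs, or the kill loop finishes first and the later fork attempt sees the pending signal and fails. In both orders no task of the group escapes.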
On Thu, Sep 10, 2020 at 04:56:19PM -0700, Kees Cook wrote:
> On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> >  /**
> > + * fbfam_kill_tasks() - Kill the offending tasks
> > + *
> > + * When a fork brute force attack is detected it is necessary to kill all the
> > + * offending tasks. Since this function is called from fbfam_handle_attack(),
> > + * and so, every time a core dump is triggered, only is needed to kill the
> > + * others tasks that share the same statistical data, not the current one as
> > + * this is in the path to be killed.
> > + *
> > + * When the SIGKILL signal is sent to the offending tasks, this function will be
> > + * called again during the core dump due to the shared statistical data shows a
> > + * quickly crashing rate. So, to avoid kill again the same tasks due to a
> > + * recursive call of this function, it is necessary to disable the attack
> > + * detection setting the jiffies to zero.
> > + *
> > + * To improve the for_each_process loop it is possible to end it when all the
> > + * tasks that shared the same statistics are found.
> > + *
> > + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> > + *         otherwise.
> > + */
> > +static int fbfam_kill_tasks(void)
> > +{
> > +	struct fbfam_stats *stats = current->fbfam_stats;
> > +	struct task_struct *p;
> > +	unsigned int to_kill, killed = 0;
> > +
> > +	if (!stats)
> > +		return -EFAULT;
> > +
> > +	to_kill = refcount_read(&stats->refc) - 1;
> > +	if (!to_kill)
> > +		return 0;
> > +
> > +	/* Disable the attack detection */
> > +	stats->jiffies = 0;
> > +	rcu_read_lock();
> > +
> > +	for_each_process(p) {
> > +		if (p == current || p->fbfam_stats != stats)
> > +			continue;
> > +
> > +		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> > +		pr_warn("fbfam: Offending process with PID %d killed\n",
> > +			p->pid);
>
> I'd make this ratelimited (along with Jann's suggestions).

Sorry, but I don't understand what you mean with "make this ratelimited".
A clarification would be greatly appreciated.

> Also, instead of the explicit "fbfam:" prefix, use the regular
> prefixing method:
>
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

Understood.

> > +
> > +		killed += 1;
> > +		if (killed >= to_kill)
> > +			break;
> > +	}
> > +
> > +	rcu_read_unlock();
>
> Can't newly created processes escape this RCU read lock? I think this
> need alternate locking, or something in the task_alloc hook that will
> block any new process from being created within the stats group.

I will work on this for the next version. Thanks.

> > +	return 0;
> > +}
>
> --
> Kees Cook

Thanks
John Wood
On Fri, Sep 18, 2020 at 06:02:16PM +0200, John Wood wrote:
> On Thu, Sep 10, 2020 at 04:56:19PM -0700, Kees Cook wrote:
> > On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> > > +		pr_warn("fbfam: Offending process with PID %d killed\n",
> > > +			p->pid);
> >
> > I'd make this ratelimited (along with Jann's suggestions).
>
> Sorry, but I don't understand what you mean with "make this ratelimited".
> A clarification would be greatly appreciated.

Ah! Yes, sorry for not being more clear. There are ratelimit helpers for
the pr_*() family of functions, e.g.:

	pr_warn_ratelimited("brute: Offending process with PID...
On Fri, Sep 18, 2020 at 02:35:12PM -0700, Kees Cook wrote:
> On Fri, Sep 18, 2020 at 06:02:16PM +0200, John Wood wrote:
> > On Thu, Sep 10, 2020 at 04:56:19PM -0700, Kees Cook wrote:
> > > On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> > > > +		pr_warn("fbfam: Offending process with PID %d killed\n",
> > > > +			p->pid);
> > >
> > > I'd make this ratelimited (along with Jann's suggestions).
> >
> > Sorry, but I don't understand what you mean with "make this ratelimited".
> > A clarification would be greatly appreciated.
>
> Ah! Yes, sorry for not being more clear. There are ratelimit helpers for
> the pr_*() family of functions, e.g.:
>
> 	pr_warn_ratelimited("brute: Offending process with PID...

Thanks for the clarification.

> --
> Kees Cook

Regards,
John Wood
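For readers unfamiliar with the idea behind the ratelimited pr_*() helpers mentioned above, the following userspace sketch shows one simple way such a limiter can work: at most `burst` messages are let through per time window, and the rest are dropped. It is not the kernel's implementation; `struct ratelimit_model` and `ratelimit_allow()` are hypothetical names, and the interval and burst values are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window rate limiter, for illustration only. */
struct ratelimit_model {
	uint64_t interval_ms;     /* length of one window */
	unsigned int burst;       /* messages allowed per window */
	uint64_t window_start_ms; /* start of the current window */
	unsigned int printed;     /* messages already allowed in this window */
};

/* Returns true if a message emitted at now_ms should be printed. */
static bool ratelimit_allow(struct ratelimit_model *rl, uint64_t now_ms)
{
	/* Open a fresh window once the current one has expired. */
	if (now_ms - rl->window_start_ms >= rl->interval_ms) {
		rl->window_start_ms = now_ms;
		rl->printed = 0;
	}

	if (rl->printed >= rl->burst)
		return false;	/* over budget: drop this message */

	rl->printed++;
	return true;
}
```

In the kill loop this matters because a large group of offending tasks would otherwise produce one log line per task in quick succession. With a limiter of this shape, only the first `burst` lines in each window reach the log.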
diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
index 3aa669e4ea51..173a6122390f 100644
--- a/security/fbfam/fbfam.c
+++ b/security/fbfam/fbfam.c
@@ -4,8 +4,11 @@
 #include <linux/errno.h>
 #include <linux/gfp.h>
 #include <linux/jiffies.h>
+#include <linux/pid.h>
 #include <linux/printk.h>
+#include <linux/rcupdate.h>
 #include <linux/refcount.h>
+#include <linux/sched/signal.h>
 #include <linux/signal.h>
 #include <linux/slab.h>
 
@@ -24,7 +27,8 @@ unsigned long sysctl_crashing_rate_threshold = 30000;
  * struct fbfam_stats - Fork brute force attack mitigation statistics.
  * @refc: Reference counter.
  * @faults: Number of crashes since jiffies.
- * @jiffies: First fork or execve timestamp.
+ * @jiffies: First fork or execve timestamp. If zero, the attack detection is
+ *           disabled.
  *
  * The purpose of this structure is to manage all the necessary information to
  * compute the crashing rate of an application. So, it holds a first fork or
@@ -175,13 +179,69 @@ int fbfam_exit(void)
 }
 
 /**
- * fbfam_handle_attack() - Fork brute force attack detection.
+ * fbfam_kill_tasks() - Kill the offending tasks
+ *
+ * When a fork brute force attack is detected it is necessary to kill all the
+ * offending tasks. Since this function is called from fbfam_handle_attack(),
+ * and so, every time a core dump is triggered, only is needed to kill the
+ * others tasks that share the same statistical data, not the current one as
+ * this is in the path to be killed.
+ *
+ * When the SIGKILL signal is sent to the offending tasks, this function will be
+ * called again during the core dump due to the shared statistical data shows a
+ * quickly crashing rate. So, to avoid kill again the same tasks due to a
+ * recursive call of this function, it is necessary to disable the attack
+ * detection setting the jiffies to zero.
+ *
+ * To improve the for_each_process loop it is possible to end it when all the
+ * tasks that shared the same statistics are found.
+ *
+ * Return: -EFAULT if the current task doesn't have statistical data. Zero
+ *         otherwise.
+ */
+static int fbfam_kill_tasks(void)
+{
+	struct fbfam_stats *stats = current->fbfam_stats;
+	struct task_struct *p;
+	unsigned int to_kill, killed = 0;
+
+	if (!stats)
+		return -EFAULT;
+
+	to_kill = refcount_read(&stats->refc) - 1;
+	if (!to_kill)
+		return 0;
+
+	/* Disable the attack detection */
+	stats->jiffies = 0;
+	rcu_read_lock();
+
+	for_each_process(p) {
+		if (p == current || p->fbfam_stats != stats)
+			continue;
+
+		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
+		pr_warn("fbfam: Offending process with PID %d killed\n",
+			p->pid);
+
+		killed += 1;
+		if (killed >= to_kill)
+			break;
+	}
+
+	rcu_read_unlock();
+	return 0;
+}
+
+/**
+ * fbfam_handle_attack() - Fork brute force attack detection and mitigation.
  * @signal: Signal number that causes the core dump.
  *
  * The crashing rate of an application is computed in milliseconds per fault in
  * each crash. So, if this rate goes under a certain threshold there is a clear
  * signal that the application is crashing quickly. At this moment, a fork brute
- * force attack is happening.
+ * force attack is happening. Under this scenario it is necessary to kill all
+ * the offending tasks in order to mitigate the attack.
  *
  * Return: -EFAULT if the current task doesn't have statistical data. Zero
  *         otherwise.
@@ -195,6 +255,10 @@ int fbfam_handle_attack(int signal)
 	if (!stats)
 		return -EFAULT;
 
+	/* The attack detection is disabled */
+	if (!stats->jiffies)
+		return 0;
+
 	if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
 	      signal == SIGSEGV || signal == SIGSYS))
 		return 0;
@@ -205,9 +269,11 @@ int fbfam_handle_attack(int signal)
 	delta_time = jiffies64_to_msecs(delta_jiffies);
 	crashing_rate = delta_time / (u64)stats->faults;
 
-	if (crashing_rate < (u64)sysctl_crashing_rate_threshold)
-		pr_warn("fbfam: Fork brute force attack detected\n");
+	if (crashing_rate >= (u64)sysctl_crashing_rate_threshold)
+		return 0;
 
+	pr_warn("fbfam: Fork brute force attack detected\n");
+	fbfam_kill_tasks();
 	return 0;
 }