Message ID | 1516712825-2917-2-git-send-email-schwidefsky@de.ibm.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote:
> Add the PR_ISOLATE_BP operation to prctl. The effect of the process
> control is to make all branch prediction entries created by the execution
> of the user space code of this task not applicable to kernel code or the
> code of any other task.

What is the rationale for requiring a per-process *opt-in* for this added
protection?

For KPTI on x86, the exact opposite approach is being discussed (see, e.g.
http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By
default, play it safe, with KPTI enabled. But for "trusted" processes, one
may opt out using prctl.

Thanks,
Dominik
On Tue, 23 Jan 2018 18:07:19 +0100
Dominik Brodowski <linux@dominikbrodowski.net> wrote:

> On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote:
> > Add the PR_ISOLATE_BP operation to prctl. The effect of the process
> > control is to make all branch prediction entries created by the execution
> > of the user space code of this task not applicable to kernel code or the
> > code of any other task.
>
> What is the rationale for requiring a per-process *opt-in* for this added
> protection?
>
> For KPTI on x86, the exact opposite approach is being discussed (see, e.g.
> http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By
> default, play it safe, with KPTI enabled. But for "trusted" processes, one
> may opt out using prctl.

The rationale is that there are cases where you got code from *somewhere*
and want to run it in an isolated context. Think: a docker container that
runs under KVM. But with spectre this is still not really safe. So you
include a wrapper program in the docker container to use the trap door
prctl to start the potentially malicious program. Now you should be good, no?
On 01/23/2018 06:07 PM, Dominik Brodowski wrote:
> On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote:
>> Add the PR_ISOLATE_BP operation to prctl. The effect of the process
>> control is to make all branch prediction entries created by the execution
>> of the user space code of this task not applicable to kernel code or the
>> code of any other task.
>
> What is the rationale for requiring a per-process *opt-in* for this added
> protection?
>
> For KPTI on x86, the exact opposite approach is being discussed (see, e.g.
> http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By
> default, play it safe, with KPTI enabled. But for "trusted" processes, one
> may opt out using prctl.

FWIW, this is not about KPTI. s390 always has the kernel in a separate
address space. It's only about potential spectre-like attacks.

The idea is to be able to isolate in controlled environments, e.g. if you
have only one thread with untrusted code (e.g. jitting remote code).
The property of the branch prediction mode on s390 is that it protects in
two ways: against being attacked, but also against being able to attack
via the BTB.
On Wed, Jan 24, 2018 at 07:29:53AM +0100, Martin Schwidefsky wrote: > On Tue, 23 Jan 2018 18:07:19 +0100 > Dominik Brodowski <linux@dominikbrodowski.net> wrote: > > > On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote: > > > Add the PR_ISOLATE_BP operation to prctl. The effect of the process > > > control is to make all branch prediction entries created by the execution > > > of the user space code of this task not applicable to kernel code or the > > > code of any other task. > > > > What is the rationale for requiring a per-process *opt-in* for this added > > protection? > > > > For KPTI on x86, the exact opposite approach is being discussed (see, e.g. > > http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By > > default, play it safe, with KPTI enabled. But for "trusted" processes, one > > may opt out using prctrl. > > The rationale is that there are cases where you got code from *somewhere* > and want to run it in an isolated context. Think: a docker container that > runs under KVM. But with spectre this is still not really safe. So you > include a wrapper program in the docker container to use the trap door > prctl to start the potential malicious program. Now you should be good, no? Well, partly. It may be that s390 and its use cases are special -- but as I understand it, this uapi question goes beyond this question: To my understanding, Linux traditionally tried to aim for the security goal of avoiding information leaks *between* users[+], probably even between processes of the same user. It wasn't a guarantee, and there always were (and will be) information leaks -- and that is where additional safeguards such as seccomp come into play, which reduce the attack surface against unknown or unresolved security-related bugs. And everyone knew (or should have known) that allowing "untrusted" code to be run (be it by an user, be it JavaScript, etc.) is more risky. 
But still, avoiding information leaks between users and between processes was (to my understanding) at least a goal.[§] In recent days however, the outlook on this issue seems to have shifted: - Your proposal would mean to trust all userspace code, unless it is specifically marked as untrusted. As I understand it, this would mean that by default, spectre isn't fully mitigated cross-user and cross-process, though the kernel could. And rogue user-run code may make use of that, unless it is run with a special wrapper. - Concerning x86 and IPBP, the current proposal is to limit the protection offered by IPBP to non-dumpable processes. As I understand it, this would mean that other processes are left hanging out to dry.[~] - Concerning x86 and STIBP, David mentioned that "[t]here's an argument that there are so many other information leaks between HT siblings that we might not care"; in the last couple of hours, a proposal emerged to limit the protection offered by STIBP to non-dumpable processes as well. To my understanding, this would mean that many processes are left hanging out to dry again. I am a bit worried whether this is a sign for a shift in the security goals. I fully understand that there might be processes (e.g. some[?] kernel threads) and users (root) which you need to trust anyway, as they can already access anything. Disabling additional, costly safeguards for those special cases then seems OK. Opting out of additional protections for single-user or single-use systems (haproxy?) might make sense as well. But the kernel[*] not offering full[#] spectre mitigation by default for regular users and their processes? I'm not so sure. Thanks, Dominik [+] root is different. [§] Whether such goals and their pursuit may have legal relevance -- e.g. concerning the criminal law protection against unlawful access to data -- is a related, fascinating topic. [~] For example, I doubt that mutt sets the non-dumpable flag. 
But I wouldn't want other users to be able to read my mail. [#] Well, at least the best the kernel can currently and reasonably manage. [*] Whether CPUs should enable full mitigation (IBRS_ALL) by default in future has been discussed on this list as well.
On Wed, 2018-01-24 at 09:37 +0100, Dominik Brodowski wrote: > On Wed, Jan 24, 2018 at 07:29:53AM +0100, Martin Schwidefsky wrote: > > > > On Tue, 23 Jan 2018 18:07:19 +0100 > > Dominik Brodowski <linux@dominikbrodowski.net> wrote: > > > > > > > > On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote: > > > > > > > > Add the PR_ISOLATE_BP operation to prctl. The effect of the process > > > > control is to make all branch prediction entries created by the execution > > > > of the user space code of this task not applicable to kernel code or the > > > > code of any other task. > > > > > > What is the rationale for requiring a per-process *opt-in* for this added > > > protection? > > > > > > For KPTI on x86, the exact opposite approach is being discussed (see, e.g. > > > http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By > > > default, play it safe, with KPTI enabled. But for "trusted" processes, one > > > may opt out using prctrl. > > > > The rationale is that there are cases where you got code from *somewhere* > > and want to run it in an isolated context. Think: a docker container that > > runs under KVM. But with spectre this is still not really safe. So you > > include a wrapper program in the docker container to use the trap door > > prctl to start the potential malicious program. Now you should be good, no? > > Well, partly. It may be that s390 and its use cases are special -- but as I > understand it, this uapi question goes beyond this question: > > To my understanding, Linux traditionally tried to aim for the security goal > of avoiding information leaks *between* users[+], probably even between > processes of the same user. It wasn't a guarantee, and there always were > (and will be) information leaks -- and that is where additional safeguards > such as seccomp come into play, which reduce the attack surface against > unknown or unresolved security-related bugs. 
And everyone knew (or should > have known) that allowing "untrusted" code to be run (be it by an user, be > it JavaScript, etc.) is more risky. But still, avoiding information leaks > between users and between processes was (to my understanding) at least a > goal.[§] > > In recent days however, the outlook on this issue seems to have shifted: > > - Your proposal would mean to trust all userspace code, unless it is > specifically marked as untrusted. As I understand it, this would mean that > by default, spectre isn't fully mitigated cross-user and cross-process, > though the kernel could. And rogue user-run code may make use of that, > unless it is run with a special wrapper. > > - Concerning x86 and IPBP, the current proposal is to limit the protection > offered by IPBP to non-dumpable processes. As I understand it, this would > mean that other processes are left hanging out to dry.[~] > > - Concerning x86 and STIBP, David mentioned that "[t]here's an argument that > there are so many other information leaks between HT siblings that we > might not care"; in the last couple of hours, a proposal emerged to limit > the protection offered by STIBP to non-dumpable processes as well. To my > understanding, this would mean that many processes are left hanging out to > dry again. > > I am a bit worried whether this is a sign for a shift in the security goals. > I fully understand that there might be processes (e.g. some[?] kernel > threads) and users (root) which you need to trust anyway, as they can > already access anything. Disabling additional, costly safeguards for > those special cases then seems OK. Opting out of additional protections for > single-user or single-use systems (haproxy?) might make sense as well. But > the kernel[*] not offering full[#] spectre mitigation by default for regular > users and their processes? I'm not so sure. Note that for STIBP/IBPB the operation of the flag is different in another way. 
We're using it as a "protect this process from others" flag, not a "protect others from this process" flag. I'm not sure this is a fundamental shift in overall security goals; more a recognition that on *current* hardware the cost of 100% protection against an attack that was fairly unlikely in the first place, is fairly prohibitive. For a process to make itself non-dumpable is a simple enough way to opt in. And *maybe* we could contemplate a command line option for 'IBPB always' but I'm *really* wary of exposing too much of that stuff, rather than simply trying to Do The Right Thing. > [*] Whether CPUs should enable full mitigation (IBRS_ALL) by default > in future has been discussed on this list as well. The kernel will do that; it's just not implemented yet because it's slightly non-trivial and can't be fully tested yet. We *will* want to ALTERNATIVE away the retpolines and just set IBRS_ALL because it'll be faster to do so. For IBRS_ALL, note that we still need the same IBPB flushes on context switch; just not STIBP. That's because IBRS_ALL, as Linus so eloquently reminded us, is *still* a stop-gap measure and not actually a fix. Reading between the lines, I think tagging predictions with the ring (and HT sibling?) they came from is the best they could slip into the next generation without having to stop the fabs for two years while they go back to the drawing board. A real fix will *hopefully* come later, but unfortunately Intel haven't even defined the bit in IA32_ARCH_CAPABILITIES which advertises "you don't have to do any of this shit any more; we fixed it", analogous to their RDCL_NO bit for "no more Meltdown". I'm *hoping* that's just an oversight in preparing the doc and not looking far enough ahead, rather than an actual *intent* to never fix it properly as Linus inferred.
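The "make itself non-dumpable" opt-in mentioned above relies on an existing prctl operation. A minimal sketch, assuming only the standard PR_SET_DUMPABLE and PR_GET_DUMPABLE options from <sys/prctl.h>; the helper names are hypothetical, and whether a kernel ties IBPB or STIBP to this flag is the proposal under discussion here, not something this code can show:

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: clear the dumpable flag of the calling process.
 * Returns 0 on success, -1 on failure (errno is set by prctl).
 */
static int make_non_dumpable(void)
{
	return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
}

/*
 * Hypothetical helper: read back the dumpable flag of the calling
 * process. Returns the flag value, or -1 on failure.
 */
static int get_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

A process that wants the protection would call make_non_dumpable() early in its startup and could confirm the result with get_dumpable(), which should then return 0.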
Hi! On Wed 2018-01-24 09:37:05, Dominik Brodowski wrote: > On Wed, Jan 24, 2018 at 07:29:53AM +0100, Martin Schwidefsky wrote: > > On Tue, 23 Jan 2018 18:07:19 +0100 > > Dominik Brodowski <linux@dominikbrodowski.net> wrote: > > > > > On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote: > > > > Add the PR_ISOLATE_BP operation to prctl. The effect of the process > > > > control is to make all branch prediction entries created by the execution > > > > of the user space code of this task not applicable to kernel code or the > > > > code of any other task. > > > > > > What is the rationale for requiring a per-process *opt-in* for this added > > > protection? > > > > > > For KPTI on x86, the exact opposite approach is being discussed (see, e.g. > > > http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By > > > default, play it safe, with KPTI enabled. But for "trusted" processes, one > > > may opt out using prctrl. > > > > The rationale is that there are cases where you got code from *somewhere* > > and want to run it in an isolated context. Think: a docker container that > > runs under KVM. But with spectre this is still not really safe. So you > > include a wrapper program in the docker container to use the trap door > > prctl to start the potential malicious program. Now you should be good, no? > > Well, partly. It may be that s390 and its use cases are special -- but as I > understand it, this uapi question goes beyond this question: > > To my understanding, Linux traditionally tried to aim for the security goal > of avoiding information leaks *between* users[+], probably even between > processes of the same user. It wasn't a guarantee, and there always It used to be guarantee. It still is, on non-buggy CPUs. Leaks between users need to be prevented. Leaks between one user should be prevented, too. There are various ways to restrict the user these days, and for example sandboxed chromium process should not be able to read my ~/.ssh. 
can_ptrace() is closer to "can allow leaks between these two". Still not quite there, as code might be running in process that can_ptrace(), but the code has been audited by JIT or something not to do syscalls. > (and will be) information leaks -- and that is where additional safeguards > such as seccomp come into play, which reduce the attack surface against > unknown or unresolved security-related bugs. And everyone knew (or should > have known) that allowing "untrusted" code to be run (be it by an user, be > it JavaScript, etc.) is more risky. But still, avoiding information leaks > between users and between processes was (to my understanding) at least a > goal.[§] > > In recent days however, the outlook on this issue seems to have shifted: > > - Your proposal would mean to trust all userspace code, unless it is > specifically marked as untrusted. As I understand it, this would mean that > by default, spectre isn't fully mitigated cross-user and cross-process, > though the kernel could. And rogue user-run code may make use of that, > unless it is run with a special wrapper. Yeah, well, that proposal does not fly, then.
On Wed, 24 Jan 2018 12:15:53 +0100 Pavel Machek <pavel@ucw.cz> wrote: > Hi! > > On Wed 2018-01-24 09:37:05, Dominik Brodowski wrote: > > On Wed, Jan 24, 2018 at 07:29:53AM +0100, Martin Schwidefsky wrote: > > > On Tue, 23 Jan 2018 18:07:19 +0100 > > > Dominik Brodowski <linux@dominikbrodowski.net> wrote: > > > > > > > On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote: > > > > > Add the PR_ISOLATE_BP operation to prctl. The effect of the process > > > > > control is to make all branch prediction entries created by the execution > > > > > of the user space code of this task not applicable to kernel code or the > > > > > code of any other task. > > > > > > > > What is the rationale for requiring a per-process *opt-in* for this added > > > > protection? > > > > > > > > For KPTI on x86, the exact opposite approach is being discussed (see, e.g. > > > > http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By > > > > default, play it safe, with KPTI enabled. But for "trusted" processes, one > > > > may opt out using prctrl. > > > > > > The rationale is that there are cases where you got code from *somewhere* > > > and want to run it in an isolated context. Think: a docker container that > > > runs under KVM. But with spectre this is still not really safe. So you > > > include a wrapper program in the docker container to use the trap door > > > prctl to start the potential malicious program. Now you should be good, no? > > > > Well, partly. It may be that s390 and its use cases are special -- but as I > > understand it, this uapi question goes beyond this question: > > > > To my understanding, Linux traditionally tried to aim for the security goal > > of avoiding information leaks *between* users[+], probably even between > > processes of the same user. It wasn't a guarantee, and there always > > It used to be guarantee. It still is, on non-buggy CPUs. In a perfect world none of this would have ever happened. But reality begs to differ. 
> Leaks between users need to be prevented.
>
> Leaks between one user should be prevented, too. There are various
> ways to restrict the user these days, and for example sandboxed
> chromium process should not be able to read my ~/.ssh.

Interesting that you mention the use case of a sandboxed browser process.
Why do you sandbox it in the first place? Because you do not trust it,
as it might download malicious JavaScript code which uses some form of
attack to read the content of your ~/.ssh files. That is the use case for
the new prctl: limit this piece of code you *identified* as untrusted.

> can_ptrace() is closer to "can allow leaks between these two". Still
> not quite there, as code might be running in process that
> can_ptrace(), but the code has been audited by JIT or something not to
> do syscalls.
>
> > (and will be) information leaks -- and that is where additional safeguards
> > such as seccomp come into play, which reduce the attack surface against
> > unknown or unresolved security-related bugs. And everyone knew (or should
> > have known) that allowing "untrusted" code to be run (be it by an user, be
> > it JavaScript, etc.) is more risky. But still, avoiding information leaks
> > between users and between processes was (to my understanding) at least a
> > goal.[§]
> >
> > In recent days however, the outlook on this issue seems to have shifted:
> >
> > - Your proposal would mean to trust all userspace code, unless it is
> > specifically marked as untrusted. As I understand it, this would mean that
> > by default, spectre isn't fully mitigated cross-user and cross-process,
> > though the kernel could. And rogue user-run code may make use of that,
> > unless it is run with a special wrapper.
>
> Yeah, well, that proposal does not fly, then.

It does not fly as a solution for the general case of cross-process attacks.
But for the special case where you can identify all of the potentially
untrusted code in your setup it should work just fine, no?
On Wed, 24 Jan 2018 09:37:05 +0100
> To my understanding, Linux traditionally tried to aim for the security goal
> of avoiding information leaks *between* users[+], probably even between
> processes of the same user. It wasn't a guarantee, and there always were

Not between processes of the same user in general (see ptrace or use gdb).

> (and will be) information leaks -- and that is where additional safeguards
> such as seccomp come into play, which reduce the attack surface against

seccomp is irrelevant on many processors (see the Armageddon paper). You
can (given willing partners) transfer data into and out of a seccomp
process at quite a respectable rate depending upon your hardware features.

> I am a bit worried whether this is a sign for a shift in the security goals.
> I fully understand that there might be processes (e.g. some[?] kernel
> threads) and users (root) which you need to trust anyway, as they can

dumpable is actually very useful but only in a specific way. If process A
is dumpable by process B then there is no meaningful protection between
them and you don't need to do any work. Likewise if A and B can dump each
other and are both running on the same HT pair you don't have to worry
about them attacking one another.

In all those cases they can do it with ptrace already.

[There's a corner case here of using BPF filters to block ptrace]

Alan
Hi! > > On Wed 2018-01-24 09:37:05, Dominik Brodowski wrote: > > > On Wed, Jan 24, 2018 at 07:29:53AM +0100, Martin Schwidefsky wrote: > > > > On Tue, 23 Jan 2018 18:07:19 +0100 > > > > Dominik Brodowski <linux@dominikbrodowski.net> wrote: > > > > > > > > > On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote: > > > Well, partly. It may be that s390 and its use cases are special -- but as I > > > understand it, this uapi question goes beyond this question: > > > > > > To my understanding, Linux traditionally tried to aim for the security goal > > > of avoiding information leaks *between* users[+], probably even between > > > processes of the same user. It wasn't a guarantee, and there always > > > > It used to be guarantee. It still is, on non-buggy CPUs. > > In a perfect world none of this would have ever happened. > But reality begs to differ. Ok, so: "Linux traditionally guarantees lack of information leaks between PIDs". Yes, you can use ptrace, but that should be it. > > Leaks between users need to be prevented. > > > > Leaks between one user should be prevented, too. There are various > > ways to restrict the user these days, and for example sandboxed > > chromium process should not be able to read my ~/.ssh. > > Interesting that you mention the use case of a sandboxed browser process. > Why do you sandbox it in the first place? Because your do not trust it > as it might download malicious java-script code which uses some form of > attack to read the content of your ~/.ssh files. That is the use case for > the new prctl, limit this piece of code you *identified* as > untrusted. See Alan Cox's replies. Anyway. There's more than one way to mark process as untrusted, (setuid nobody, seccomp, chroot nowhere, ptrace jail, ...). Do not attempt to add prctl() to the list. 
> > > In recent days however, the outlook on this issue seems to have shifted:
> > >
> > > - Your proposal would mean to trust all userspace code, unless it is
> > > specifically marked as untrusted. As I understand it, this would mean that
> > > by default, spectre isn't fully mitigated cross-user and cross-process,
> > > though the kernel could. And rogue user-run code may make use of that,
> > > unless it is run with a special wrapper.
> >
> > Yeah, well, that proposal does not fly, then.
>
> It does not fly as a solution for the general case of cross-process attacks.
> But for the special case where you can identify all of the potential untrusted
> code in your setup it should work just fine, no?

Well.. you can identify all of the untrusted code. Anything that does
not have CAP_HW_ACCESS is untrusted :-).

Anyway, no need to add prctl(): if A can ptrace B and B can ptrace A,
leaking info between them should not be a big deal. You can probably
find existing macros doing the necessary checks.

Pavel
> Anyway, no need to add prctl(), if A can ptrace B and B can ptrace A,
> leaking info between them should not be a big deal. You can probably
> find existing macros doing necessary checks.

Until one of them is security managed so it shouldn't be able to ptrace
the other, or (and this is the nasty one) when a process is executing
code it wants to protect from the rest of the same process (e.g. an
untrusted JVM, JavaScript or, probably nastiest of all, WebAssembly).

We don't need a prctl for trusted/untrusted IMHO but we do eventually
need to think about APIs for "this lot is me but I don't trust
it" (flatpak, docker, etc.) and for what JIT engines need to do.

Alan
On Wed 2018-01-24 20:46:22, Alan Cox wrote:
> > Anyway, no need to add prctl(), if A can ptrace B and B can ptrace A,
> > leaking info between them should not be a big deal. You can probably
> > find existing macros doing necessary checks.
>
> Until one of them is security managed so it shouldn't be able to ptrace
> the other, or (and this is the nasty one) when a process is executing
> code it wants to protect from the rest of the same process (eg an
> untrusted jvm, javascript or probably nastiest of all webassembly)
>
> We don't need a prctl for trusted/untrusted IMHO but we do eventually
> need to think about APIs for "this lot is me but I don't trust
> it" (flatpak, docker, etc) and for what JIT engines need to do.

Agreed.

And yes, JITs are interesting, and given the latest
rowhammer/sidechannel attacks, something we may want to limit in
future...

It sounds nice on paper but is just risky.

Pavel
On Mon, 29 Jan 2018 14:14:46 +0100
Pavel Machek <pavel@ucw.cz> wrote:

> On Wed 2018-01-24 20:46:22, Alan Cox wrote:
> > > Anyway, no need to add prctl(), if A can ptrace B and B can ptrace A,
> > > leaking info between them should not be a big deal. You can probably
> > > find existing macros doing necessary checks.
> >
> > Until one of them is security managed so it shouldn't be able to ptrace
> > the other, or (and this is the nasty one) when a process is executing
> > code it wants to protect from the rest of the same process (eg an
> > untrusted jvm, javascript or probably nastiest of all webassembly)
> >
> > We don't need a prctl for trusted/untrusted IMHO but we do eventually
> > need to think about APIs for "this lot is me but I don't trust
> > it" (flatpak, docker, etc) and for what JIT engines need to do.
>
> Agreed.
>
> And yes, JITs are interesting, and given the latest
> rowhammer/sidechannel attacks, something we may want to limit in
> future...
>
> It sounds nice on paper but is just risky.

I don't think java, javascript, webassembly, (and for some
implementations truetype, pdf, postscript, ... and more) are going away
in a hurry.

Alan
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index af5f8c2..e7b84c9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -207,4 +207,12 @@ struct prctl_mm_map {
 # define PR_SVE_VL_LEN_MASK		0xffff
 # define PR_SVE_VL_INHERIT		(1 << 17) /* inherit across exec */
 
+/*
+ * Prevent branch prediction entries created by the execution of
+ * user space code of this task to be used in any other context.
+ * This makes it impossible for malicious user space code to train
+ * a branch in the kernel code or in another task to be mispredicted.
+ */
+#define PR_ISOLATE_BP		52
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 83ffd7d..e41cb2f 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -117,6 +117,9 @@
 #ifndef SVE_GET_VL
 # define SVE_GET_VL()		(-EINVAL)
 #endif
+#ifndef ISOLATE_BP
+# define ISOLATE_BP()		(-EINVAL)
+#endif
 
 /*
  * this is where the system-wide overflow UID and GID are defined, for
@@ -2398,6 +2401,9 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_SVE_GET_VL:
 		error = SVE_GET_VL();
 		break;
+	case PR_ISOLATE_BP:
+		error = ISOLATE_BP();
+		break;
 	default:
 		error = -EINVAL;
 		break;
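The fallback pattern in the kernel/sys.c hunk can be tried out in user space. The following is only a sketch under stated assumptions: demo_prctl_dispatch() is a hypothetical stand-in for the switch statement in the prctl syscall, and no architecture definition of ISOLATE_BP is present, so the default macro is the one that takes effect:

```c
#include <errno.h>

#define PR_ISOLATE_BP 52	/* value taken from the patch above */

/*
 * An architecture that supports the control would define ISOLATE_BP
 * before this point. Nothing does so in this sketch, which means the
 * fallback below is used and the operation is reported as invalid.
 */
#ifndef ISOLATE_BP
# define ISOLATE_BP()	(-EINVAL)
#endif

/* Hypothetical stand-in for the option switch in the prctl syscall. */
static int demo_prctl_dispatch(int option)
{
	int error;

	switch (option) {
	case PR_ISOLATE_BP:
		error = ISOLATE_BP();
		break;
	default:
		error = -EINVAL;
		break;
	}
	return error;
}
```

With no architecture hook, demo_prctl_dispatch(PR_ISOLATE_BP) yields -EINVAL, the same result as for an unknown option, so a caller on an unsupported architecture cannot tell the two cases apart.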
Add the PR_ISOLATE_BP operation to prctl. The effect of the process
control is to make all branch prediction entries created by the execution
of the user space code of this task not applicable to kernel code or the
code of any other task.

This can be achieved by the architecture specific implementation
in different ways, e.g. by limiting the branch prediction for the task,
or by clearing the branch prediction tables on each context switch, or
by tagging the branch prediction entries in a suitable way.

The architecture code needs to define the ISOLATE_BP macro to implement
the hardware specific details of the branch prediction isolation.

The control cannot be removed from a task once it is activated and it
is inherited by all children of the task.

The user space wrapper to start a program with the isolated branch
prediction:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/prctl.h>

    #ifndef PR_ISOLATE_BP
    #define PR_ISOLATE_BP 52	/* value defined by this patch */
    #endif

    int main(int argc, char *argv[], char *envp[])
    {
    	int rc;

    	if (argc < 2) {
    		fprintf(stderr, "Usage: %s <file-to-exec> <arguments>\n",
    			argv[0]);
    		exit(EXIT_FAILURE);
    	}
    	/* Activate the isolation before the target program starts. */
    	rc = prctl(PR_ISOLATE_BP);
    	if (rc) {
    		perror("PR_ISOLATE_BP");
    		exit(EXIT_FAILURE);
    	}
    	execve(argv[1], argv + 1, envp);
    	/* execve only returns on failure. */
    	perror("execve");
    	exit(EXIT_FAILURE);
    }

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 include/uapi/linux/prctl.h | 8 ++++++++
 kernel/sys.c               | 6 ++++++
 2 files changed, 14 insertions(+)