Message ID | 20170113220256.GA57663@beast (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 13 Jan 2017 at 14:02, Kees Cook wrote: > This plugin detects any structures that contain __user attributes and > makes sure it is being fulling initialized so that a specific class of > information exposure is eliminated. (For example, the exposure of siginfo > in CVE-2013-2141 would have been blocked by this plugin.) why the conditional? the plugin was specifically written to block that bug and block it did ;). > +config GCC_PLUGIN_STRUCTLEAK > + bool "Force initialization of variables containing userspace addresses" > + depends on GCC_PLUGINS > + help > + This plugin zero-initializes any structures that containing a > + __user attribute. This can prevent some classes of information > + exposures. i see that you completely ditched the description in PaX, is there a reason for it? your text isn't correct as is because - the __user attribute (which is an implementation choice, see below) doesn't apply to structures but pointers only (as it does for sparse IIRC) - a structure is a type, but the plugin initializes variables, not types (the latter makes little sense) - the plugin doesn't initialize 'any structures' (well, variables), only locals and only at function scope (subject to further evolution as discussed earlier). > +config GCC_PLUGIN_STRUCTLEAK_VERBOSE > + bool "Report initialized variables" > + depends on GCC_PLUGIN_STRUCTLEAK > + depends on !COMPILE_TEST > + help > + This option will cause a warning to be printed each time the > + structleak plugin finds a variable it thinks needs to be > + initialized. Since not all existing initializers are detected > + by the plugin, this can produce false positive warnings. there are no false positives, a variable either has a constructor or it does not ;) > +/* unused C type flag in all versions 4.5-6 */ > +#define TYPE_USERSPACE(TYPE) TYPE_LANG_FLAG_5(TYPE) FYI, this is a sort of abuse/hack of tree flags and should not be implemented this way in the upstream kernel as it's a finite resource and needs careful verification against all supported gcc versions (these flags are meant for language fronteds, i kinda got lucky to have a few of them unusued but it's not a robust future-proof approach). instead an attribute should be used to mark these types. whether that can/should be __user itself is a good question since that's another hack where the plugin 'hijacks' a sparse address space atttribute (for which gcc 4.6+ has its own facilities and that the checker gcc plugin makes use of thus it's not compatible with structleak as is).
Hi, [adding Dave, so retaining full context below] On Fri, Jan 13, 2017 at 02:02:56PM -0800, Kees Cook wrote: > This plugin detects any structures that contain __user attributes and > makes sure it is being fulling initialized so that a specific class of Nit: s/fulling/fully/ > information exposure is eliminated. (For example, the exposure of siginfo > in CVE-2013-2141 would have been blocked by this plugin.) > > Ported from grsecurity/PaX. This version adds a verbose option to the > plugin and the Kconfig. > > Signed-off-by: Kees Cook <keescook@chromium.org> > --- > arch/Kconfig | 22 +++ > include/linux/compiler.h | 6 +- > scripts/Makefile.gcc-plugins | 4 + > scripts/gcc-plugins/structleak_plugin.c | 246 ++++++++++++++++++++++++++++++++ > 4 files changed, 277 insertions(+), 1 deletion(-) > create mode 100644 scripts/gcc-plugins/structleak_plugin.c I tried giving this a go, but I got the build failure below: ---- [mark@leverpostej:~/src/linux]% uselinaro 15.08 make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- arch/arm64/Makefile:43: Detected assembler with broken .inst; disassembly will be unreliable CHK include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h UPD include/generated/utsrelease.h HOSTCXX -fPIC scripts/gcc-plugins/structleak_plugin.o scripts/gcc-plugins/structleak_plugin.c: In function ‘int plugin_init(plugin_name_args*, plugin_gcc_version*)’: scripts/gcc-plugins/structleak_plugin.c:214:12: error: ‘structleak’ was not declared in this scope PASS_INFO(structleak, "early_optimizations", 1, PASS_POS_INSERT_BEFORE); ^ scripts/gcc-plugins/structleak_plugin.c:214:72: error: ‘PASS_INFO’ was not declared in this scope PASS_INFO(structleak, "early_optimizations", 1, PASS_POS_INSERT_BEFORE); ^ scripts/gcc-plugins/structleak_plugin.c:240:68: error: ‘structleak_pass_info’ was not declared in this scope register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &structleak_pass_info); ^ make[1]: *** [scripts/gcc-plugins/structleak_plugin.o] Error 1 make: *** [gcc-plugins] Error 2 ---- I'm using the Linaro 15.08 x86_64 aarch64-linux-gnu- cross toolchain. I believe that can be found at: https://releases.linaro.org/components/toolchain/binaries/5.1-2015.08/aarch64-linux-gnu/gcc-linaro-5.1-2015.08-x86_64_aarch64-linux-gnu.tar.xz Unfortunately, due to that (and lack of a good testcase), I can't give much substantial feedback on this. > diff --git a/arch/Kconfig b/arch/Kconfig > index 99839c23d453..f1250ba0b06f 100644 > --- a/arch/Kconfig > +++ b/arch/Kconfig > @@ -410,6 +410,28 @@ config GCC_PLUGIN_LATENT_ENTROPY > * https://grsecurity.net/ > * https://pax.grsecurity.net/ > > +config GCC_PLUGIN_STRUCTLEAK > + bool "Force initialization of variables containing userspace addresses" > + depends on GCC_PLUGINS > + help > + This plugin zero-initializes any structures that containing a > + __user attribute. This can prevent some classes of information > + exposures. > + > + This plugin was ported from grsecurity/PaX. More information at: > + * https://grsecurity.net/ > + * https://pax.grsecurity.net/ > + > +config GCC_PLUGIN_STRUCTLEAK_VERBOSE > + bool "Report initialized variables" It might be better to say "Report forcefully initialized variables", to make it clear that this is only reporting initialization performed by the plugin. > + depends on GCC_PLUGIN_STRUCTLEAK > + depends on !COMPILE_TEST > + help > + This option will cause a warning to be printed each time the > + structleak plugin finds a variable it thinks needs to be > + initialized. Since not all existing initializers are detected > + by the plugin, this can produce false positive warnings. > + > config HAVE_CC_STACKPROTECTOR > bool > help > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > index cf0fa5d86059..91c30cba984e 100644 > --- a/include/linux/compiler.h > +++ b/include/linux/compiler.h > @@ -27,7 +27,11 @@ extern void __chk_user_ptr(const volatile void __user *); > extern void __chk_io_ptr(const volatile void __iomem *); > # define ACCESS_PRIVATE(p, member) (*((typeof((p)->member) __force *) &(p)->member)) > #else /* __CHECKER__ */ > -# define __user > +# ifdef STRUCTLEAK_PLUGIN > +# define __user __attribute__((user)) > +# else > +# define __user > +# endif > # define __kernel > # define __safe > # define __force > diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins > index 060d2cb373db..a084f7a511d8 100644 > --- a/scripts/Makefile.gcc-plugins > +++ b/scripts/Makefile.gcc-plugins > @@ -25,6 +25,10 @@ ifdef CONFIG_GCC_PLUGINS > endif > endif > > + gcc-plugin-$(CONFIG_GCC_PLUGIN_STRUCTLEAK) += structleak_plugin.so > + gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE) += -fplugin-arg-structleak_plugin-verbose > + gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK) += -DSTRUCTLEAK_PLUGIN > + > GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) > > export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR > diff --git a/scripts/gcc-plugins/structleak_plugin.c b/scripts/gcc-plugins/structleak_plugin.c > new file mode 100644 > index 000000000000..deddb7291a26 > --- /dev/null > +++ b/scripts/gcc-plugins/structleak_plugin.c > @@ -0,0 +1,246 @@ > +/* > + * Copyright 2013-2017 by PaX Team <pageexec@freemail.hu> > + * Licensed under the GPL v2 > + * > + * Note: the choice of the license means that the compilation process is > + * NOT 'eligible' as defined by gcc's library exception to the GPL v3, > + * but for the kernel it doesn't matter since it doesn't link against > + * any of the gcc libraries It's my understanding that some architectures do link against libgcc, so we might want some kind of guard to avoid mishaps. e.g. ARCH_CAN_USE_NON_GPLV3_GCC_PLUGINS. > + * > + * gcc plugin to forcibly initialize certain local variables that could > + * otherwise leak kernel stack to userland if they aren't properly initialized > + * by later code > + * > + * Homepage: http://pax.grsecurity.net/ > + * > + * Options: > + * -fplugin-arg-structleak_plugin-disable > + * -fplugin-arg-structleak_plugin-verbose > + * > + * Usage: > + * $ # for 4.5/4.6/C based 4.7 > + * $ gcc -I`gcc -print-file-name=plugin`/include -I`gcc -print-file-name=plugin`/include/c-family -fPIC -shared -O2 -o structleak_plugin.so structleak_plugin.c > + * $ # for C++ based 4.7/4.8+ > + * $ g++ -I`g++ -print-file-name=plugin`/include -I`g++ -print-file-name=plugin`/include/c-family -fPIC -shared -O2 -o structleak_plugin.so structleak_plugin.c > + * $ gcc -fplugin=./structleak_plugin.so test.c -O2 > + * > + * TODO: eliminate redundant initializers > + * increase type coverage > + */ > + > +#include "gcc-common.h" > + > +/* unused C type flag in all versions 4.5-6 */ > +#define TYPE_USERSPACE(TYPE) TYPE_LANG_FLAG_5(TYPE) > + > +__visible int plugin_is_GPL_compatible; > + > +static struct plugin_info structleak_plugin_info = { > + .version = "201607271510vanilla", Has this been modified at all in the porting process? Maybe it would be good to include KERNELVERSION here? > + .help = "disable\tdo not activate plugin\n" > + "verbose\tprint all initialized variables\n", > +}; > + > +static bool verbose; > + > +static tree handle_user_attribute(tree *node, tree name, tree args, int flags, bool *no_add_attrs) > +{ > + *no_add_attrs = true; > + > + /* check for types? for now accept everything linux has to offer */ > + if (TREE_CODE(*node) != FIELD_DECL) > + return NULL_TREE; > + > + *no_add_attrs = false; > + return NULL_TREE; > +} > + > +static struct attribute_spec user_attr = { > + .name = "user", > + .min_length = 0, > + .max_length = 0, > + .decl_required = false, > + .type_required = false, > + .function_type_required = false, > + .handler = handle_user_attribute, > +#if BUILDING_GCC_VERSION >= 4007 > + .affects_type_identity = true > +#endif > +}; > + > +static void register_attributes(void *event_data, void *data) > +{ > + register_attribute(&user_attr); > +} > + > +static tree get_field_type(tree field) > +{ > + return strip_array_types(TREE_TYPE(field)); > +} > + > +static bool is_userspace_type(tree type) > +{ > + tree field; > + > + for (field = TYPE_FIELDS(type); field; field = TREE_CHAIN(field)) { > + tree fieldtype = get_field_type(field); > + enum tree_code code = TREE_CODE(fieldtype); > + > + if (code == RECORD_TYPE || code == UNION_TYPE) > + if (is_userspace_type(fieldtype)) > + return true; > + > + if (lookup_attribute("user", DECL_ATTRIBUTES(field))) > + return true; > + } > + return false; > +} > + > +static void finish_type(void *event_data, void *data) > +{ > + tree type = (tree)event_data; > + > + if (type == NULL_TREE || type == error_mark_node) > + return; > + > +#if BUILDING_GCC_VERSION >= 5000 > + if (TREE_CODE(type) == ENUMERAL_TYPE) > + return; > +#endif > + > + if (TYPE_USERSPACE(type)) > + return; > + > + if (is_userspace_type(type)) > + TYPE_USERSPACE(type) = 1; > +} > + > +static void initialize(tree var) > +{ > + basic_block bb; > + gimple_stmt_iterator gsi; > + tree initializer; > + gimple init_stmt; > + > + /* this is the original entry bb before the forced split */ > + bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun)); > + > + /* first check if variable is already initialized, warn otherwise */ > + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) { > + gimple stmt = gsi_stmt(gsi); > + tree rhs1; > + > + /* we're looking for an assignment of a single rhs... */ > + if (!gimple_assign_single_p(stmt)) > + continue; > + rhs1 = gimple_assign_rhs1(stmt); > +#if BUILDING_GCC_VERSION >= 4007 > + /* ... of a non-clobbering expression... */ > + if (TREE_CLOBBER_P(rhs1)) > + continue; > +#endif > + /* ... to our variable... */ > + if (gimple_get_lhs(stmt) != var) > + continue; > + /* if it's an initializer then we're good */ > + if (TREE_CODE(rhs1) == CONSTRUCTOR) > + return; > + } > + > + /* these aren't the 0days you're looking for */ > + if (verbose) > + inform(DECL_SOURCE_LOCATION(var), > + "userspace variable will be forcibly initialized"); > + > + /* build the initializer expression */ > + initializer = build_constructor(TREE_TYPE(var), NULL); > + > + /* build the initializer stmt */ > + init_stmt = gimple_build_assign(var, initializer); > + gsi = gsi_after_labels(single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun))); > + gsi_insert_before(&gsi, init_stmt, GSI_NEW_STMT); > + update_stmt(init_stmt); I assume that this is only guaranteed to initialise fields in a struct, and not padding, is that correct? I ask due to the issue described in: https://lwn.net/Articles/417989/ As a further step, it would be interesting to see if we could fix padding for __user variables, but that certainly shouldn't hold this back. > +} > + > +static unsigned int structleak_execute(void) > +{ > + basic_block bb; > + unsigned int ret = 0; > + tree var; > + unsigned int i; > + > + /* split the first bb where we can put the forced initializers */ > + gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun))); > + bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun)); > + if (!single_pred_p(bb)) { > + split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun))); > + gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun))); > + } > + > + /* enumarate all local variables and forcibly initialize our targets */ Nit: s/enumarate/enumerate/ > + FOR_EACH_LOCAL_DECL(cfun, i, var) { > + tree type = TREE_TYPE(var); > + > + gcc_assert(DECL_P(var)); > + if (!auto_var_in_fn_p(var, current_function_decl)) > + continue; > + > + /* only care about structure types */ > + if (TREE_CODE(type) != RECORD_TYPE && TREE_CODE(type) != UNION_TYPE) > + continue; > + > + /* if the type is of interest, examine the variable */ > + if (TYPE_USERSPACE(type)) > + initialize(var); > + } > + > + return ret; > +} > + > +#define PASS_NAME structleak > +#define NO_GATE > +#define PROPERTIES_REQUIRED PROP_cfg > +#define TODO_FLAGS_FINISH TODO_verify_il | TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func | TODO_remove_unused_locals | TODO_update_ssa | TODO_ggc_collect | TODO_verify_flow > +#include "gcc-generate-gimple-pass.h" > + > +__visible int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gcc_version *version) > +{ > + int i; > + const char * const plugin_name = plugin_info->base_name; > + const int argc = plugin_info->argc; > + const struct plugin_argument * const argv = plugin_info->argv; > + bool enable = true; > + > + PASS_INFO(structleak, "early_optimizations", 1, PASS_POS_INSERT_BEFORE); > + > + if (!plugin_default_version_check(version, &gcc_version)) { > + error(G_("incompatible gcc/plugin versions")); > + return 1; > + } > + > + if (strncmp(lang_hooks.name, "GNU C", 5) && !strncmp(lang_hooks.name, "GNU C+", 6)) { > + inform(UNKNOWN_LOCATION, G_("%s supports C only, not %s"), plugin_name, lang_hooks.name); > + enable = false; > + } > + > + for (i = 0; i < argc; ++i) { > + if (!strcmp(argv[i].key, "disable")) { > + enable = false; > + continue; > + } > + if (!strcmp(argv[i].key, "verbose")) { > + verbose = true; > + continue; > + } > + error(G_("unknown option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key); > + } > + > + register_callback(plugin_name, PLUGIN_INFO, NULL, &structleak_plugin_info); > + if (enable) { > + register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &structleak_pass_info); > + register_callback(plugin_name, PLUGIN_FINISH_TYPE, finish_type, NULL); > + } > + register_callback(plugin_name, PLUGIN_ATTRIBUTES, register_attributes, NULL); > + > + return 0; > +} > -- > 2.7.4 Thanks, Mark.
Hi, On Sat, Jan 14, 2017 at 11:03:14AM +0100, PaX Team wrote: > On 13 Jan 2017 at 14:02, Kees Cook wrote: > > > +config GCC_PLUGIN_STRUCTLEAK_VERBOSE > > + bool "Report initialized variables" > > + depends on GCC_PLUGIN_STRUCTLEAK > > + depends on !COMPILE_TEST > > + help > > + This option will cause a warning to be printed each time the > > + structleak plugin finds a variable it thinks needs to be > > + initialized. Since not all existing initializers are detected > > + by the plugin, this can produce false positive warnings. > > there are no false positives, a variable either has a constructor or it does not ;) ... or it has no constructor, but is clobbered by a memset before it is possibly copied. ;) For example: arch/arm64/kernel/fpsimd.c: In function 'do_fpsimd_exc': arch/arm64/kernel/fpsimd.c:106:12: note: userspace variable will be forcibly initialized siginfo_t info; ... where the code looks like: void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs) { siginfo_t info; unsigned int si_code = 0; if (esr & FPEXC_IOF) si_code = FPE_FLTINV; else if (esr & FPEXC_DZF) si_code = FPE_FLTDIV; else if (esr & FPEXC_OFF) si_code = FPE_FLTOVF; else if (esr & FPEXC_UFF) si_code = FPE_FLTUND; else if (esr & FPEXC_IXF) si_code = FPE_FLTRES; memset(&info, 0, sizeof(info)); info.si_signo = SIGFPE; info.si_code = si_code; info.si_addr = (void __user *)instruction_pointer(regs); send_sig_info(SIGFPE, &info, current); } ... so it's clear to a human that info is initialised prior to use, though not by an explicit field initializer. > > +/* unused C type flag in all versions 4.5-6 */ > > +#define TYPE_USERSPACE(TYPE) TYPE_LANG_FLAG_5(TYPE) > > FYI, this is a sort of abuse/hack of tree flags and should not be implemented this > way in the upstream kernel as it's a finite resource and needs careful verification > against all supported gcc versions (these flags are meant for language fronteds, i > kinda got lucky to have a few of them unusued but it's not a robust future-proof > approach). instead an attribute should be used to mark these types. whether that > can/should be __user itself is a good question since that's another hack where the > plugin 'hijacks' a sparse address space atttribute (for which gcc 4.6+ has its own > facilities and that the checker gcc plugin makes use of thus it's not compatible > with structleak as is). To me, it seems that the __user annotation can only be an indicator of an issue by chance. We have structures with __user pointers in structs that will never be copied to userspace, and conversely we have structs that don't contain a __user field, but will be copied to userspace. Maybe it happens that structs in more complex systems are more likely to contain some __user pointer. Was that part of the rationale? I wonder if there's any analysis we can do of data passing into copy_to_user() and friends. I guess we can't follow the data flow across compilation units, but we might be able to follow it well enough if we added a new attribute that described whether data was to be copied to userspace. Thanks, Mark.
On Mon, 2017-01-16 at 15:24 +0000, Mark Rutland wrote: > Hi, > > On Sat, Jan 14, 2017 at 11:03:14AM +0100, PaX Team wrote: > > On 13 Jan 2017 at 14:02, Kees Cook wrote: > > > > > +config GCC_PLUGIN_STRUCTLEAK_VERBOSE > > > + bool "Report initialized variables" > > > + depends on GCC_PLUGIN_STRUCTLEAK > > > + depends on !COMPILE_TEST > > > + help > > > + This option will cause a warning to be printed each > > > time the > > > + structleak plugin finds a variable it thinks needs to > > > be > > > + initialized. Since not all existing initializers are > > > detected > > > + by the plugin, this can produce false positive > > > warnings. > > > > there are no false positives, a variable either has a constructor or > > it does not ;) > > ... or it has no constructor, but is clobbered by a memset before it > is > possibly copied. ;) > > For example: > > arch/arm64/kernel/fpsimd.c: In function 'do_fpsimd_exc': > arch/arm64/kernel/fpsimd.c:106:12: note: userspace variable will be > forcibly initialized > siginfo_t info; > > ... where the code looks like: > > void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs) > { > siginfo_t info; > unsigned int si_code = 0; > > if (esr & FPEXC_IOF) > si_code = FPE_FLTINV; > else if (esr & FPEXC_DZF) > si_code = FPE_FLTDIV; > else if (esr & FPEXC_OFF) > si_code = FPE_FLTOVF; > else if (esr & FPEXC_UFF) > si_code = FPE_FLTUND; > else if (esr & FPEXC_IXF) > si_code = FPE_FLTRES; > > memset(&info, 0, sizeof(info)); > info.si_signo = SIGFPE; > info.si_code = si_code; > info.si_addr = (void __user *)instruction_pointer(regs); > > send_sig_info(SIGFPE, &info, current); > } > > ... so it's clear to a human that info is initialised prior to use, > though not by an explicit field initializer. It's obvious to the compiler too, not just a human. The compiler can optimize it out when it's so clearly not required, although translation units will get in the way without LTO. There's existing sophisticated analysis for automatic initialization with full coverage. That's part of why I think it makes sense to have a toggle for heuristics vs. full coverage (incl. non-function-scope locals, that's just a heuristic too). > > > +/* unused C type flag in all versions 4.5-6 */ > > > +#define TYPE_USERSPACE(TYPE) TYPE_LANG_FLAG_5(TYPE) > > > > FYI, this is a sort of abuse/hack of tree flags and should not be > > implemented this > > way in the upstream kernel as it's a finite resource and needs > > careful verification > > against all supported gcc versions (these flags are meant for > > language fronteds, i > > kinda got lucky to have a few of them unusued but it's not a robust > > future-proof > > approach). instead an attribute should be used to mark these types. > > whether that > > can/should be __user itself is a good question since that's another > > hack where the > > plugin 'hijacks' a sparse address space atttribute (for which gcc > > 4.6+ has its own > > facilities and that the checker gcc plugin makes use of thus it's > > not compatible > > with structleak as is). > > To me, it seems that the __user annotation can only be an indicator of > an issue by chance. We have structures with __user pointers in structs > that will never be copied to userspace, and conversely we have structs > that don't contain a __user field, but will be copied to userspace. > > Maybe it happens that structs in more complex systems are more likely > to > contain some __user pointer. Was that part of the rationale? > > I wonder if there's any analysis we can do of data passing into > copy_to_user() and friends. I guess we can't follow the data flow > across > compilation units, but we might be able to follow it well enough if we > added a new attribute that described whether data was to be copied to > userspace. > > Thanks, > Mark.
On 16 Jan 2017 at 11:54, Mark Rutland wrote: > > + * Copyright 2013-2017 by PaX Team <pageexec@freemail.hu> > > + * Licensed under the GPL v2 > > + * > > + * Note: the choice of the license means that the compilation process is > > + * NOT 'eligible' as defined by gcc's library exception to the GPL v3, > > + * but for the kernel it doesn't matter since it doesn't link against > > + * any of the gcc libraries > > It's my understanding that some architectures do link against libgcc, so we > might want some kind of guard to avoid mishaps. e.g. > ARCH_CAN_USE_NON_GPLV3_GCC_PLUGINS. AFAIK, plugins aren't enabled on any such archs yet and if/when they get enabled, the better approach would be to simply reimplement those helper routines in the kernel itself like some archs already do. > > + /* build the initializer expression */ > > + initializer = build_constructor(TREE_TYPE(var), NULL); > > + > > + /* build the initializer stmt */ > > + init_stmt = gimple_build_assign(var, initializer); > > + gsi = gsi_after_labels(single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun))); > > + gsi_insert_before(&gsi, init_stmt, GSI_NEW_STMT); > > + update_stmt(init_stmt); > > I assume that this is only guaranteed to initialise fields in a struct, > and not padding, is that correct? I ask due to the issue described in: > > https://lwn.net/Articles/417989/ the 'issue' is that before C11 the standard didn't make it clear that in case of a partial initializer list the compiler has to initialize not only the remaining fields but also padding as well. as for what the above code does, it fixes this as well since gcc doesn't emit the constructor itself which will then match the pattern.
On 16 Jan 2017 at 15:24, Mark Rutland wrote: > To me, it seems that the __user annotation can only be an indicator of > an issue by chance. We have structures with __user pointers in structs > that will never be copied to userspace, and conversely we have structs > that don't contain a __user field, but will be copied to userspace. > > Maybe it happens that structs in more complex systems are more likely to > contain some __user pointer. Was that part of the rationale? it's as i explained in an earlier email: we wanted to pattern match a specific bug situation and this was the easiest way (as you can see, the plugin's code is very simple, not much effort went into it). > I wonder if there's any analysis we can do of data passing into > copy_to_user() and friends. I guess we can't follow the data flow across > compilation units, but we might be able to follow it well enough if we > added a new attribute that described whether data was to be copied to > userspace. there're are all kinds of data flow analyses you can do within and even across translation units (summary info a'la size overflow hash tables or LTO). i never went into that direction because i think the security goal can be achieved without the performance impact of forced initialization.
On Mon, Jan 16, 2017 at 08:22:24PM +0100, PaX Team wrote: > On 16 Jan 2017 at 11:54, Mark Rutland wrote: [...] > > I assume that this is only guaranteed to initialise fields in a struct, > > and not padding, is that correct? I ask due to the issue described in: > > > > https://lwn.net/Articles/417989/ > > the 'issue' is that before C11 the standard didn't make it clear that in > case of a partial initializer list the compiler has to initialize not only > the remaining fields but also padding as well. Which version of the C11 spec clarifies this? The one I'm looking at (n1570, Apr 12 2011) says: "6.7.9 21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration." What is meant by "the remainder of the object" is unclear. Is this just the tail of the object, or all unitialised members -- which may be sparse in the case of an initialiser with designators? And is padding (and if so, which padding) considered to be part of (the remainder of) the object for the purpose of this paragraph? This can be read with the interpretation you suggest, but the wording doesn't seem rock-solid. For the kernel, I guess it's sufficient if GCC commits to this interpretation though. A related, perhaps more interesting issue is that C11 still explicitly states that padding is undefined after assignment: "6.2.6.1 6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values. 51)" "51) Thus, for example, structure assignment need not copy any padding bits." which means that in: { struct foo foo1 = FOO1_INTIALIZER; struct foo foo2; foo2 = foo1; } the contents of padding in foo2 is unspecified. Similarly, piecemeal initialisation of an object by assigning to every member leaves the padding unspecified, e.g.: { struct timeval tv; tv.tv_sec = secs; tv.tv_usec = usecs; } (On most or all arches struct timeval contains no padding, but relying implicitly on this is still a bit iffy.) It's highly likely that the kernel passes objects resulting from one of the above to put_user() and friends. It would be good if we had a way to catch at least some of these. (I don't know whether that falls naturally under this plugin.) Cheers ---Dave IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On 17 Jan 2017 at 10:42, Dave P Martin wrote: > On Mon, Jan 16, 2017 at 08:22:24PM +0100, PaX Team wrote: > > the 'issue' is that before C11 the standard didn't make it clear that in > > case of a partial initializer list the compiler has to initialize not only > > the remaining fields but also padding as well. > > Which version of the C11 spec clarifies this? > > The one I'm looking at (n1570, Apr 12 2011) says: > > "6.7.9 21 If there are fewer initializers in a brace-enclosed list than > there are elements or members of an aggregate, or fewer characters in a > string literal used to initialize an array of known size than there are > elements in the array, the remainder of the aggregate shall be > initialized implicitly the same as objects that have static storage > duration." > > What is meant by "the remainder of the object" is unclear. it's less unclear if you also take into account the rest of the sentence that says "[initialized] the same as objects that have static storage duration" which in turn is described at 6.7.9p10 where it says among others: "if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits" the "and any padding is initialized to zero bits" was the addition in C11 (which i'm not sure was meant as a new feature for C11 or just official enshrinement of what had always been meant). > Is this just the tail of the object, or all unitialised members -- > which may be sparse in the case of an initialiser with designators? And > is padding (and if so, which padding) considered to be part of (the > remainder of) the object for the purpose of this paragraph? to me the quote about 'any padding' is for all kinds of padding. uninitialized members are described in the same paragraph. > This can be read with the interpretation you suggest, but the wording > doesn't seem rock-solid. For the kernel, I guess it's sufficient if > GCC commits to this interpretation though. note that i'm not a language lawyer, merely an interpreter, so take the above with a grain of salt. > A related, perhaps more interesting issue is that C11 still explicitly > states that padding is undefined after assignment: you probably meant 'unspecified'? > "6.2.6.1 6 When a value is stored in an object of structure or union > type, including in a member object, the bytes of the object > representation that correspond to any padding bytes take unspecified > values. 51)" > > "51) Thus, for example, structure assignment need not copy any padding > bits." > > which means that in: > > { > struct foo foo1 = FOO1_INTIALIZER; > struct foo foo2; > > foo2 = foo1; > } > > the contents of padding in foo2 is unspecified. > > Similarly, piecemeal initialisation of an object by assigning to every > member leaves the padding unspecified, e.g.: > > { > struct timeval tv; > > tv.tv_sec = secs; > tv.tv_usec = usecs; > } > > (On most or all arches struct timeval contains no padding, but relying > implicitly on this is still a bit iffy.) > > > It's highly likely that the kernel passes objects resulting from one of > the above to put_user() and friends. It would be good if we had a way > to catch at least some of these. both examples would be handled by the plugin if the types involved already have a __user annotated field (as neither foo2 nor tv are initialized, only assigned to), otherwise it'd need manual annotation and/or more static analysis.
On Mon, Jan 16, 2017 at 08:30:29PM +0100, PaX Team wrote: > On 16 Jan 2017 at 15:24, Mark Rutland wrote: > > > To me, it seems that the __user annotation can only be an indicator of > > an issue by chance. We have structures with __user pointers in structs > > that will never be copied to userspace, and conversely we have structs > > that don't contain a __user field, but will be copied to userspace. > > > > Maybe it happens that structs in more complex systems are more likely to > > contain some __user pointer. Was that part of the rationale? > > it's as i explained in an earlier email: we wanted to pattern match a > specific bug situation and this was the easiest way (as you can see, > the plugin's code is very simple, not much effort went into it). Ok. That being the case, (and given the relevant bug has now been fixed), it's not clear to me what the value of this is today. i.e. given the general case, is this preventing many leaks? > > I wonder if there's any analysis we can do of data passing into > > copy_to_user() and friends. I guess we can't follow the data flow across > > compilation units, but we might be able to follow it well enough if we > > added a new attribute that described whether data was to be copied to > > userspace. > > there're are all kinds of data flow analyses you can do within and even > across translation units (summary info a'la size overflow hash tables or > LTO). Sure. > i never went into that direction because i think the security goal can > be achieved without the performance impact of forced initialization. Was there a particular technique you had in mind? Thanks, Mark.
On Sat, Jan 14, 2017 at 2:03 AM, PaX Team <pageexec@freemail.hu> wrote: > On 13 Jan 2017 at 14:02, Kees Cook wrote: > >> This plugin detects any structures that contain __user attributes and >> makes sure it is being fulling initialized so that a specific class of >> information exposure is eliminated. (For example, the exposure of siginfo >> in CVE-2013-2141 would have been blocked by this plugin.) > > why the conditional? the plugin was specifically written to block that bug > and block it did ;). I can rephrase this. I just wanted to give an example. How about: This plugin detects any structures that contain __user attributes and makes sure it is being fulling initialized so that a specific class of information exposure is eliminated. (This plugin was originally designed to block the exposure of siginfo in CVE-2013-2141.) >> +config GCC_PLUGIN_STRUCTLEAK >> + bool "Force initialization of variables containing userspace addresses" >> + depends on GCC_PLUGINS >> + help >> + This plugin zero-initializes any structures that containing a >> + __user attribute. This can prevent some classes of information >> + exposures. > > i see that you completely ditched the description in PaX, is there a reason > for it? Yup, details below... I didn't use the "bool" text because it wasn't accurate: bool "Forcibly initialize local variables copied to userland" This implies "all", but it's not, and it has nothing to do with stuff copied to userland. It's just looking a __user, with no examination of copy_to_user() calls. The rest: By saying Y here the kernel will zero initialize some local variables that are going to be copied to userland. This in Like the bool text, this isn't accurate. I can be specific about the "some", but the "copied to userland" just isn't true. turn prevents unintended information leakage from the kernel stack should later code forget to explicitly set all parts of the copied variable. I could reuse this part, sure. I tend to like to use the term "exposure" instead of "leak", though, so at this point I just rewrote the description of the "why". The tradeoff is less performance impact than PAX_MEMORY_STACKLEAK at a much smaller coverage. I left this out because upstream doesn't have STACKLEAK to compare against yet. Note that the implementation requires a gcc with plugin support, i.e., gcc 4.5 or newer. You may need to install the supporting headers explicitly in addition to the normal gcc package. I left this out because it's redundant with the description of CONFIG_GCC_PLUGINS. > your text isn't correct as is because > > - the __user attribute (which is an implementation choice, see below) doesn't > apply to structures but pointers only (as it does for sparse IIRC) > - a structure is a type, but the plugin initializes variables, not types > (the latter makes little sense) > - the plugin doesn't initialize 'any structures' (well, variables), only locals > and only at function scope (subject to further evolution as discussed earlier). Including mention of "__user" in Kconfig help text is already a bit "too technical", so I was trying to be concise. I could easily expand this to "any structure variables with __user pointers in function scope", but it seemed like unneeded details. (This level of detail isn't present in the PaX Kconfig either...) >> +config GCC_PLUGIN_STRUCTLEAK_VERBOSE >> + bool "Report initialized variables" >> + depends on GCC_PLUGIN_STRUCTLEAK >> + depends on !COMPILE_TEST >> + help >> + This option will cause a warning to be printed each time the >> + structleak plugin finds a variable it thinks needs to be >> + initialized. Since not all existing initializers are detected >> + by the plugin, this can produce false positive warnings. > > there are no false positives, a variable either has a constructor or it does not ;) Well, as pointed out, there are plenty of false positives where the plug reports the need to initialize the variable when it doesn't. It doesn't report that it's missing a constructor. :) This is a pragmatic description of what is happening, and since the plugin does sometimes needlessly insert initializations where none are needed, that really seems like a false positive to me. :) >> +/* unused C type flag in all versions 4.5-6 */ >> +#define TYPE_USERSPACE(TYPE) TYPE_LANG_FLAG_5(TYPE) > > FYI, this is a sort of abuse/hack of tree flags and should not be implemented this > way in the upstream kernel as it's a finite resource and needs careful verification > against all supported gcc versions (these flags are meant for language fronteds, i > kinda got lucky to have a few of them unusued but it's not a robust future-proof > approach). instead an attribute should be used to mark these types. whether that > can/should be __user itself is a good question since that's another hack where the > plugin 'hijacks' a sparse address space atttribute (for which gcc 4.6+ has its own > facilities and that the checker gcc plugin makes use of thus it's not compatible > with structleak as is). Yeah, I wasn't too happy about that lang flag usage either. It seems reasonable to just build an attribute on the fly. IIRC, initify does this already to mark things it has already processed? -Kees
On Mon, Jan 16, 2017 at 3:54 AM, Mark Rutland <mark.rutland@arm.com> wrote: > Hi, > > [adding Dave, so retaining full context below] > > On Fri, Jan 13, 2017 at 02:02:56PM -0800, Kees Cook wrote: >> This plugin detects any structures that contain __user attributes and >> makes sure it is being fulling initialized so that a specific class of > > Nit: s/fulling/fully/ Whoops, thanks, fixed. >> information exposure is eliminated. (For example, the exposure of siginfo >> in CVE-2013-2141 would have been blocked by this plugin.) >> >> Ported from grsecurity/PaX. This version adds a verbose option to the >> plugin and the Kconfig. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> --- >> arch/Kconfig | 22 +++ >> include/linux/compiler.h | 6 +- >> scripts/Makefile.gcc-plugins | 4 + >> scripts/gcc-plugins/structleak_plugin.c | 246 ++++++++++++++++++++++++++++++++ >> 4 files changed, 277 insertions(+), 1 deletion(-) >> create mode 100644 scripts/gcc-plugins/structleak_plugin.c > > I tried giving this a go, but I got the build failure below: > > ---- > [mark@leverpostej:~/src/linux]% uselinaro 15.08 make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- > arch/arm64/Makefile:43: Detected assembler with broken .inst; disassembly will be unreliable > CHK include/config/kernel.release > CHK include/generated/uapi/linux/version.h > CHK include/generated/utsrelease.h > UPD include/generated/utsrelease.h > HOSTCXX -fPIC scripts/gcc-plugins/structleak_plugin.o > scripts/gcc-plugins/structleak_plugin.c: In function ‘int plugin_init(plugin_name_args*, plugin_gcc_version*)’: > scripts/gcc-plugins/structleak_plugin.c:214:12: error: ‘structleak’ was not declared in this scope > PASS_INFO(structleak, "early_optimizations", 1, PASS_POS_INSERT_BEFORE); > ^ Sorry, yes, this depends on the gcc-plugins changes in -next. > [...] >> diff --git a/arch/Kconfig b/arch/Kconfig >> index 99839c23d453..f1250ba0b06f 100644 >> --- a/arch/Kconfig >> +++ b/arch/Kconfig >> @@ -410,6 +410,28 @@ config GCC_PLUGIN_LATENT_ENTROPY >> * https://grsecurity.net/ >> * https://pax.grsecurity.net/ >> >> +config GCC_PLUGIN_STRUCTLEAK >> + bool "Force initialization of variables containing userspace addresses" >> + depends on GCC_PLUGINS >> + help >> + This plugin zero-initializes any structures that containing a >> + __user attribute. This can prevent some classes of information >> + exposures. >> + >> + This plugin was ported from grsecurity/PaX. More information at: >> + * https://grsecurity.net/ >> + * https://pax.grsecurity.net/ >> + >> +config GCC_PLUGIN_STRUCTLEAK_VERBOSE >> + bool "Report initialized variables" > > It might be better to say "Report forcefully initialized variables", to make it > clear that this is only reporting initialization performed by the plugin. Sounds good, changed. > [...] >> + /* these aren't the 0days you're looking for */ >> + if (verbose) >> + inform(DECL_SOURCE_LOCATION(var), >> + "userspace variable will be forcibly initialized"); >> + >> + /* build the initializer expression */ >> + initializer = build_constructor(TREE_TYPE(var), NULL); >> + >> + /* build the initializer stmt */ >> + init_stmt = gimple_build_assign(var, initializer); >> + gsi = gsi_after_labels(single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun))); >> + gsi_insert_before(&gsi, init_stmt, GSI_NEW_STMT); >> + update_stmt(init_stmt); > > I assume that this is only guaranteed to initialise fields in a struct, > and not padding, is that correct? I ask due to the issue described in: > > https://lwn.net/Articles/417989/ > > As a further step, it would be interesting to see if we could fix > padding for __user variables, but that certainly shouldn't hold this > back. This spawned a whole thread. :) For non-C11, yeah, a plugin to do this would be nice. That's already on the KSPP TODO list, but from what I can tell it either requires walking every variable's structure to check for sizes and padding or do explicitly add a constructor for every variable and hope that the optimization phase does the right thing. ;) >> +} >> + >> +static unsigned int structleak_execute(void) >> +{ >> + basic_block bb; >> + unsigned int ret = 0; >> + tree var; >> + unsigned int i; >> + >> + /* split the first bb where we can put the forced initializers */ >> + gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun))); >> + bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun)); >> + if (!single_pred_p(bb)) { >> + split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun))); >> + gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun))); >> + } >> + >> + /* enumarate all local variables and forcibly initialize our targets */ > > Nit: s/enumarate/enumerate/ Fixed. Thanks for the review! -Kees
On Tue, Jan 17, 2017 at 06:09:49PM +0100, PaX Team wrote: > On 17 Jan 2017 at 10:42, Dave P Martin wrote: > > > On Mon, Jan 16, 2017 at 08:22:24PM +0100, PaX Team wrote: > > > the 'issue' is that before C11 the standard didn't make it clear that in > > > case of a partial initializer list the compiler has to initialize not only > > > the remaining fields but also padding as well. > > > > Which version of the C11 spec clarifies this? > > > > The one I'm looking at (n1570, Apr 12 2011) says: > > > > "6.7.9 21 If there are fewer initializers in a brace-enclosed list than > > there are elements or members of an aggregate, or fewer characters in a > > string literal used to initialize an array of known size than there are > > elements in the array, the remainder of the aggregate shall be > > initialized implicitly the same as objects that have static storage > > duration." > > > > What is meant by "the remainder of the object" is unclear. > > it's less unclear if you also take into account the rest of the > sentence that says "[initialized] the same as objects that have > static storage duration" which in turn is described at 6.7.9p10 > where it says among others: > > "if it is an aggregate, every member is initialized (recursively) > according to these rules, and any padding is initialized to zero bits" > > the "and any padding is initialized to zero bits" was the addition > in C11 (which i'm not sure was meant as a new feature for C11 or > just official enshrinement of what had always been meant). > > > Is this just the tail of the object, or all unitialised members -- > > which may be sparse in the case of an initialiser with designators? And > > is padding (and if so, which padding) considered to be part of (the > > remainder of) the object for the purpose of this paragraph? > > to me the quote about 'any padding' is for all kinds of padding. > uninitialized members are described in the same paragraph. > > > This can be read with the interpretation you suggest, but the wording > > doesn't seem rock-solid. For the kernel, I guess it's sufficient if > > GCC commits to this interpretation though. > > note that i'm not a language lawyer, merely an interpreter, so take > the above with a grain of salt. Me neither, but actually I think this interpretation doesn't make sense. The implication seems to be that padding initialisation of initialised aggregates with automatic storage is guaranteed _only_ if the initialiser is incomplete. Paragraph 19 always applies, but that only asserts "all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration" -- no statement about padding. Paragraph 21 guarantees initialisation of padding bytes, but applies _only_ if the initializer list is incomplete. As a random test, a recent-ish gcc might actually apply this interpretation... tst.c ---8<--- struct foo { long x; char y; long z; int w; }; void f(struct foo *); void g(long *x, char *y, long *z, int *w) { struct foo o = { .z = *z, .y = *y, .x = *x, .w = *w }; f(&o); } void h(int *y, long *z) { struct foo o = { .z = *z, .y = *y }; f(&o); } --->8--- $ gcc (Debian 6.3.0-2) 6.3.0 20161229 $ gcc -Wall -Wextra -pedantic -O2 -std=c11 -c tst.c $ objdump -d tst.o tst.o: file format elf64-littleaarch64 Disassembly of section .text: 0000000000000000 <g>: 0: a9bd7bfd stp x29, x30, [sp, #-48]! 4: 910003fd mov x29, sp 8: 39400024 ldrb w4, [x1] c: f9400005 ldr x5, [x0] 10: 910043a0 add x0, x29, #0x10 14: b9400061 ldr w1, [x3] 18: f9400042 ldr x2, [x2] 1c: 390063a4 strb w4, [x29, #24] 20: f9000ba5 str x5, [x29, #16] 24: f90013a2 str x2, [x29, #32] 28: b9002ba1 str w1, [x29, #40] 2c: 94000000 bl 0 <f> 30: a8c37bfd ldp x29, x30, [sp], #48 34: d65f03c0 ret 0000000000000038 <h>: 38: a9bd7bfd stp x29, x30, [sp, #-48]! 3c: 910003fd mov x29, sp 40: b9400002 ldr w2, [x0] 44: 910043a0 add x0, x29, #0x10 48: f9400021 ldr x1, [x1] 4c: a9017fbf stp xzr, xzr, [x29, #16] 50: a9027fbf stp xzr, xzr, [x29, #32] 54: 390063a2 strb w2, [x29, #24] 58: f90013a1 str x1, [x29, #32] 5c: 94000000 bl 0 <f> 60: a8c37bfd ldp x29, x30, [sp], #48 64: d65f03c0 ret In g(), where the initialiser is complete, the padding in o is definitely not zeroed out. In h(), the padding is all zeroed out, including padding around the zeoed members -- whether gcc intentionally guarantees this or it's just a coincidence, I don't know. This may or may not be a bug, and may or may not have been fixed in a more recent version. > > > A related, perhaps more interesting issue is that C11 still explicitly > > states that padding is undefined after assignment: > > you probably meant 'unspecified'? Indeed. Sloppy of me... > > "6.2.6.1 6 When a value is stored in an object of structure or union > > type, including in a member object, the bytes of the object > > representation that correspond to any padding bytes take unspecified > > values. 51)" > > > > "51) Thus, for example, structure assignment need not copy any padding > > bits." > > > > which means that in: > > > > { > > struct foo foo1 = FOO1_INTIALIZER; > > struct foo foo2; > > > > foo2 = foo1; > > } > > > > the contents of padding in foo2 is unspecified. > > > > Similarly, piecemeal initialisation of an object by assigning to every > > member leaves the padding unspecified, e.g.: > > > > { > > struct timeval tv; > > > > tv.tv_sec = secs; > > tv.tv_usec = usecs; > > } > > > > (On most or all arches struct timeval contains no padding, but relying > > implicitly on this is still a bit iffy.) > > > > > > It's highly likely that the kernel passes objects resulting from one of > > the above to put_user() and friends. It would be good if we had a way > > to catch at least some of these. > > both examples would be handled by the plugin if the types involved already > have a __user annotated field (as neither foo2 nor tv are initialized, only > assigned to), otherwise it'd need manual annotation and/or more static analysis. Do you see any false positives here? I would expect a fair number of structures that contain __user pointers but that are never copied wholesale to userspace, but I've not reviewed in detail. I would also expect false negatives to be common -- structures that lack __user pointers but _are_ copied to userspace. But again, if this plugin is not designed to catch these in the first place, this may be best addressed in a different way. Cheers ---Dave IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On 17 Jan 2017 at 17:48, Mark Rutland wrote: > That being the case, (and given the relevant bug has now been fixed), > it's not clear to me what the value of this is today. i.e. given the > general case, is this preventing many leaks? no idea, i stopped looking at the instrumentation log long ago, but everyone can enable the debug output (has a very specific comment on it ;) and look at the results. i keep this plugin around because it costs nothing to maintain it and the alternative (better) solution doesn't exist yet. > > i never went into that direction because i think the security goal can > > be achieved without the performance impact of forced initialization. > > Was there a particular technique you had in mind? sure, i mentioned it in my SSTIC'12 keynote (page 36): https://pax.grsecurity.net/docs/PaXTeam-SSTIC12-keynote-20-years-of-PaX.pdf
On 17 Jan 2017 at 18:07, Dave P Martin wrote: > On Tue, Jan 17, 2017 at 06:09:49PM +0100, PaX Team wrote: > > On 17 Jan 2017 at 10:42, Dave P Martin wrote: > > > > > This can be read with the interpretation you suggest, but the wording > > > doesn't seem rock-solid. For the kernel, I guess it's sufficient if > > > GCC commits to this interpretation though. > > > > note that i'm not a language lawyer, merely an interpreter, so take > > the above with a grain of salt. > > Me neither, but actually I think this interpretation doesn't make sense. > The implication seems to be that padding initialisation of initialised > aggregates with automatic storage is guaranteed _only_ if the > initialiser is incomplete. however illogical it sounds, that's actually what is implied by the standard and gcc behaves this way. my test case with a bit more verbose output than your example (gcc shows in comments what each assignment corresponds to): ------ cut -------- gcc -O2 -x c -S -fverbose-asm -o - - <<EOF struct s1 { char c; int l; }; void f(struct s1*); void g(void) { struct s1 s1 = {}; struct s1 s2 = { .c = 1}; struct s1 s3 = { .l = 2}; struct s1 s4 = { .c = 3, .l = 4}; f(&s1); f(&s2); f(&s3); f(&s4); } EOF ------ cut -------- > This may or may not be a bug, and may or may not have been fixed in a > more recent version. i tested 4.5.4, 6.3 and 7.0 from about a month ago on amd64 and had consistent results as above. i think it's not a bug but an intentional optimization but i don't feel like digging it out from the gcc sources :). > > both examples would be handled by the plugin if the types involved already > > have a __user annotated field (as neither foo2 nor tv are initialized, only > > assigned to), otherwise it'd need manual annotation and/or more static analysis. > > Do you see any false positives here? I would expect a fair number of > structures that contain __user pointers but that are never copied > wholesale to userspace, but I've not reviewed in detail. false positive depends on what you consider a goal ;). for me the structleak plugin had to serve one specific purpose (initialize a local variable that would otherwise leak unintended kernel stack content back to userland), everything else didn't matter. now if you want to move the goalpost and initialize those and only those variables that get copied to userland but aren't otherwise guaranteed to be completely initialized then you'll need something much smarter than the current structleak plugin as it doesn't serve that purpose and will both overinitialize and fail to initialize affected variables...
On Tue, Jan 17, 2017 at 08:25:49PM +0100, PaX Team wrote: > On 17 Jan 2017 at 18:07, Dave P Martin wrote: > > > On Tue, Jan 17, 2017 at 06:09:49PM +0100, PaX Team wrote: > > > On 17 Jan 2017 at 10:42, Dave P Martin wrote: > > > > > > > This can be read with the interpretation you suggest, but the wording > > > > doesn't seem rock-solid. For the kernel, I guess it's sufficient if > > > > GCC commits to this interpretation though. > > > > > > note that i'm not a language lawyer, merely an interpreter, so take > > > the above with a grain of salt. > > > > Me neither, but actually I think this interpretation doesn't make sense. > > The implication seems to be that padding initialisation of initialised > > aggregates with automatic storage is guaranteed _only_ if the > > initialiser is incomplete. > > however illogical it sounds, that's actually what is implied by the standard > and gcc behaves this way. my test case with a bit more verbose output than > your example (gcc shows in comments what each assignment corresponds to): > > ------ cut -------- > gcc -O2 -x c -S -fverbose-asm -o - - <<EOF > struct s1 { > char c; > int l; > }; > > void f(struct s1*); > > void g(void) > { > struct s1 s1 = {}; > struct s1 s2 = { .c = 1}; > struct s1 s3 = { .l = 2}; > struct s1 s4 = { .c = 3, .l = 4}; > > f(&s1); > f(&s2); > f(&s3); > f(&s4); > } > EOF > ------ cut -------- > > > This may or may not be a bug, and may or may not have been fixed in a > > more recent version. > > i tested 4.5.4, 6.3 and 7.0 from about a month ago on amd64 and had > consistent results as above. i think it's not a bug but an intentional > optimization but i don't feel like digging it out from the gcc sources :). Interesting. So I guess we're probably stuck with it, good or bad. Do you know the rationale behind the inconsistency? > > > both examples would be handled by the plugin if the types involved already > > > have a __user annotated field (as neither foo2 nor tv are initialized, only > > > assigned to), otherwise it'd need manual annotation and/or more static analysis. > > > > Do you see any false positives here? I would expect a fair number of > > structures that contain __user pointers but that are never copied > > wholesale to userspace, but I've not reviewed in detail. > > false positive depends on what you consider a goal ;). for me the structleak > plugin had to serve one specific purpose (initialize a local variable that > would otherwise leak unintended kernel stack content back to userland), everything > else didn't matter. now if you want to move the goalpost and initialize those > and only those variables that get copied to userland but aren't otherwise > guaranteed to be completely initialized then you'll need something much smarter > than the current structleak plugin as it doesn't serve that purpose and will > both overinitialize and fail to initialize affected variables... Sure -- the hope was that structleak might cover this sort of thing without too much extra effort, but it sounds less realistic now. Cheers ---Dave IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On Tue, Jan 17, 2017 at 07:54:38PM +0100, PaX Team wrote: > On 17 Jan 2017 at 17:48, Mark Rutland wrote: > > That being the case, (and given the relevant bug has now been fixed), > > it's not clear to me what the value of this is today. i.e. given the > > general case, is this preventing many leaks? > > no idea, i stopped looking at the instrumentation log long ago, but everyone > can enable the debug output (has a very specific comment on it ;) and look at > the results. i keep this plugin around because it costs nothing to maintain > it and the alternative (better) solution doesn't exist yet. Fair enough; understood. > > > i never went into that direction because i think the security goal can > > > be achieved without the performance impact of forced initialization. > > > > Was there a particular technique you had in mind? > > sure, i mentioned it in my SSTIC'12 keynote (page 36): > https://pax.grsecurity.net/docs/PaXTeam-SSTIC12-keynote-20-years-of-PaX.pdf Thanks for the pointer. I'm probably being very naive here, but IIUC the per-task usercopy stack would require roughly the same analysis to identify relevant variables, unless all local variables (regardless of initialisation) that fed into a usercopy would be on the usercopy stack? Regardless, I can see the benefit of cleanly separating that data from the rest of the kernel data. Thanks, Mark.
diff --git a/arch/Kconfig b/arch/Kconfig index 99839c23d453..f1250ba0b06f 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -410,6 +410,28 @@ config GCC_PLUGIN_LATENT_ENTROPY * https://grsecurity.net/ * https://pax.grsecurity.net/ +config GCC_PLUGIN_STRUCTLEAK + bool "Force initialization of variables containing userspace addresses" + depends on GCC_PLUGINS + help + This plugin zero-initializes any structures that containing a + __user attribute. This can prevent some classes of information + exposures. + + This plugin was ported from grsecurity/PaX. More information at: + * https://grsecurity.net/ + * https://pax.grsecurity.net/ + +config GCC_PLUGIN_STRUCTLEAK_VERBOSE + bool "Report initialized variables" + depends on GCC_PLUGIN_STRUCTLEAK + depends on !COMPILE_TEST + help + This option will cause a warning to be printed each time the + structleak plugin finds a variable it thinks needs to be + initialized. Since not all existing initializers are detected + by the plugin, this can produce false positive warnings. + config HAVE_CC_STACKPROTECTOR bool help diff --git a/include/linux/compiler.h b/include/linux/compiler.h index cf0fa5d86059..91c30cba984e 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -27,7 +27,11 @@ extern void __chk_user_ptr(const volatile void __user *); extern void __chk_io_ptr(const volatile void __iomem *); # define ACCESS_PRIVATE(p, member) (*((typeof((p)->member) __force *) &(p)->member)) #else /* __CHECKER__ */ -# define __user +# ifdef STRUCTLEAK_PLUGIN +# define __user __attribute__((user)) +# else +# define __user +# endif # define __kernel # define __safe # define __force diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index 060d2cb373db..a084f7a511d8 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefile.gcc-plugins @@ -25,6 +25,10 @@ ifdef CONFIG_GCC_PLUGINS endif endif + gcc-plugin-$(CONFIG_GCC_PLUGIN_STRUCTLEAK) += structleak_plugin.so + gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE) += -fplugin-arg-structleak_plugin-verbose + gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK) += -DSTRUCTLEAK_PLUGIN + GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR diff --git a/scripts/gcc-plugins/structleak_plugin.c b/scripts/gcc-plugins/structleak_plugin.c new file mode 100644 index 000000000000..deddb7291a26 --- /dev/null +++ b/scripts/gcc-plugins/structleak_plugin.c @@ -0,0 +1,246 @@ +/* + * Copyright 2013-2017 by PaX Team <pageexec@freemail.hu> + * Licensed under the GPL v2 + * + * Note: the choice of the license means that the compilation process is + * NOT 'eligible' as defined by gcc's library exception to the GPL v3, + * but for the kernel it doesn't matter since it doesn't link against + * any of the gcc libraries + * + * gcc plugin to forcibly initialize certain local variables that could + * otherwise leak kernel stack to userland if they aren't properly initialized + * by later code + * + * Homepage: http://pax.grsecurity.net/ + * + * Options: + * -fplugin-arg-structleak_plugin-disable + * -fplugin-arg-structleak_plugin-verbose + * + * Usage: + * $ # for 4.5/4.6/C based 4.7 + * $ gcc -I`gcc -print-file-name=plugin`/include -I`gcc -print-file-name=plugin`/include/c-family -fPIC -shared -O2 -o structleak_plugin.so structleak_plugin.c + * $ # for C++ based 4.7/4.8+ + * $ g++ -I`g++ -print-file-name=plugin`/include -I`g++ -print-file-name=plugin`/include/c-family -fPIC -shared -O2 -o structleak_plugin.so structleak_plugin.c + * $ gcc -fplugin=./structleak_plugin.so test.c -O2 + * + * TODO: eliminate redundant initializers + * increase type coverage + */ + +#include "gcc-common.h" + +/* unused C type flag in all versions 4.5-6 */ +#define TYPE_USERSPACE(TYPE) TYPE_LANG_FLAG_5(TYPE) + +__visible int plugin_is_GPL_compatible; + +static struct plugin_info structleak_plugin_info = { + .version = "201607271510vanilla", + .help = "disable\tdo not activate plugin\n" + "verbose\tprint all initialized variables\n", +}; + +static bool verbose; + +static tree handle_user_attribute(tree *node, tree name, tree args, int flags, bool *no_add_attrs) +{ + *no_add_attrs = true; + + /* check for types? for now accept everything linux has to offer */ + if (TREE_CODE(*node) != FIELD_DECL) + return NULL_TREE; + + *no_add_attrs = false; + return NULL_TREE; +} + +static struct attribute_spec user_attr = { + .name = "user", + .min_length = 0, + .max_length = 0, + .decl_required = false, + .type_required = false, + .function_type_required = false, + .handler = handle_user_attribute, +#if BUILDING_GCC_VERSION >= 4007 + .affects_type_identity = true +#endif +}; + +static void register_attributes(void *event_data, void *data) +{ + register_attribute(&user_attr); +} + +static tree get_field_type(tree field) +{ + return strip_array_types(TREE_TYPE(field)); +} + +static bool is_userspace_type(tree type) +{ + tree field; + + for (field = TYPE_FIELDS(type); field; field = TREE_CHAIN(field)) { + tree fieldtype = get_field_type(field); + enum tree_code code = TREE_CODE(fieldtype); + + if (code == RECORD_TYPE || code == UNION_TYPE) + if (is_userspace_type(fieldtype)) + return true; + + if (lookup_attribute("user", DECL_ATTRIBUTES(field))) + return true; + } + return false; +} + +static void finish_type(void *event_data, void *data) +{ + tree type = (tree)event_data; + + if (type == NULL_TREE || type == error_mark_node) + return; + +#if BUILDING_GCC_VERSION >= 5000 + if (TREE_CODE(type) == ENUMERAL_TYPE) + return; +#endif + + if (TYPE_USERSPACE(type)) + return; + + if (is_userspace_type(type)) + TYPE_USERSPACE(type) = 1; +} + +static void initialize(tree var) +{ + basic_block bb; + gimple_stmt_iterator gsi; + tree initializer; + gimple init_stmt; + + /* this is the original entry bb before the forced split */ + bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun)); + + /* first check if variable is already initialized, warn otherwise */ + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) { + gimple stmt = gsi_stmt(gsi); + tree rhs1; + + /* we're looking for an assignment of a single rhs... */ + if (!gimple_assign_single_p(stmt)) + continue; + rhs1 = gimple_assign_rhs1(stmt); +#if BUILDING_GCC_VERSION >= 4007 + /* ... of a non-clobbering expression... */ + if (TREE_CLOBBER_P(rhs1)) + continue; +#endif + /* ... to our variable... */ + if (gimple_get_lhs(stmt) != var) + continue; + /* if it's an initializer then we're good */ + if (TREE_CODE(rhs1) == CONSTRUCTOR) + return; + } + + /* these aren't the 0days you're looking for */ + if (verbose) + inform(DECL_SOURCE_LOCATION(var), + "userspace variable will be forcibly initialized"); + + /* build the initializer expression */ + initializer = build_constructor(TREE_TYPE(var), NULL); + + /* build the initializer stmt */ + init_stmt = gimple_build_assign(var, initializer); + gsi = gsi_after_labels(single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun))); + gsi_insert_before(&gsi, init_stmt, GSI_NEW_STMT); + update_stmt(init_stmt); +} + +static unsigned int structleak_execute(void) +{ + basic_block bb; + unsigned int ret = 0; + tree var; + unsigned int i; + + /* split the first bb where we can put the forced initializers */ + gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun))); + bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun)); + if (!single_pred_p(bb)) { + split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun))); + gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun))); + } + + /* enumarate all local variables and forcibly initialize our targets */ + FOR_EACH_LOCAL_DECL(cfun, i, var) { + tree type = TREE_TYPE(var); + + gcc_assert(DECL_P(var)); + if (!auto_var_in_fn_p(var, current_function_decl)) + continue; + + /* only care about structure types */ + if (TREE_CODE(type) != RECORD_TYPE && TREE_CODE(type) != UNION_TYPE) + continue; + + /* if the type is of interest, examine the variable */ + if (TYPE_USERSPACE(type)) + initialize(var); + } + + return ret; +} + +#define PASS_NAME structleak +#define NO_GATE +#define PROPERTIES_REQUIRED PROP_cfg +#define TODO_FLAGS_FINISH TODO_verify_il | TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func | TODO_remove_unused_locals | TODO_update_ssa | TODO_ggc_collect | TODO_verify_flow +#include "gcc-generate-gimple-pass.h" + +__visible int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gcc_version *version) +{ + int i; + const char * const plugin_name = plugin_info->base_name; + const int argc = plugin_info->argc; + const struct plugin_argument * const argv = plugin_info->argv; + bool enable = true; + + PASS_INFO(structleak, "early_optimizations", 1, PASS_POS_INSERT_BEFORE); + + if (!plugin_default_version_check(version, &gcc_version)) { + error(G_("incompatible gcc/plugin versions")); + return 1; + } + + if (strncmp(lang_hooks.name, "GNU C", 5) && !strncmp(lang_hooks.name, "GNU C+", 6)) { + inform(UNKNOWN_LOCATION, G_("%s supports C only, not %s"), plugin_name, lang_hooks.name); + enable = false; + } + + for (i = 0; i < argc; ++i) { + if (!strcmp(argv[i].key, "disable")) { + enable = false; + continue; + } + if (!strcmp(argv[i].key, "verbose")) { + verbose = true; + continue; + } + error(G_("unknown option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key); + } + + register_callback(plugin_name, PLUGIN_INFO, NULL, &structleak_plugin_info); + if (enable) { + register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &structleak_pass_info); + register_callback(plugin_name, PLUGIN_FINISH_TYPE, finish_type, NULL); + } + register_callback(plugin_name, PLUGIN_ATTRIBUTES, register_attributes, NULL); + + return 0; +}
This plugin detects any structures that contain __user attributes and makes sure it is being fulling initialized so that a specific class of information exposure is eliminated. (For example, the exposure of siginfo in CVE-2013-2141 would have been blocked by this plugin.) Ported from grsecurity/PaX. This version adds a verbose option to the plugin and the Kconfig. Signed-off-by: Kees Cook <keescook@chromium.org> --- arch/Kconfig | 22 +++ include/linux/compiler.h | 6 +- scripts/Makefile.gcc-plugins | 4 + scripts/gcc-plugins/structleak_plugin.c | 246 ++++++++++++++++++++++++++++++++ 4 files changed, 277 insertions(+), 1 deletion(-) create mode 100644 scripts/gcc-plugins/structleak_plugin.c