From patchwork Thu Dec 20 19:59:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739521 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E4AA16C5 for ; Thu, 20 Dec 2018 19:59:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D23C628BA7 for ; Thu, 20 Dec 2018 19:59:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C642128BB8; Thu, 20 Dec 2018 19:59:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6DEB228BA7 for ; Thu, 20 Dec 2018 19:59:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731465AbeLTT7n (ORCPT ); Thu, 20 Dec 2018 14:59:43 -0500 Received: from mail-io1-f65.google.com ([209.85.166.65]:33287 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725372AbeLTT7n (ORCPT ); Thu, 20 Dec 2018 14:59:43 -0500 Received: by mail-io1-f65.google.com with SMTP id t24so1842410ioi.0 for ; Thu, 20 Dec 2018 11:59:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cjBNMXJEe7yhynO0BJQEbrj2pSCrolpSJvzLNJ6ZJp8=; b=xWCon+D+98p4uXj+4WDwFI26UDpsxysfv/Ttb3YOd/KjWYaLQWfyD5/dju4gbh+Ls1 TZt34rnd7BlTYJA+9NAs1RXdYItb8RlMWmB9dXxtdsAADwMkceE5oExxtAqWH4ISuvBu dzbnNmkGbRV3AvQr+XKfTULpcWgTDeF9896gN+NSlPj0BlzFxpZrydQgSFhkh+mU5Jwj XOHj1Zyl54dxkOCU3fu8sHOhas27gSfxXjZZa4Yucqxin9ofZ6pqS2ELiruXimPEE7cT dhDVhH8NOF1eM8pokBXn6Pf4UhfqrZIUJw8ARhcA5hY5LvdgkdKf7R5s9ZgtVCsqILss /5Ew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cjBNMXJEe7yhynO0BJQEbrj2pSCrolpSJvzLNJ6ZJp8=; b=W7zA71WcmDl2UjHZ0SiTGBOItv9G1PacmtAp5SbLENaLBWAbhVgp1Fg1UEZK+X3jcW xtRjRA+XwnH8UjV2tSHIAPEydVbcyhAbHOvJLsTzis3cJzigMf7ulTq6cHXrxGtzHoYt mgdWFFw/0hc7xloPfVFwl9hBT/jB1h5YFLL3DMx8M8P//ebw535KzUIBY64L57oWk0o4 LyD15ghjq9s+MA9b0LVer6kLc4W8JEN68Q4HDW08M5yMlpONIV0S1P3UNz4ITTRMfji0 NhJByyAlpCuFM4jxReMJbbxe79htJW3QkPTPiYvACdcZUdJWnXMSZSdctfnACnDaX2Lc iNdg== X-Gm-Message-State: AJcUukdKzYjZQMhYxVRyz38e8qw9HSpLyUejCHq6YSPSL22Oq/88iNUD jZTn0vpRMw299Pe7bWtdbxZjTo4olVyIeQ== X-Google-Smtp-Source: AFSGD/XVKFp6FtAKyTPEnv78j4I3diMZorU00KjQ7xnbkGmUpqd2R5ivgcuLBUWXKD6l7vRd3XD8nQ== X-Received: by 2002:a5e:920c:: with SMTP id y12mr3895835iop.264.1545335981883; Thu, 20 Dec 2018 11:59:41 -0800 (PST) Received: from cisco.lan (71-218-133-134.hlrn.qwest.net. [71.218.133.134]) by smtp.gmail.com with ESMTPSA id b140sm5068429itc.4.2018.12.20.11.59.40 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 20 Dec 2018 11:59:41 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 1/4] expression.h: update comment to include other cast types Date: Thu, 20 Dec 2018 12:59:28 -0700 Message-Id: <20181220195931.20331-2-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 Sender: linux-sparse-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sparse@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This part of the union is used with other cast types as well, so let's include those in the comment. Signed-off-by: Tycho Andersen --- expression.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/expression.h b/expression.h index ba4157f..fece40d 100644 --- a/expression.h +++ b/expression.h @@ -187,7 +187,7 @@ struct expression { struct expression *base; unsigned r_bitpos, r_nrbits; }; - // EXPR_CAST and EXPR_SIZEOF + // EXPR_CAST, FORCE_CAST, EXPR_IMPLIED_CAST and EXPR_SIZEOF struct /* cast_arg */ { struct symbol *cast_type; struct expression *cast_expression; From patchwork Thu Dec 20 19:59:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739523 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 826CA924 for ; Thu, 20 Dec 2018 19:59:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 71C5528BA7 for ; Thu, 20 Dec 2018 19:59:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 65F7E28BB8; Thu, 20 Dec 2018 19:59:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4808028BA7 for ; Thu, 20 Dec 2018 19:59:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731459AbeLTT7p (ORCPT ); Thu, 20 Dec 2018 14:59:45 -0500 Received: from mail-io1-f66.google.com ([209.85.166.66]:46577 "EHLO mail-io1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725372AbeLTT7o (ORCPT ); Thu, 20 Dec 2018 14:59:44 -0500 Received: by mail-io1-f66.google.com with SMTP id v10so1793638ios.13 for ; Thu, 20 Dec 2018 11:59:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=53mM/vlP4AAeJST/175iLnAxhsy2Kek2msz5exqLT30=; b=SWr2cqhdiqHVNDrR9yXDfFTD0iPbq8k/62OTj52o3z3g1I+FcbvZBAjeJH7kVqSU5k GJg+iSq2l8s7ueS3QvLwfJasRHC7v+0XNealML39/jI0jep2R/Z1IgW/SFo/Rx5dPyK+ RS2KYN6XYJggo/z+kiH6nihbBo9xN6QrIm5AxWSTQ5Jo7EALJbYxdeeB4td01E8M1JnQ CCm7ukM8zAd2GroB71GDKDh0gdyZ4pk/n+xGZt6Ic/uhwvoRcVrvCAGxEroYpuk3jWry 5CWoSjwawO1flcZZyw3dkroekjB1pDoRp8wna5Stp6wHUfduaJHSWHZupDfSEXimG/fx RIpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=53mM/vlP4AAeJST/175iLnAxhsy2Kek2msz5exqLT30=; b=beR2vUIgIEITYNrejuGC0/Xr0S6nNiAD/yQvNOignjT9d64xpZLjCxTUa3FB5xEasP A0GTgiMkjVV/kuZqFnj8lYXN6CpVzYx2SEOhnb07GrRt08300lezE2Kru0QEJ7VrAnUV xb7NDh8vKiiQbaq1Yeuhi+rWQT1KUmIH+JNME/K4hfoieuxUOe65W6/JinpdyK5WRoCv faQ6aVbL5uOdUiRn2DlogS/3kqaG66+EM33dhB/EPHtDD3PV1n5tSPCG+8BpIlAaJgw1 mK+7ZNm9wnnvk8OHobArt7zGazsyqy3ZALk08eVk4+fCNj0UHRlT/3ISL2BkQjMSwGHf PiQw== X-Gm-Message-State: AA+aEWaHAjulf7J6sQKyB9iRayId2ODAupE5M1uzldxed0iDsEVsrmpN SXGifIB3bIVt7ayEMZyt00yJ3pdubME+BA== X-Google-Smtp-Source: AFSGD/XqxqKyNimANBd/MpEkQh4ZJu48QPGFiyZYzEWg6y3AnS7xit0Gnt7pgSnApTlo6syWiJpE2g== X-Received: by 2002:a6b:bc83:: with SMTP id m125mr21266468iof.83.1545335983007; Thu, 20 Dec 2018 11:59:43 -0800 (PST) Received: from cisco.lan (71-218-133-134.hlrn.qwest.net. [71.218.133.134]) by smtp.gmail.com with ESMTPSA id b140sm5068429itc.4.2018.12.20.11.59.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 20 Dec 2018 11:59:42 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 2/4] move name-based analysis before linearization Date: Thu, 20 Dec 2018 12:59:29 -0700 Message-Id: <20181220195931.20331-3-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 Sender: linux-sparse-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sparse@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Based on [1], let's move the function name matching analysis to before linearization so that we don't destroy type info. This has the added benefit that we can call the same analysis on inlined functions, since they haven't been totally inlined yet and still have their name information. Unfortunately, this means that walking the AST is much more complicated, because we need to handle all the various expression/statement cases. I've done what I believe is a mostly complete implementation of this, but there are a few TODOs here and there as well. I ported the current size checks to this new point as well. [1]: https://www.spinics.net/lists/linux-sparse/msg09867.html Signed-off-by: Tycho Andersen --- sparse.c | 193 ++++++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 156 insertions(+), 37 deletions(-) diff --git a/sparse.c b/sparse.c index 151eaf4..217a5bf 100644 --- a/sparse.c +++ b/sparse.c @@ -149,68 +149,188 @@ static void check_range_instruction(struct instruction *insn) warning(insn->pos, "value out of range"); } -static void check_byte_count(struct instruction *insn, pseudo_t count) +static void check_byte_count(struct position pos, struct expression *size) { - if (!count) + long long val; + + if (!size) return; - if (count->type == PSEUDO_VAL) { - unsigned long long val = count->value; - if (Wmemcpy_max_count && val > fmemcpy_max_count) - warning(insn->pos, "%s with byte count of %llu", - show_ident(insn->func->sym->ident), val); + + val = get_expression_value_silent(size); + if (val == 0) return; - } + + if (Wmemcpy_max_count && val > fmemcpy_max_count) + warning(pos, "copy with byte count of %llu", val); /* OK, we could try to do the range analysis here */ } -static pseudo_t argument(struct instruction *call, unsigned int argno) +static void check_memset(struct position pos, struct expression_list *args) { - pseudo_t args[8]; - struct ptr_list *arg_list = (struct ptr_list *) call->arguments; - - argno--; - if (linearize_ptr_list(arg_list, (void *)args, 8) > argno) - return args[argno]; - return NULL; + struct expression *size = ptr_list_nth_entry((struct ptr_list *)args, 2); + check_byte_count(pos, size); } -static void check_memset(struct instruction *insn) -{ - check_byte_count(insn, argument(insn, 3)); } + #define check_memcpy check_memset #define check_ctu check_memset #define check_cfu check_memset struct checkfn { struct ident *id; - void (*check)(struct instruction *insn); + void (*check)(struct position pos, struct expression_list *args); +}; + +static const struct checkfn check_fn[] = { + { &memset_ident, check_memset }, + { &memcpy_ident, check_memcpy }, + { ©_to_user_ident, check_ctu }, + { ©_from_user_ident, check_cfu }, }; -static void check_call_instruction(struct instruction *insn) +static void check_one_statement(struct statement *); + +static void check_one_expression(struct position pos, struct expression *expr) { - pseudo_t fn = insn->func; - struct ident *ident; - static const struct checkfn check_fn[] = { - { &memset_ident, check_memset }, - { &memcpy_ident, check_memcpy }, - { ©_to_user_ident, check_ctu }, - { ©_from_user_ident, check_cfu }, - }; + struct expression *call, *fn; + struct symbol *direct; int i; - if (fn->type != PSEUDO_SYM) + if (!expr) + return; + + if (expr->type == EXPR_STATEMENT) { + check_one_statement(expr->statement); + return; + } + + if (expr->type != EXPR_CALL) return; - ident = fn->sym->ident; - if (!ident) + + call = expr; + fn = call->fn; + + direct = NULL; + if (fn->type == EXPR_PREOP) { + if (fn->unop->type == EXPR_SYMBOL) { + struct symbol *sym = fn->unop->symbol; + if (sym->ctype.base_type->type == SYM_FN) + direct = sym; + } + } + + if (!direct) return; + for (i = 0; i < ARRAY_SIZE(check_fn); i++) { - if (check_fn[i].id != ident) + if (check_fn[i].id != direct->ident) continue; - check_fn[i].check(insn); + check_fn[i].check(pos, call->args); + return; + } +} + +static void check_one_statement(struct statement *stmt) +{ + struct ident *ident; + int i; + + if (!stmt) + return; + + switch (stmt->type) { + case STMT_DECLARATION: { + struct symbol *sym; + FOR_EACH_PTR(stmt->declaration, sym) { + check_one_expression(stmt->pos, sym->initializer); + } END_FOR_EACH_PTR(sym); break; } + case STMT_EXPRESSION: + check_one_expression(stmt->pos, stmt->expression); + break; + case STMT_COMPOUND: { + struct statement *s; + + if (stmt->inline_fn) { + ident = stmt->inline_fn->ident; + + for (i = 0; i < ARRAY_SIZE(check_fn); i++) { + struct symbol *sym; + struct expression_list *args = NULL; + + if (check_fn[i].id != ident) + continue; + + FOR_EACH_PTR(stmt->args->declaration, sym) { + add_expression(&args, sym->initializer); + } END_FOR_EACH_PTR(sym); + + check_fn[i].check(stmt->pos, args); + free_ptr_list((struct ptr_list **) &args); + break; + } + break; + } + + FOR_EACH_PTR(stmt->stmts, s) { + check_one_statement(s); + } END_FOR_EACH_PTR(s); + break; + } + case STMT_IF: + check_one_expression(stmt->pos, stmt->if_conditional); + check_one_statement(stmt->if_true); + check_one_statement(stmt->if_false); + break; + case STMT_CASE: + check_one_expression(stmt->pos, stmt->case_expression); + check_one_expression(stmt->pos, stmt->case_to); + check_one_statement(stmt->case_statement); + break; + case STMT_SWITCH: + check_one_expression(stmt->pos, stmt->switch_expression); + check_one_statement(stmt->switch_statement); + break; + case STMT_ITERATOR: + check_one_statement(stmt->iterator_pre_statement); + check_one_expression(stmt->pos, stmt->iterator_pre_condition); + check_one_statement(stmt->iterator_statement); + check_one_statement(stmt->iterator_post_statement); + check_one_expression(stmt->pos, stmt->iterator_post_condition); + break; + case STMT_LABEL: + check_one_statement(stmt->label_statement); + break; + case STMT_RANGE: + check_one_expression(stmt->pos, stmt->range_expression); + check_one_expression(stmt->pos, stmt->range_low); + check_one_expression(stmt->pos, stmt->range_high); + break; + /* + * TODO: STMT_CONTEXT, GOTO, ASM; these could/should all be walked, but + * don't seem super relevant for copy_{to,from}_user(). + */ + default: + return; + } +} + +static void check_symbol(struct symbol *sym) +{ + if (sym->type == SYM_NODE) + sym = sym->ctype.base_type; + + switch (sym->type) { + case SYM_FN: + if (sym->stmt) + check_one_statement(sym->stmt); + break; + default: + return; + } } static void check_one_instruction(struct instruction *insn) @@ -224,9 +344,6 @@ static void check_one_instruction(struct instruction *insn) case OP_RANGE: check_range_instruction(insn); break; - case OP_CALL: - check_call_instruction(insn); - break; default: break; } @@ -314,6 +431,8 @@ static void check_symbols(struct symbol_list *list) struct entrypoint *ep; expand_symbol(sym); + check_symbol(sym); + ep = linearize_symbol(sym); if (ep && ep->entry) { if (dbg_entry) From patchwork Thu Dec 20 19:59:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739525 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0F1B617E1 for ; Thu, 20 Dec 2018 19:59:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F2EF028BA7 for ; Thu, 20 Dec 2018 19:59:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E73C028BB8; Thu, 20 Dec 2018 19:59:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8FD7828BA7 for ; Thu, 20 Dec 2018 19:59:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731489AbeLTT7q (ORCPT ); Thu, 20 Dec 2018 14:59:46 -0500 Received: from mail-io1-f68.google.com ([209.85.166.68]:45298 "EHLO mail-io1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725372AbeLTT7q (ORCPT ); Thu, 20 Dec 2018 14:59:46 -0500 Received: by mail-io1-f68.google.com with SMTP id p7so1696420iog.12 for ; Thu, 20 Dec 2018 11:59:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0EimkVSLHfSwmfcQEy4BKoOrgVdq12HdkoBv4xVscs0=; b=ZXsCv/D1he3mbj80n9D5c5n4Mfz28OCAPGwlTNp/szwd9o50nZ1/Wn27Ebp9i+dYku uFDy3kVGm/c9PBXi5UOSdyDtpfYP0ixSYSMsjF5Bs8HrcbNvdpFzOewAcOcY16mP7BAV 57Qd6h0rt45GaJ4srfIWeJDxa+M8atQjqThpNeH/iikUg0C2DCMxlgYreb/rqw9sApOC wM1hwYTTfuX48pCpDyIHosgEUTqcIRCsncUu9mbtpiL0f1cJTUbdAa/aMpeUk+f6ldrG NkU28tBTkOAqCpEq2tpSAwsOMuHCA7A+CnsUIr1RZw/P0aC9HOQITu/kjUML8CXLe7Fk X7Nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0EimkVSLHfSwmfcQEy4BKoOrgVdq12HdkoBv4xVscs0=; b=bkDjyWw8q586fVoC7O0gySiEBHzKz8ZzmeLnImSjkWrGrUNlOP2XzOXkMszjqVM85Z oYENEpHRHbzwXHiDCshvZhtK92E82qOhPbWjjAJnJGh5/E0ZURF4NqESdhsbvAlZncwD xT7nbbRRjV8sckFnmWXVmOrlKdU5hAKa5LwNWvigFebBWAG19jHvvh+zvEb+mPd9OyNw 4uBlHxsylxwqZElRHqVGSyJ2R++CIChbhL2Vlz1KA2S878XKDomDRTcTTu/7OwEilJRm rPtAfNcBx9AZTGtBDl/RjOEoe4TNfF1+klU1u9Q+tDMwjGfwCqvVdhCJz/CleF8iqYkC GVYw== X-Gm-Message-State: AA+aEWbXIxn++HQLIlpN4pMKLlE6udBGoOgHltDf9060f64KahI0h087 esKCRH2BWBY85a7+hRsPM1FcIN9xBxpPbA== X-Google-Smtp-Source: AFSGD/UjzBl3qVFnPZzo+hXGgtxG3w1oJ+Wq0+EFN7n7Gv0fdGAYIulSkCB+AG2CaOcNIwyKrBKAbg== X-Received: by 2002:a5d:8747:: with SMTP id k7mr20489826iol.279.1545335984354; Thu, 20 Dec 2018 11:59:44 -0800 (PST) Received: from cisco.lan (71-218-133-134.hlrn.qwest.net. [71.218.133.134]) by smtp.gmail.com with ESMTPSA id b140sm5068429itc.4.2018.12.20.11.59.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 20 Dec 2018 11:59:43 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 3/4] add a check for copy_to_user() address spaces Date: Thu, 20 Dec 2018 12:59:30 -0700 Message-Id: <20181220195931.20331-4-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 Sender: linux-sparse-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sparse@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Leaking kernel pointers to userspace is a bad idea, so let's try to do some analysis for it. The basic idea is that every pointer copied to userspace "should be" (but isn't necessarily) annotated with __user, and if it is not, then it's a potential infoleak. The motivation for this is stuff like [1], which is exactly a case of this. Based on a subsequent manual analysis of the uapi headers, I found [2]. Unfortunately in both of these cases, there is void * (for compat) trickery that masks them from actually turning up in this case. But hey, I wrote it and tried it out, so perhaps the code is useful for someone :) [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/block/floppy.c?id=65eea8edc315589d6c993cf12dbb5d0e9ef1fe4e [2]: https://lkml.org/lkml/2018/11/3/142 Signed-off-by: Tycho Andersen --- sparse.c | 93 ++++++++++++++++++++++++++++++++++++++- validation/copy_to_user.c | 31 +++++++++++++ 2 files changed, 123 insertions(+), 1 deletion(-) diff --git a/sparse.c b/sparse.c index 217a5bf..8bcbe16 100644 --- a/sparse.c +++ b/sparse.c @@ -171,13 +171,104 @@ static void check_memset(struct position pos, struct expression_list *args) check_byte_count(pos, size); } +static struct symbol *resolve_arg_type(struct position pos, struct expression *arg) +{ + struct expression *uncast; + + uncast = arg; + switch (arg->type) { + case EXPR_CAST: + case EXPR_FORCE_CAST: + case EXPR_IMPLIED_CAST: + /* + * Undo any casting done by sparse to the function's + * argument type. + */ + uncast = arg->cast_expression; + break; + case EXPR_SYMBOL: + break; + case EXPR_PREOP: + /* + * handle derefs; these are really just the type of the + * resulting expression. + */ + break; + case EXPR_BINOP: + /* TODO: resolve this pointer math if possible? */ + return NULL; + default: + warning(pos, "huh? arg not a cast or symbol? %d", arg->type); + return NULL; + } + + return uncast->ctype->ctype.base_type; +} + +static void check_ptr_in_other_as(struct position pos, struct symbol *sym, int this_as) +{ + struct ident *ident = sym->ident; + + if (sym->type == SYM_NODE) + sym = sym->ctype.base_type; + + switch (sym->type) { + case SYM_ARRAY: + case SYM_PTR: { + if (sym->ctype.as != this_as) + warning(pos, "member %s is a kernel pointer copied to userspace", show_ident(ident)); + check_ptr_in_other_as(pos, sym->ctype.base_type, this_as); + break; + } + case SYM_STRUCT: + case SYM_UNION: { + struct symbol *member; + + FOR_EACH_PTR(sym->symbol_list, member) { + check_ptr_in_other_as(pos, member, this_as); + } END_FOR_EACH_PTR(member); + break; + } + default: + /* + * scalar types are ok + * TODO: what about SYM_LABEL/PREPROCESSOR? + */ + break; + } +} + +static void check_no_kernel_pointers(struct position pos, struct expression_list *args) +{ + struct expression *src = ptr_list_nth_entry((struct ptr_list *)args, 1); + struct symbol *base = NULL; + + if (!Waddress_space) + return; + + /* get the type of src */ + base = resolve_arg_type(pos, src); + + /* + * And deref it to *src; src will *always* be a kernel pointer, and + * we're really after members of structures here, not the pointers + * themselves. So we do this deref at the top level. + */ + base = base->ctype.base_type; + + check_ptr_in_other_as(pos, base, 1); } #define check_memcpy check_memset -#define check_ctu check_memset #define check_cfu check_memset +void check_ctu(struct position pos, struct expression_list *args) +{ + check_memset(pos, args); + check_no_kernel_pointers(pos, args); +} + struct checkfn { struct ident *id; void (*check)(struct position pos, struct expression_list *args); diff --git a/validation/copy_to_user.c b/validation/copy_to_user.c new file mode 100644 index 0000000..d9cded6 --- /dev/null +++ b/validation/copy_to_user.c @@ -0,0 +1,31 @@ +#define __user __attribute__((address_space(1))) + +struct bar { + char *bar_kptr; +}; + +struct foo { + char *foo_kptr; + char __user *uptr; + struct bar bar; +}; + +static void copy_to_user(void __user *to, const void *from, unsigned long n) +{ +} + +static void bar(void) +{ + struct foo f; + void __user *p = (void __user *)0; + + copy_to_user(p, &f, sizeof(f)); +} +/* + * check-name: copy_to_user arguments + * + * check-error-start +copy_to_user.c:22:9: warning: member foo_kptr is a kernel pointer copied to userspace +copy_to_user.c:22:9: warning: member bar_kptr is a kernel pointer copied to userspace + * check-error-end + */ From patchwork Thu Dec 20 19:59:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739527 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8E9076C5 for ; Thu, 20 Dec 2018 19:59:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E2EA28BA7 for ; Thu, 20 Dec 2018 19:59:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 725D828BBA; Thu, 20 Dec 2018 19:59:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0E87428BA7 for ; Thu, 20 Dec 2018 19:59:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731960AbeLTT7r (ORCPT ); Thu, 20 Dec 2018 14:59:47 -0500 Received: from mail-it1-f193.google.com ([209.85.166.193]:54873 "EHLO mail-it1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725372AbeLTT7r (ORCPT ); Thu, 20 Dec 2018 14:59:47 -0500 Received: by mail-it1-f193.google.com with SMTP id i145so3787018ita.4 for ; Thu, 20 Dec 2018 11:59:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8vSgLE0139Uw28lgL88xITvwXqU1EcObnWsuEHGD+5Y=; b=s9T6xbpzYClqxDvUWuehyBSgKZ7K6qBkBvo98Qs5/uEhnJvMC3O14O9cWwuI6ZS1sK j/2zom16LM/I+Ct0E+9ZM1Q18t/zA4fDBKWUIOAKpJgyNU3nuFiI72H8V3RyxEK5lj+I FHOxfHiBDHhBWYit7vy4trGYT2vad67vOtdVarfufckFN0h04nyS1SF3X1DI5wW9AgEV FK80qGc5kh6uKjgPw9U+dUKAPa8bMHqxmTu5mZKQ5dZ13+zFZvLyIgNzpBZpbb79S8yi PSs93puzVzCNRgYrwWOFF687Vvu15lRG4RXpN7A33zfW439Y+3LC7p9ZPoMlZ4o1fwlq xGDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8vSgLE0139Uw28lgL88xITvwXqU1EcObnWsuEHGD+5Y=; b=jELqFY1fjXPd65+IGr2jQxYN3QmaP7DSSk8QmgWO+oxKXN3X5cJo7rsrHGDyPE518K rDsHXz4QmcAbkkZx+t6jQ2kAmYQs7FI8yR4Rqa8oI/Iyaidcsy0G0C5sz1arvSP5zz0h hHdftFVZq1cfxN2CPtmA9nnaBelYgoMA6NDEqP4hC7xzuapnr3fOOlZt0qvcFDe2c7xp icWnYHxLDl2lJdc19tHHc+gIq5atmuM3JJS3P9qQf4Te7b0jehshzIxDNDTRJq+ThZP9 GeYc++Y1UrpU7bQSZUiHADYRZNYa8Xsow3fZhgwAzOoS0ziQN3IzY5UnXAI5u6lTjiy6 JgWA== X-Gm-Message-State: AA+aEWbWSpmYVevbYYTcEw2yXFa9ao2pyNogItWU9DEj/Wf5T3PgbkjB 3npyX7ykVomDJBR7TrjVQR4vbY1spiRz7Q== X-Google-Smtp-Source: ALg8bN5gsxi7ln4LRcWe3zrN7ql/+YYwVfsLywhwUqjMi63YIibJ0+Bf/vZNJ4t/bEmzH4Jl5TQnZA== X-Received: by 2002:a05:660c:74b:: with SMTP id a11mr32937itl.27.1545335985751; Thu, 20 Dec 2018 11:59:45 -0800 (PST) Received: from cisco.lan (71-218-133-134.hlrn.qwest.net. [71.218.133.134]) by smtp.gmail.com with ESMTPSA id b140sm5068429itc.4.2018.12.20.11.59.44 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 20 Dec 2018 11:59:44 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 4/4] check copy_to_user() sizes Date: Thu, 20 Dec 2018 12:59:31 -0700 Message-Id: <20181220195931.20331-5-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 Sender: linux-sparse-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-sparse@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The intent here is to be a smarter version of the size checking that's just a hard coded constant. The insight is that we know the type of the thing that's getting passed to copy_to_user(), so if we can evaluate the size, it should be <= sizeof(*the thing being copied). I've run this over x86 with allyes config, and it didn't find anything, so perhaps it isn't particularly useful. Indeed, most copy_to_user()s look like: copy_to_user(p, &foo, sizeof(foo)) so perhaps this isn't a surprise. The one potentially interesting case is the hard coded array length case: it's possible someone might change one place but not the other. See e.g. the first copy_to_user() in drivers/ata/libata-scsi.c, which looks like: copy_to_user(dst, dev->id, ATA_ID_WORDS * sizeof(u16)) id here is a u16 id[ATA_ID_WORDS], so it would be possible to change this in one place or not the other. Unfortunately, sparse seems to unify the type of dev->id to a u16* instead of u16[], so we can't see through this. Anyway, there are currently no violations of this in the kernel either, although it does emit some false positives like, fs/xfs/xfs_ioctl.c:1813:25: warning: copy_to_user() where size (13) is larger than src (1) due to the above. I wrote the code, so perhaps someday it will be useful to someone, so I'm posting it :) Signed-off-by: Tycho Andersen --- sparse.c | 45 ++++++++++++++++++++++ validation/copy_to_user_sizes.c | 53 ++++++++++++++++++++++++++ validation/copy_to_user_sizes_inline.c | 29 ++++++++++++++ 3 files changed, 127 insertions(+) diff --git a/sparse.c b/sparse.c index 8bcbe16..eb7d5d7 100644 --- a/sparse.c +++ b/sparse.c @@ -31,6 +31,7 @@ #include #include #include +#include #include "lib.h" #include "allocate.h" @@ -259,6 +260,49 @@ static void check_no_kernel_pointers(struct position pos, struct expression_list check_ptr_in_other_as(pos, base, 1); } +static void check_copy_size(struct position pos, struct expression_list *args) +{ + struct expression *src = ptr_list_nth_entry((struct ptr_list *)args, 1); + struct expression *size = ptr_list_nth_entry((struct ptr_list *)args, 2); + long long src_actual = LLONG_MAX; + long long size_actual; + struct symbol *base, *deref; + + /* get the type of src */ + base = resolve_arg_type(pos, src); + + /* and deref it to *src */ + deref = base->ctype.base_type; + + /* + * Note, this is never hit right now because sparse seems to unify + * arrays that are arguments to type *. + */ + if (is_array_type(base)) { + long long array_size = get_expression_value_silent(base->array_size); + printf("%s is array type\n", show_ident(base->ident)); + + /* failed to evaluate */ + if (array_size == 0) + return; + src_actual = array_size * bits_to_bytes(deref->bit_size); + } else { + src_actual = bits_to_bytes(deref->bit_size); + } + + /* Evaluate size, if we can */ + size_actual = get_expression_value_silent(size); + /* + * size_actual == 0 means that get_expression_value failed; of + * course we'll miss something if there is a 0 length copy, but + * then nothing will leak anyway so... + */ + if (size_actual == 0) + return; + + if (size_actual > src_actual) + warning(pos, "copy_to_user() where size (%lld) is larger than src (%lld)", size_actual, src_actual); +} #define check_memcpy check_memset #define check_cfu check_memset @@ -267,6 +311,7 @@ void check_ctu(struct position pos, struct expression_list *args) { check_memset(pos, args); check_no_kernel_pointers(pos, args); + check_copy_size(pos, args); } struct checkfn { diff --git a/validation/copy_to_user_sizes.c b/validation/copy_to_user_sizes.c new file mode 100644 index 0000000..f1f7b8e --- /dev/null +++ b/validation/copy_to_user_sizes.c @@ -0,0 +1,53 @@ +static void copy_to_user(void *to, const void *from, unsigned long n) +{ +} + +struct foo { + int bar; +}; + +static void main(void) +{ + void *p; + struct foo f; + int uninitialized; + + copy_to_user(p, &f, sizeof(f)); + copy_to_user(p, &f, sizeof(f)-1); + copy_to_user(p, &f, sizeof(f)+1); + copy_to_user(p, &f, 1); + copy_to_user(p, &f, 100); + copy_to_user(p, &f, uninitialized); +} + +#if 0 +static void broken(void) +{ + void *p; + char arr[100]; + struct foo foo_arr[2]; + + /* + * These all fail right now, because sparse seems to unify the type of + * arr/foo_arr to char * /struct foo *, instead of char[]/struct foo[]. + * + * That's unfortunate, because it means that these generate false + * positives in the kernel when using a static array length, and that + * seems like a case where this type of check would be especially + * useful. + */ + copy_to_user(p, arr, 100); + copy_to_user(p, arr, 101); + copy_to_user(p, foo_arr, sizeof(foo_arr)); + copy_to_user(p, foo_arr, sizeof(foo_arr)-1); + copy_to_user(p, foo_arr, sizeof(foo_arr[0])*2); +} +#endif +/* + * check-name: copy_to_user sizes + * + * check-error-start +copy_to_user_sizes.c:17:9: warning: copy_to_user() where size (5) is larger than src (4) +copy_to_user_sizes.c:19:9: warning: copy_to_user() where size (100) is larger than src (4) + * check-error-end + */ diff --git a/validation/copy_to_user_sizes_inline.c b/validation/copy_to_user_sizes_inline.c new file mode 100644 index 0000000..bd3c3bd --- /dev/null +++ b/validation/copy_to_user_sizes_inline.c @@ -0,0 +1,29 @@ +static inline void copy_to_user(void *to, const void *from, unsigned long n) +{ +} + +struct foo { + int bar; +}; + +static void main(void) +{ + void *p; + struct foo f; + int uninitialized; + + copy_to_user(p, &f, sizeof(f)); + copy_to_user(p, &f, sizeof(f)-1); + copy_to_user(p, &f, sizeof(f)+1); + copy_to_user(p, &f, 1); + copy_to_user(p, &f, 100); + copy_to_user(p, &f, uninitialized); +} +/* + * check-name: copy_to_user sizes + * + * check-error-start +copy_to_user_sizes_inline.c:17:21: warning: copy_to_user() where size (5) is larger than src (4) +copy_to_user_sizes_inline.c:19:21: warning: copy_to_user() where size (100) is larger than src (4) + * check-error-end + */