From patchwork Thu Dec 20 19:59:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739531 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8BEE26C2 for ; Thu, 20 Dec 2018 20:00:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7D2B928BAD for ; Thu, 20 Dec 2018 20:00:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 70C8928BBA; Thu, 20 Dec 2018 20:00:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id E839B28BAD for ; Thu, 20 Dec 2018 20:00:02 +0000 (UTC) Received: (qmail 17870 invoked by uid 550); 20 Dec 2018 19:59:54 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 17699 invoked from network); 20 Dec 2018 19:59:54 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cjBNMXJEe7yhynO0BJQEbrj2pSCrolpSJvzLNJ6ZJp8=; b=xWCon+D+98p4uXj+4WDwFI26UDpsxysfv/Ttb3YOd/KjWYaLQWfyD5/dju4gbh+Ls1 TZt34rnd7BlTYJA+9NAs1RXdYItb8RlMWmB9dXxtdsAADwMkceE5oExxtAqWH4ISuvBu dzbnNmkGbRV3AvQr+XKfTULpcWgTDeF9896gN+NSlPj0BlzFxpZrydQgSFhkh+mU5Jwj XOHj1Zyl54dxkOCU3fu8sHOhas27gSfxXjZZa4Yucqxin9ofZ6pqS2ELiruXimPEE7cT dhDVhH8NOF1eM8pokBXn6Pf4UhfqrZIUJw8ARhcA5hY5LvdgkdKf7R5s9ZgtVCsqILss /5Ew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cjBNMXJEe7yhynO0BJQEbrj2pSCrolpSJvzLNJ6ZJp8=; b=AEQWKyjNkukO77zNV1QlSHrs8g3RX9Tj+4CRbrJklbnDnjhljjOSG/vgTORAZs1d7G b/KR2bK+sFdr8FRArUoEOgucYuFSAm+ZPsFihQ1NQAkgHqhrH3JFm/BR5iKCerWOi68/ JxJ+Zn4G4BqE1wwmFWhjg+qWc6WNS9NoU9gIrC1n5vDbF06sgI7xNi8h4p30uU9qJzfD /1LlI7HQf93P8NeptX+ZPqDC13XYONKa8LOr5i70TIt9dztvBYHqp6RcuOpyTR33PnJq K4xOAB9xEtXguIWnaEMdA9D9I/ul9uJidx2PWu+KYTu+YOYYFkavjKMdhn5sh2fCESEY FTFQ== X-Gm-Message-State: AJcUukfTeAzgJyqmwzwUCuoyFXIBDGLl5sX1RG8VBgSZkvBfs9EVb6+g 2wOhD3s2mrYHrlMeQDzZ6SbenQ== X-Google-Smtp-Source: AFSGD/XVKFp6FtAKyTPEnv78j4I3diMZorU00KjQ7xnbkGmUpqd2R5ivgcuLBUWXKD6l7vRd3XD8nQ== X-Received: by 2002:a5e:920c:: with SMTP id y12mr3895835iop.264.1545335981883; Thu, 20 Dec 2018 11:59:41 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 1/4] expression.h: update comment to include other cast types Date: Thu, 20 Dec 2018 12:59:28 -0700 Message-Id: <20181220195931.20331-2-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP This part of the union is used with other cast types as well, so let's include those in the comment. Signed-off-by: Tycho Andersen --- expression.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/expression.h b/expression.h index ba4157f..fece40d 100644 --- a/expression.h +++ b/expression.h @@ -187,7 +187,7 @@ struct expression { struct expression *base; unsigned r_bitpos, r_nrbits; }; - // EXPR_CAST and EXPR_SIZEOF + // EXPR_CAST, FORCE_CAST, EXPR_IMPLIED_CAST and EXPR_SIZEOF struct /* cast_arg */ { struct symbol *cast_type; struct expression *cast_expression; From patchwork Thu Dec 20 19:59:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739533 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A66A36C2 for ; Thu, 20 Dec 2018 20:00:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 99D5528C03 for ; Thu, 20 Dec 2018 20:00:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 963E228C20; Thu, 20 Dec 2018 20:00:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 854E828BFD for ; Thu, 20 Dec 2018 20:00:11 +0000 (UTC) Received: (qmail 18024 invoked by uid 550); 20 Dec 2018 19:59:56 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 17836 invoked from network); 20 Dec 2018 19:59:54 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=53mM/vlP4AAeJST/175iLnAxhsy2Kek2msz5exqLT30=; b=SWr2cqhdiqHVNDrR9yXDfFTD0iPbq8k/62OTj52o3z3g1I+FcbvZBAjeJH7kVqSU5k GJg+iSq2l8s7ueS3QvLwfJasRHC7v+0XNealML39/jI0jep2R/Z1IgW/SFo/Rx5dPyK+ RS2KYN6XYJggo/z+kiH6nihbBo9xN6QrIm5AxWSTQ5Jo7EALJbYxdeeB4td01E8M1JnQ CCm7ukM8zAd2GroB71GDKDh0gdyZ4pk/n+xGZt6Ic/uhwvoRcVrvCAGxEroYpuk3jWry 5CWoSjwawO1flcZZyw3dkroekjB1pDoRp8wna5Stp6wHUfduaJHSWHZupDfSEXimG/fx RIpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=53mM/vlP4AAeJST/175iLnAxhsy2Kek2msz5exqLT30=; b=aBFjHgTK78ZIhxoEFs7lWIbudCeJiYSKAxWF8YRQARp1IKvMyoWJjZl4EJQF22BzhE tFQM4ePk8kAorSO1F5JfaJSSorGPZiYUU0KaSBhYlbgDKgwjp8B/JEtgkCEGYcG867DV EdpEDWXfnMrDtsMlSjuOWDBejSkYy7XBLJnxIyqc1plOWBJCHtiH9JAwsohV8zEVHief h2MiXSYVaTCgPbEkJ5fjav2ley21SEXKjtnRjlvkFqxdG/tFqdRj4j8W3u1Bp4UOwpL8 1UElGfhdZpoNJ/xbd/yPuaimO4aclJIOJp/FPZUFvr3nzVv74EBCH6fHpfHCgq4YS/ox Kgbg== X-Gm-Message-State: AA+aEWbjWRsdQVoICin6OSwf15egDQjbYu+GECzVZuplh27bFy1xkcMv BcpIR+kbyjTyuYEs3UsVAzQ8i+vnWvlHOA== X-Google-Smtp-Source: AFSGD/XqxqKyNimANBd/MpEkQh4ZJu48QPGFiyZYzEWg6y3AnS7xit0Gnt7pgSnApTlo6syWiJpE2g== X-Received: by 2002:a6b:bc83:: with SMTP id m125mr21266468iof.83.1545335983007; Thu, 20 Dec 2018 11:59:43 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 2/4] move name-based analysis before linearization Date: Thu, 20 Dec 2018 12:59:29 -0700 Message-Id: <20181220195931.20331-3-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Based on [1], let's move the function name matching analysis to before linearization so that we don't destroy type info. This has the added benefit that we can call the same analysis on inlined functions, since they haven't been totally inlined yet and still have their name information. Unfortunately, this means that walking the AST is much more complicated, because we need to handle all the various expression/statement cases. I've done what I believe is a mostly complete implementation of this, but there are a few TODOs here and there as well. I ported the current size checks to this new point as well. [1]: https://www.spinics.net/lists/linux-sparse/msg09867.html Signed-off-by: Tycho Andersen --- sparse.c | 193 ++++++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 156 insertions(+), 37 deletions(-) diff --git a/sparse.c b/sparse.c index 151eaf4..217a5bf 100644 --- a/sparse.c +++ b/sparse.c @@ -149,68 +149,188 @@ static void check_range_instruction(struct instruction *insn) warning(insn->pos, "value out of range"); } -static void check_byte_count(struct instruction *insn, pseudo_t count) +static void check_byte_count(struct position pos, struct expression *size) { - if (!count) + long long val; + + if (!size) return; - if (count->type == PSEUDO_VAL) { - unsigned long long val = count->value; - if (Wmemcpy_max_count && val > fmemcpy_max_count) - warning(insn->pos, "%s with byte count of %llu", - show_ident(insn->func->sym->ident), val); + + val = get_expression_value_silent(size); + if (val == 0) return; - } + + if (Wmemcpy_max_count && val > fmemcpy_max_count) + warning(pos, "copy with byte count of %llu", val); /* OK, we could try to do the range analysis here */ } -static pseudo_t argument(struct instruction *call, unsigned int argno) +static void check_memset(struct position pos, struct expression_list *args) { - pseudo_t args[8]; - struct ptr_list *arg_list = (struct ptr_list *) call->arguments; - - argno--; - if (linearize_ptr_list(arg_list, (void *)args, 8) > argno) - return args[argno]; - return NULL; + struct expression *size = ptr_list_nth_entry((struct ptr_list *)args, 2); + check_byte_count(pos, size); } -static void check_memset(struct instruction *insn) -{ - check_byte_count(insn, argument(insn, 3)); } + #define check_memcpy check_memset #define check_ctu check_memset #define check_cfu check_memset struct checkfn { struct ident *id; - void (*check)(struct instruction *insn); + void (*check)(struct position pos, struct expression_list *args); +}; + +static const struct checkfn check_fn[] = { + { &memset_ident, check_memset }, + { &memcpy_ident, check_memcpy }, + { ©_to_user_ident, check_ctu }, + { ©_from_user_ident, check_cfu }, }; -static void check_call_instruction(struct instruction *insn) +static void check_one_statement(struct statement *); + +static void check_one_expression(struct position pos, struct expression *expr) { - pseudo_t fn = insn->func; - struct ident *ident; - static const struct checkfn check_fn[] = { - { &memset_ident, check_memset }, - { &memcpy_ident, check_memcpy }, - { ©_to_user_ident, check_ctu }, - { ©_from_user_ident, check_cfu }, - }; + struct expression *call, *fn; + struct symbol *direct; int i; - if (fn->type != PSEUDO_SYM) + if (!expr) + return; + + if (expr->type == EXPR_STATEMENT) { + check_one_statement(expr->statement); + return; + } + + if (expr->type != EXPR_CALL) return; - ident = fn->sym->ident; - if (!ident) + + call = expr; + fn = call->fn; + + direct = NULL; + if (fn->type == EXPR_PREOP) { + if (fn->unop->type == EXPR_SYMBOL) { + struct symbol *sym = fn->unop->symbol; + if (sym->ctype.base_type->type == SYM_FN) + direct = sym; + } + } + + if (!direct) return; + for (i = 0; i < ARRAY_SIZE(check_fn); i++) { - if (check_fn[i].id != ident) + if (check_fn[i].id != direct->ident) continue; - check_fn[i].check(insn); + check_fn[i].check(pos, call->args); + return; + } +} + +static void check_one_statement(struct statement *stmt) +{ + struct ident *ident; + int i; + + if (!stmt) + return; + + switch (stmt->type) { + case STMT_DECLARATION: { + struct symbol *sym; + FOR_EACH_PTR(stmt->declaration, sym) { + check_one_expression(stmt->pos, sym->initializer); + } END_FOR_EACH_PTR(sym); break; } + case STMT_EXPRESSION: + check_one_expression(stmt->pos, stmt->expression); + break; + case STMT_COMPOUND: { + struct statement *s; + + if (stmt->inline_fn) { + ident = stmt->inline_fn->ident; + + for (i = 0; i < ARRAY_SIZE(check_fn); i++) { + struct symbol *sym; + struct expression_list *args = NULL; + + if (check_fn[i].id != ident) + continue; + + FOR_EACH_PTR(stmt->args->declaration, sym) { + add_expression(&args, sym->initializer); + } END_FOR_EACH_PTR(sym); + + check_fn[i].check(stmt->pos, args); + free_ptr_list((struct ptr_list **) &args); + break; + } + break; + } + + FOR_EACH_PTR(stmt->stmts, s) { + check_one_statement(s); + } END_FOR_EACH_PTR(s); + break; + } + case STMT_IF: + check_one_expression(stmt->pos, stmt->if_conditional); + check_one_statement(stmt->if_true); + check_one_statement(stmt->if_false); + break; + case STMT_CASE: + check_one_expression(stmt->pos, stmt->case_expression); + check_one_expression(stmt->pos, stmt->case_to); + check_one_statement(stmt->case_statement); + break; + case STMT_SWITCH: + check_one_expression(stmt->pos, stmt->switch_expression); + check_one_statement(stmt->switch_statement); + break; + case STMT_ITERATOR: + check_one_statement(stmt->iterator_pre_statement); + check_one_expression(stmt->pos, stmt->iterator_pre_condition); + check_one_statement(stmt->iterator_statement); + check_one_statement(stmt->iterator_post_statement); + check_one_expression(stmt->pos, stmt->iterator_post_condition); + break; + case STMT_LABEL: + check_one_statement(stmt->label_statement); + break; + case STMT_RANGE: + check_one_expression(stmt->pos, stmt->range_expression); + check_one_expression(stmt->pos, stmt->range_low); + check_one_expression(stmt->pos, stmt->range_high); + break; + /* + * TODO: STMT_CONTEXT, GOTO, ASM; these could/should all be walked, but + * don't seem super relevant for copy_{to,from}_user(). + */ + default: + return; + } +} + +static void check_symbol(struct symbol *sym) +{ + if (sym->type == SYM_NODE) + sym = sym->ctype.base_type; + + switch (sym->type) { + case SYM_FN: + if (sym->stmt) + check_one_statement(sym->stmt); + break; + default: + return; + } } static void check_one_instruction(struct instruction *insn) @@ -224,9 +344,6 @@ static void check_one_instruction(struct instruction *insn) case OP_RANGE: check_range_instruction(insn); break; - case OP_CALL: - check_call_instruction(insn); - break; default: break; } @@ -314,6 +431,8 @@ static void check_symbols(struct symbol_list *list) struct entrypoint *ep; expand_symbol(sym); + check_symbol(sym); + ep = linearize_symbol(sym); if (ep && ep->entry) { if (dbg_entry) From patchwork Thu Dec 20 19:59:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739535 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 34A4D924 for ; Thu, 20 Dec 2018 20:00:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 26FBD28C03 for ; Thu, 20 Dec 2018 20:00:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2502E28BFC; Thu, 20 Dec 2018 20:00:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 1C96828BF0 for ; Thu, 20 Dec 2018 20:00:21 +0000 (UTC) Received: (qmail 18196 invoked by uid 550); 20 Dec 2018 19:59:57 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 18025 invoked from network); 20 Dec 2018 19:59:56 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0EimkVSLHfSwmfcQEy4BKoOrgVdq12HdkoBv4xVscs0=; b=ZXsCv/D1he3mbj80n9D5c5n4Mfz28OCAPGwlTNp/szwd9o50nZ1/Wn27Ebp9i+dYku uFDy3kVGm/c9PBXi5UOSdyDtpfYP0ixSYSMsjF5Bs8HrcbNvdpFzOewAcOcY16mP7BAV 57Qd6h0rt45GaJ4srfIWeJDxa+M8atQjqThpNeH/iikUg0C2DCMxlgYreb/rqw9sApOC wM1hwYTTfuX48pCpDyIHosgEUTqcIRCsncUu9mbtpiL0f1cJTUbdAa/aMpeUk+f6ldrG NkU28tBTkOAqCpEq2tpSAwsOMuHCA7A+CnsUIr1RZw/P0aC9HOQITu/kjUML8CXLe7Fk X7Nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0EimkVSLHfSwmfcQEy4BKoOrgVdq12HdkoBv4xVscs0=; b=g97LNYJPNcf36JvngxLzL+vASvIYaDBXRfz95W76mkMKILk9xfkmhqSqRrdVYhVcnM sxPn11EwQHHTFmM3y6faMi4ZwIsMlfubdHM62R4Exd898sxf07RAjxmp1iPpMEMzgHD3 T5LbOrQomjYtH/I9S96ZD8dehu0z6+7cLvYGmBRXfS1aUrIoVi3kmazFArVXfV31qjAV tx3V+w1CCFqA6cLx/pswD69Wfsy3eit4uE26WUeV9lmru+7Oiw86WAoidd2785ekdnUp yzl0JKhwDbq3lkc540Q3PSPwcKJXGpYTZYfXS0OwVdDPs+2WxGEs4w0uf98QB/yHslYq IS6A== X-Gm-Message-State: AA+aEWa6cihjMV+KuBFgwWTKqyo7f0hNgjmh+qSd7fHmfj4WPB6worGu DAtifv8b5iLdCvbLA9XhPeRFRw== X-Google-Smtp-Source: AFSGD/UjzBl3qVFnPZzo+hXGgtxG3w1oJ+Wq0+EFN7n7Gv0fdGAYIulSkCB+AG2CaOcNIwyKrBKAbg== X-Received: by 2002:a5d:8747:: with SMTP id k7mr20489826iol.279.1545335984354; Thu, 20 Dec 2018 11:59:44 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 3/4] add a check for copy_to_user() address spaces Date: Thu, 20 Dec 2018 12:59:30 -0700 Message-Id: <20181220195931.20331-4-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Leaking kernel pointers to userspace is a bad idea, so let's try to do some analysis for it. The basic idea is that every pointer copied to userspace "should be" (but isn't necessarily) annotated with __user, and if it is not, then it's a potential infoleak. The motivation for this is stuff like [1], which is exactly a case of this. Based on a subsequent manual analysis of the uapi headers, I found [2]. Unfortunately in both of these cases, there is void * (for compat) trickery that masks them from actually turning up in this case. But hey, I wrote it and tried it out, so perhaps the code is useful for someone :) [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/block/floppy.c?id=65eea8edc315589d6c993cf12dbb5d0e9ef1fe4e [2]: https://lkml.org/lkml/2018/11/3/142 Signed-off-by: Tycho Andersen --- sparse.c | 93 ++++++++++++++++++++++++++++++++++++++- validation/copy_to_user.c | 31 +++++++++++++ 2 files changed, 123 insertions(+), 1 deletion(-) diff --git a/sparse.c b/sparse.c index 217a5bf..8bcbe16 100644 --- a/sparse.c +++ b/sparse.c @@ -171,13 +171,104 @@ static void check_memset(struct position pos, struct expression_list *args) check_byte_count(pos, size); } +static struct symbol *resolve_arg_type(struct position pos, struct expression *arg) +{ + struct expression *uncast; + + uncast = arg; + switch (arg->type) { + case EXPR_CAST: + case EXPR_FORCE_CAST: + case EXPR_IMPLIED_CAST: + /* + * Undo any casting done by sparse to the function's + * argument type. + */ + uncast = arg->cast_expression; + break; + case EXPR_SYMBOL: + break; + case EXPR_PREOP: + /* + * handle derefs; these are really just the type of the + * resulting expression. + */ + break; + case EXPR_BINOP: + /* TODO: resolve this pointer math if possible? */ + return NULL; + default: + warning(pos, "huh? arg not a cast or symbol? %d", arg->type); + return NULL; + } + + return uncast->ctype->ctype.base_type; +} + +static void check_ptr_in_other_as(struct position pos, struct symbol *sym, int this_as) +{ + struct ident *ident = sym->ident; + + if (sym->type == SYM_NODE) + sym = sym->ctype.base_type; + + switch (sym->type) { + case SYM_ARRAY: + case SYM_PTR: { + if (sym->ctype.as != this_as) + warning(pos, "member %s is a kernel pointer copied to userspace", show_ident(ident)); + check_ptr_in_other_as(pos, sym->ctype.base_type, this_as); + break; + } + case SYM_STRUCT: + case SYM_UNION: { + struct symbol *member; + + FOR_EACH_PTR(sym->symbol_list, member) { + check_ptr_in_other_as(pos, member, this_as); + } END_FOR_EACH_PTR(member); + break; + } + default: + /* + * scalar types are ok + * TODO: what about SYM_LABEL/PREPROCESSOR? + */ + break; + } +} + +static void check_no_kernel_pointers(struct position pos, struct expression_list *args) +{ + struct expression *src = ptr_list_nth_entry((struct ptr_list *)args, 1); + struct symbol *base = NULL; + + if (!Waddress_space) + return; + + /* get the type of src */ + base = resolve_arg_type(pos, src); + + /* + * And deref it to *src; src will *always* be a kernel pointer, and + * we're really after members of structures here, not the pointers + * themselves. So we do this deref at the top level. + */ + base = base->ctype.base_type; + + check_ptr_in_other_as(pos, base, 1); } #define check_memcpy check_memset -#define check_ctu check_memset #define check_cfu check_memset +void check_ctu(struct position pos, struct expression_list *args) +{ + check_memset(pos, args); + check_no_kernel_pointers(pos, args); +} + struct checkfn { struct ident *id; void (*check)(struct position pos, struct expression_list *args); diff --git a/validation/copy_to_user.c b/validation/copy_to_user.c new file mode 100644 index 0000000..d9cded6 --- /dev/null +++ b/validation/copy_to_user.c @@ -0,0 +1,31 @@ +#define __user __attribute__((address_space(1))) + +struct bar { + char *bar_kptr; +}; + +struct foo { + char *foo_kptr; + char __user *uptr; + struct bar bar; +}; + +static void copy_to_user(void __user *to, const void *from, unsigned long n) +{ +} + +static void bar(void) +{ + struct foo f; + void __user *p = (void __user *)0; + + copy_to_user(p, &f, sizeof(f)); +} +/* + * check-name: copy_to_user arguments + * + * check-error-start +copy_to_user.c:22:9: warning: member foo_kptr is a kernel pointer copied to userspace +copy_to_user.c:22:9: warning: member bar_kptr is a kernel pointer copied to userspace + * check-error-end + */ From patchwork Thu Dec 20 19:59:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tycho Andersen X-Patchwork-Id: 10739537 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 46FAF924 for ; Thu, 20 Dec 2018 20:00:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 39BC128C0D for ; Thu, 20 Dec 2018 20:00:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 37E6328C52; Thu, 20 Dec 2018 20:00:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 2A87328C0B for ; Thu, 20 Dec 2018 20:00:30 +0000 (UTC) Received: (qmail 18408 invoked by uid 550); 20 Dec 2018 19:59:59 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 18198 invoked from network); 20 Dec 2018 19:59:57 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tycho-ws.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8vSgLE0139Uw28lgL88xITvwXqU1EcObnWsuEHGD+5Y=; b=s9T6xbpzYClqxDvUWuehyBSgKZ7K6qBkBvo98Qs5/uEhnJvMC3O14O9cWwuI6ZS1sK j/2zom16LM/I+Ct0E+9ZM1Q18t/zA4fDBKWUIOAKpJgyNU3nuFiI72H8V3RyxEK5lj+I FHOxfHiBDHhBWYit7vy4trGYT2vad67vOtdVarfufckFN0h04nyS1SF3X1DI5wW9AgEV FK80qGc5kh6uKjgPw9U+dUKAPa8bMHqxmTu5mZKQ5dZ13+zFZvLyIgNzpBZpbb79S8yi PSs93puzVzCNRgYrwWOFF687Vvu15lRG4RXpN7A33zfW439Y+3LC7p9ZPoMlZ4o1fwlq xGDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8vSgLE0139Uw28lgL88xITvwXqU1EcObnWsuEHGD+5Y=; b=NiUpkCa/zm/m+7HnpwTXY+ZiRcYidW1hUNs53hLq8FoEs832Cx37Jm2LgJCDPkoQGv IlSxMLZWACwwykZtsORp07g6iRNlxfPaVtmQCWFycXvO8w2oTBi8CQOf9cbOLaTS8BwA a3SodMK2nmbPAA/v1NrnMAKCYbg2bTIH4EHnMeJfu1vDCMZ19VOKCKAQP0u2rNoNtViB 953e1zofmojhaug7Lz+kbMVBCRxSkNTVrqUhBqzFt1fu5w2NRsWuyIol/hUKkicVVeUl RCPuxoKPm2H6EpevUa0l1E8+KYLcKY0rzMP1jEsdfvGVtdOG8K6synr7T1ltSc93ZG9u 7f5w== X-Gm-Message-State: AA+aEWbS+QOK0EtsEJL58tS876uF69eYOPcnNEiBKcM2f7q7bnDujH/l WLgnuzv+W9Cx0w0gSq0E8rFgYw== X-Google-Smtp-Source: ALg8bN5gsxi7ln4LRcWe3zrN7ql/+YYwVfsLywhwUqjMi63YIibJ0+Bf/vZNJ4t/bEmzH4Jl5TQnZA== X-Received: by 2002:a05:660c:74b:: with SMTP id a11mr32937itl.27.1545335985751; Thu, 20 Dec 2018 11:59:45 -0800 (PST) From: Tycho Andersen To: linux-sparse@vger.kernel.org, kernel-hardening@lists.openwall.com Cc: Tycho Andersen Subject: [RFC v1 4/4] check copy_to_user() sizes Date: Thu, 20 Dec 2018 12:59:31 -0700 Message-Id: <20181220195931.20331-5-tycho@tycho.ws> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181220195931.20331-1-tycho@tycho.ws> References: <20181220195931.20331-1-tycho@tycho.ws> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP The intent here is to be a smarter version of the size checking that's just a hard coded constant. The insight is that we know the type of the thing that's getting passed to copy_to_user(), so if we can evaluate the size, it should be <= sizeof(*the thing being copied). I've run this over x86 with allyes config, and it didn't find anything, so perhaps it isn't particularly useful. Indeed, most copy_to_user()s look like: copy_to_user(p, &foo, sizeof(foo)) so perhaps this isn't a surprise. The one potentially interesting case is the hard coded array length case: it's possible someone might change one place but not the other. See e.g. the first copy_to_user() in drivers/ata/libata-scsi.c, which looks like: copy_to_user(dst, dev->id, ATA_ID_WORDS * sizeof(u16)) id here is a u16 id[ATA_ID_WORDS], so it would be possible to change this in one place or not the other. Unfortunately, sparse seems to unify the type of dev->id to a u16* instead of u16[], so we can't see through this. Anyway, there are currently no violations of this in the kernel either, although it does emit some false positives like, fs/xfs/xfs_ioctl.c:1813:25: warning: copy_to_user() where size (13) is larger than src (1) due to the above. I wrote the code, so perhaps someday it will be useful to someone, so I'm posting it :) Signed-off-by: Tycho Andersen --- sparse.c | 45 ++++++++++++++++++++++ validation/copy_to_user_sizes.c | 53 ++++++++++++++++++++++++++ validation/copy_to_user_sizes_inline.c | 29 ++++++++++++++ 3 files changed, 127 insertions(+) diff --git a/sparse.c b/sparse.c index 8bcbe16..eb7d5d7 100644 --- a/sparse.c +++ b/sparse.c @@ -31,6 +31,7 @@ #include #include #include +#include #include "lib.h" #include "allocate.h" @@ -259,6 +260,49 @@ static void check_no_kernel_pointers(struct position pos, struct expression_list check_ptr_in_other_as(pos, base, 1); } +static void check_copy_size(struct position pos, struct expression_list *args) +{ + struct expression *src = ptr_list_nth_entry((struct ptr_list *)args, 1); + struct expression *size = ptr_list_nth_entry((struct ptr_list *)args, 2); + long long src_actual = LLONG_MAX; + long long size_actual; + struct symbol *base, *deref; + + /* get the type of src */ + base = resolve_arg_type(pos, src); + + /* and deref it to *src */ + deref = base->ctype.base_type; + + /* + * Note, this is never hit right now because sparse seems to unify + * arrays that are arguments to type *. + */ + if (is_array_type(base)) { + long long array_size = get_expression_value_silent(base->array_size); + printf("%s is array type\n", show_ident(base->ident)); + + /* failed to evaluate */ + if (array_size == 0) + return; + src_actual = array_size * bits_to_bytes(deref->bit_size); + } else { + src_actual = bits_to_bytes(deref->bit_size); + } + + /* Evaluate size, if we can */ + size_actual = get_expression_value_silent(size); + /* + * size_actual == 0 means that get_expression_value failed; of + * course we'll miss something if there is a 0 length copy, but + * then nothing will leak anyway so... + */ + if (size_actual == 0) + return; + + if (size_actual > src_actual) + warning(pos, "copy_to_user() where size (%lld) is larger than src (%lld)", size_actual, src_actual); +} #define check_memcpy check_memset #define check_cfu check_memset @@ -267,6 +311,7 @@ void check_ctu(struct position pos, struct expression_list *args) { check_memset(pos, args); check_no_kernel_pointers(pos, args); + check_copy_size(pos, args); } struct checkfn { diff --git a/validation/copy_to_user_sizes.c b/validation/copy_to_user_sizes.c new file mode 100644 index 0000000..f1f7b8e --- /dev/null +++ b/validation/copy_to_user_sizes.c @@ -0,0 +1,53 @@ +static void copy_to_user(void *to, const void *from, unsigned long n) +{ +} + +struct foo { + int bar; +}; + +static void main(void) +{ + void *p; + struct foo f; + int uninitialized; + + copy_to_user(p, &f, sizeof(f)); + copy_to_user(p, &f, sizeof(f)-1); + copy_to_user(p, &f, sizeof(f)+1); + copy_to_user(p, &f, 1); + copy_to_user(p, &f, 100); + copy_to_user(p, &f, uninitialized); +} + +#if 0 +static void broken(void) +{ + void *p; + char arr[100]; + struct foo foo_arr[2]; + + /* + * These all fail right now, because sparse seems to unify the type of + * arr/foo_arr to char * /struct foo *, instead of char[]/struct foo[]. + * + * That's unfortunate, because it means that these generate false + * positives in the kernel when using a static array length, and that + * seems like a case where this type of check would be especially + * useful. + */ + copy_to_user(p, arr, 100); + copy_to_user(p, arr, 101); + copy_to_user(p, foo_arr, sizeof(foo_arr)); + copy_to_user(p, foo_arr, sizeof(foo_arr)-1); + copy_to_user(p, foo_arr, sizeof(foo_arr[0])*2); +} +#endif +/* + * check-name: copy_to_user sizes + * + * check-error-start +copy_to_user_sizes.c:17:9: warning: copy_to_user() where size (5) is larger than src (4) +copy_to_user_sizes.c:19:9: warning: copy_to_user() where size (100) is larger than src (4) + * check-error-end + */ diff --git a/validation/copy_to_user_sizes_inline.c b/validation/copy_to_user_sizes_inline.c new file mode 100644 index 0000000..bd3c3bd --- /dev/null +++ b/validation/copy_to_user_sizes_inline.c @@ -0,0 +1,29 @@ +static inline void copy_to_user(void *to, const void *from, unsigned long n) +{ +} + +struct foo { + int bar; +}; + +static void main(void) +{ + void *p; + struct foo f; + int uninitialized; + + copy_to_user(p, &f, sizeof(f)); + copy_to_user(p, &f, sizeof(f)-1); + copy_to_user(p, &f, sizeof(f)+1); + copy_to_user(p, &f, 1); + copy_to_user(p, &f, 100); + copy_to_user(p, &f, uninitialized); +} +/* + * check-name: copy_to_user sizes + * + * check-error-start +copy_to_user_sizes_inline.c:17:21: warning: copy_to_user() where size (5) is larger than src (4) +copy_to_user_sizes_inline.c:19:21: warning: copy_to_user() where size (100) is larger than src (4) + * check-error-end + */