From patchwork Mon Sep 7 13:40:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marco Elver X-Patchwork-Id: 11761049 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 72819746 for ; Mon, 7 Sep 2020 13:41:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1A9F3216C4 for ; Mon, 7 Sep 2020 13:41:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="ALL9RpQB" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1A9F3216C4 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 61EAB8E000A; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 5D1228E0001; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 498408E000A; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0041.hostedemail.com [216.40.44.41]) by kanga.kvack.org (Postfix) with ESMTP id 2D87D8E0001 for ; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E4B3B181AC9C6 for ; Mon, 7 Sep 2020 13:41:33 +0000 (UTC) X-FDA: 77236377666.29.boats67_43159e3270cc Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin29.hostedemail.com (Postfix) with ESMTP id ACDC0180868DC for ; Mon, 7 Sep 2020 13:41:33 +0000 (UTC) X-Spam-Summary: 50,0,0,fa2abac6226b9104,d41d8cd98f00b204,3jdhwxwukcfc3ak3g5dd5a3.1dba7cjm-bb9kz19.dg5@flex--elver.bounces.google.com,,RULES_HIT:4:41:69:152:355:379:541:800:960:966:967:973:988:989:1260:1263:1277:1313:1314:1345:1359:1437:1516:1518:1593:1594:1605:1730:1747:1777:1792:2194:2196:2199:2200:2393:2525:2553:2560:2564:2639:2682:2685:2693:2736:2740:2859:2892:2901:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3152:3743:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4034:4250:4321:4385:4559:4605:5007:6117:6119:6120:6261:6299:6653:6742:6743:7901:7903:7904:7974:8599:8603:8784:8957:9008:9010:9025:9040:9163:9388:9855:9969:10946:11026:11232:11257:11473:11658:11914:12043:12048:12291:12296:12297:12438:12555:12663:12683:12740:12776:12895:12986:13142:13146:13230:14394:14651:14659:21080:21094:21220:21323:21433:21444:21451:21627:21740:21939:30012:30029:30034:30045:30054:30056:30062:30070:30074:30075:30090,0,RBL:209.85 .221.73: X-HE-Tag: boats67_43159e3270cc X-Filterd-Recvd-Size: 17083 Received: from mail-wr1-f73.google.com (mail-wr1-f73.google.com [209.85.221.73]) by imf27.hostedemail.com (Postfix) with ESMTP for ; Mon, 7 Sep 2020 13:41:33 +0000 (UTC) Received: by mail-wr1-f73.google.com with SMTP id b7so5717996wrn.6 for ; Mon, 07 Sep 2020 06:41:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=VRGvWOinGw0btNVyOH340+TVwWpL0hbTJw9JR94PpB4=; b=ALL9RpQBmEPbBRqb5ELx8D38l7uZ2fXlF779RdUFbsUuGJL6gcBUM7tJu/64ISq4My YiAVlSrw9ghLoaIar3j3DmPnyxXsWrtPv2hmyhS+Nh0lOTqZmq/TEHAAEnK92wnT7PmA xIX/7/VCEr5weGUe/HeJ0n13Bu0Cd4VgsRBoKRmnNlDstFMCKYwn/2mMCjQG2cDBLTZ8 xy2KY7WJHh1+7zCZES2tvtQxcSlt1v2EojD7M3Avm7GLnisEB5LbabRNiH2MZhyOlqsb wj1LdJ2GatvNgSj4H1oAR5GVoqiODfuXW7nxdrulY3Q9I8oF7WMayLQ4fMnmW0nGjlKc hlnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=VRGvWOinGw0btNVyOH340+TVwWpL0hbTJw9JR94PpB4=; b=AwWOHXQZWJzjrDdr/WoCoLu5jayyUj15TFoUo4mgpRoXMp606YXIRj8YDNAuM0TIGE NLMaCLa+DvuTs8raJmMwevx26bxmx/Vicf8QgQavNZGE3QXb/pTI5irweFW7NoBMJvuo Xqab8Fas4hLmmDHD519rh1E4H3jy0zVupee1Xa13ymOQ3og0g0kwmCPVBZmxVYrjwJev PqAhXxW+rv7Qicj30+HQ3SGFIe110F1Yr0uADYDnMPwhhZ76dzTa31DDRcT1QRUGCjlp oO9klUmt5QiIoErCdazllabVeeipCIU+FV/jIDriaK2B84jayWgptEOhbXZ8eUJiqmJe p5Sg== X-Gm-Message-State: AOAM530rvJYjpiCSSn6fPmrsHrA8X5z3jK4AlVGDcJ8Z1bJnFta9OqhK HiWE3f/UvYfTl3f3+qxKKGfdBik+4A== X-Google-Smtp-Source: ABdhPJyYq0tbe8OJm8WhlgT1J8VR0ZbAfLGaI4xhC6D0qARX6o+xgg9QlBKjgP5zThx95lmrqWGuCVhVLg== X-Received: from elver.muc.corp.google.com ([2a00:79e0:15:13:f693:9fff:fef4:2449]) (user=elver job=sendgmr) by 2002:a1c:3886:: with SMTP id f128mr20829871wma.121.1599486092008; Mon, 07 Sep 2020 06:41:32 -0700 (PDT) Date: Mon, 7 Sep 2020 15:40:54 +0200 In-Reply-To: <20200907134055.2878499-1-elver@google.com> Message-Id: <20200907134055.2878499-10-elver@google.com> Mime-Version: 1.0 References: <20200907134055.2878499-1-elver@google.com> X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog Subject: [PATCH RFC 09/10] kfence, Documentation: add KFENCE documentation From: Marco Elver To: elver@google.com, glider@google.com, akpm@linux-foundation.org, catalin.marinas@arm.com, cl@linux.com, rientjes@google.com, iamjoonsoo.kim@lge.com, mark.rutland@arm.com, penberg@kernel.org Cc: hpa@zytor.com, paulmck@kernel.org, andreyknvl@google.com, aryabinin@virtuozzo.com, luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com, dvyukov@google.com, edumazet@google.com, gregkh@linuxfoundation.org, mingo@redhat.com, jannh@google.com, corbet@lwn.net, keescook@chromium.org, peterz@infradead.org, cai@lca.pw, tglx@linutronix.de, will@kernel.org, x86@kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org X-Rspamd-Queue-Id: ACDC0180868DC X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam03 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add KFENCE documentation in dev-tools/kfence.rst, and add to index. Co-developed-by: Alexander Potapenko Signed-off-by: Alexander Potapenko Signed-off-by: Marco Elver --- Documentation/dev-tools/index.rst | 1 + Documentation/dev-tools/kfence.rst | 285 +++++++++++++++++++++++++++++ 2 files changed, 286 insertions(+) create mode 100644 Documentation/dev-tools/kfence.rst diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst index f7809c7b1ba9..1b1cf4f5c9d9 100644 --- a/Documentation/dev-tools/index.rst +++ b/Documentation/dev-tools/index.rst @@ -22,6 +22,7 @@ whole; patches welcome! ubsan kmemleak kcsan + kfence gdb-kernel-debugging kgdb kselftest diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst new file mode 100644 index 000000000000..254f4f089104 --- /dev/null +++ b/Documentation/dev-tools/kfence.rst @@ -0,0 +1,285 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Kernel Electric-Fence (KFENCE) +============================== + +Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety +error detector. KFENCE detects heap out-of-bounds access, use-after-free, and +invalid-free errors. + +KFENCE is designed to be enabled in production kernels, and has near zero +performance overhead. Compared to KASAN, KFENCE trades performance for +precision. The main motivation behind KFENCE's design, is that with enough +total uptime KFENCE will detect bugs in code paths not typically exercised by +non-production test workloads. One way to quickly achieve a large enough total +uptime is when the tool is deployed across a large fleet of machines. + +Usage +----- + +To enable KFENCE, configure the kernel with:: + + CONFIG_KFENCE=y + +KFENCE provides several other configuration options to customize behaviour (see +the respective help text in ``lib/Kconfig.kfence`` for more info). + +Tuning performance +~~~~~~~~~~~~~~~~~~ + +The most important parameter is KFENCE's sample interval, which can be set via +the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The +sample interval determines the frequency with which heap allocations will be +guarded by KFENCE. The default is configurable via the Kconfig option +``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0`` +disables KFENCE. + +With the Kconfig option ``CONFIG_KFENCE_NUM_OBJECTS`` (default 255), the number +of available guarded objects can be controlled. Each object requires 2 pages, +one for the object itself and the other one used as a guard page; object pages +are interleaved with guard pages, and every object page is therefore surrounded +by two guard pages. + +The total memory dedicated to the KFENCE memory pool can be computed as:: + + ( #objects + 1 ) * 2 * PAGE_SIZE + +Using the default config, and assuming a page size of 4 KiB, results in +dedicating 2 MiB to the KFENCE memory pool. + +Error reports +~~~~~~~~~~~~~ + +A typical out-of-bounds access looks like this:: + + ================================================================== + BUG: KFENCE: out-of-bounds in test_out_of_bounds_read+0xa3/0x22b + + Out-of-bounds access at 0xffffffffb672efff (left of kfence-#17): + test_out_of_bounds_read+0xa3/0x22b + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#17 [0xffffffffb672f000-0xffffffffb672f01f, size=32, cache=kmalloc-32] allocated in: + __kfence_alloc+0x42d/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_out_of_bounds_read+0x98/0x22b + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 107 Comm: kunit_try_catch Not tainted 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +The header of the report provides a short summary of the function involved in +the access. It is followed by more detailed information about the access and +its origin. + +Use-after-free accesses are reported as:: + + ================================================================== + BUG: KFENCE: use-after-free in test_use_after_free_read+0xb3/0x143 + + Use-after-free access at 0xffffffffb673dfe0: + test_use_after_free_read+0xb3/0x143 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#24 [0xffffffffb673dfe0-0xffffffffb673dfff, size=32, cache=kmalloc-32] allocated in: + __kfence_alloc+0x277/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_use_after_free_read+0x76/0x143 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + freed in: + kfence_guarded_free+0x158/0x380 + __kfence_free+0x38/0xc0 + test_use_after_free_read+0xa8/0x143 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 109 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +KFENCE also reports on invalid frees, such as double-frees:: + + ================================================================== + BUG: KFENCE: invalid free in test_double_free+0xdc/0x171 + + Invalid free of 0xffffffffb6741000: + test_double_free+0xdc/0x171 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#26 [0xffffffffb6741000-0xffffffffb674101f, size=32, cache=kmalloc-32] allocated in: + __kfence_alloc+0x42d/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_double_free+0x76/0x171 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + freed in: + kfence_guarded_free+0x158/0x380 + __kfence_free+0x38/0xc0 + test_double_free+0xa8/0x171 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 111 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +KFENCE also uses pattern-based redzones on the other side of an object's guard +page, to detect out-of-bounds writes on the unprotected side of the object. +These are reported on frees:: + + ================================================================== + BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184 + + Detected corrupted memory at 0xffffffffb6797ff9 [ 0xac . . . . . . ]: + test_kmalloc_aligned_oob_write+0xef/0x184 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#69 [0xffffffffb6797fb0-0xffffffffb6797ff8, size=73, cache=kmalloc-96] allocated in: + __kfence_alloc+0x277/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_kmalloc_aligned_oob_write+0x57/0x184 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 120 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +For such errors, the address where the corruption as well as the corrupt bytes +are shown. + +And finally, KFENCE may also report on invalid accesses to any protected page +where it was not possible to determine an associated object, e.g. if adjacent +object pages had not yet been allocated:: + + ================================================================== + BUG: KFENCE: invalid access in test_invalid_access+0x26/0xe0 + + Invalid access at 0xffffffffb670b00a: + test_invalid_access+0x26/0xe0 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 124 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +DebugFS interface +~~~~~~~~~~~~~~~~~ + +Some debugging information is exposed via debugfs: + +* The file ``/sys/kernel/debug/kfence/stats`` provides runtime statistics. + +* The file ``/sys/kernel/debug/kfence/objects`` provides a list of objects + allocated via KFENCE, including those already freed but protected. + +Implementation Details +---------------------- + +Guarded allocations are set up based on the sample interval. After expiration +of the sample interval, a guarded allocation from the KFENCE object pool is +returned to the main allocator (SLAB or SLUB). At this point, the timer is +reset, and the next allocation is set up after the expiration of the interval. +To "gate" a KFENCE allocation through the main allocator's fast-path without +overhead, KFENCE relies on static branches via the static keys infrastructure. +The static branch is toggled to redirect the allocation to KFENCE. + +KFENCE objects each reside on a dedicated page, at either the left or right +page boundaries selected at random. The pages to the left and right of the +object page are "guard pages", whose attributes are changed to a protected +state, and cause page faults on any attempted access. Such page faults are then +intercepted by KFENCE, which handles the fault gracefully by reporting an +out-of-bounds access. The side opposite of an object's guard page is used as a +pattern-based redzone, to detect out-of-bounds writes on the unprotected sed of +the object on frees (for special alignment and size combinations, both sides of +the object are redzoned). + +KFENCE also uses pattern-based redzones on the other side of an object's guard +page, to detect out-of-bounds writes on the unprotected side of the object; +these are reported on frees. + +The following figure illustrates the page layout:: + + ---+-----------+-----------+-----------+-----------+-----------+--- + | xxxxxxxxx | O : | xxxxxxxxx | : O | xxxxxxxxx | + | xxxxxxxxx | B : | xxxxxxxxx | : B | xxxxxxxxx | + | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | + | xxxxxxxxx | E : ZONE | xxxxxxxxx | ZONE : E | xxxxxxxxx | + | xxxxxxxxx | C : | xxxxxxxxx | : C | xxxxxxxxx | + | xxxxxxxxx | T : | xxxxxxxxx | : T | xxxxxxxxx | + ---+-----------+-----------+-----------+-----------+-----------+--- + +Upon deallocation of a KFENCE object, the object's page is again protected and +the object is marked as freed. Any further access to the object causes a fault +and KFENCE reports a use-after-free access. Freed objects are inserted at the +tail of KFENCE's freelist, so that the least recently freed objects are reused +first, and the chances of detecting use-after-frees of recently freed objects +is increased. + +Interface +--------- + +The following describes the functions which are used by allocators as well page +handling code to set up and deal with KFENCE allocations. + +.. kernel-doc:: include/linux/kfence.h + :functions: is_kfence_address + kfence_shutdown_cache + kfence_alloc kfence_free + kfence_ksize kfence_object_start + kfence_handle_page_fault + +Related Tools +------------- + +In userspace, a similar approach is taken by `GWP-ASan +`_. GWP-ASan also relies on guard pages and +a sampling strategy to detect memory unsafety bugs at scale. KFENCE's design is +directly influenced by GWP-ASan, and can be seen as its kernel sibling. Another +similar but non-sampling approach, that also inspired the name "KFENCE", can be +found in the userspace `Electric Fence Malloc Debugger +`_. + +In the kernel, several tools exist to debug memory access errors, and in +particular KASAN can detect all bug classes that KFENCE can detect. While KASAN +is more precise, relying on compiler instrumentation, this comes at a +performance cost. We want to highlight that KASAN and KFENCE are complementary, +with different target environments. For instance, KASAN is the better +debugging-aid, where a simple reproducer exists: due to the lower chance to +detect the error, it would require more effort using KFENCE to debug. +Deployments at scale, however, would benefit from using KFENCE to discover bugs +due to code paths not exercised by test cases or fuzzers.