From patchwork Wed Mar 18 20:39:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: George Spelvin X-Patchwork-Id: 11445955 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B13991392 for ; Wed, 18 Mar 2020 20:40:19 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7E7DF20775 for ; Wed, 18 Mar 2020 20:40:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7E7DF20775 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=SDF.ORG Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9D7EF6B0092; Wed, 18 Mar 2020 16:40:18 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 988C46B0093; Wed, 18 Mar 2020 16:40:18 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 89DA06B0095; Wed, 18 Mar 2020 16:40:18 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0163.hostedemail.com [216.40.44.163]) by kanga.kvack.org (Postfix) with ESMTP id 6E0356B0092 for ; Wed, 18 Mar 2020 16:40:18 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BFF014DDD for ; Wed, 18 Mar 2020 20:40:17 +0000 (UTC) X-FDA: 76609650474.23.smile25_2806fb8e4e5f X-Spam-Summary: 2,0,0,046586200c618057,d41d8cd98f00b204,lkml@sdf.org,,RULES_HIT:41:69:355:379:800:960:965:966:973:988:989:1260:1277:1312:1313:1314:1345:1359:1431:1437:1516:1518:1519:1535:1544:1593:1594:1595:1596:1711:1730:1747:1777:1792:2005:2195:2196:2199:2200:2393:2559:2562:2895:3138:3139:3140:3141:3142:3355:3865:3866:3867:3868:3870:3871:3872:3874:4385:4390:4395:5007:6117:6119:6261:7514:7903:7904:8660:9036:10004:11026:11473:11658:11914:12043:12114:12296:12297:12438:12517:12519:12555:12895:13095:13148:13161:13229:13230:13439:13846:13895:14096:14097:14181:14394:14721:21080:21324:21433:21451:21627:30012:30045:30054:30056:30064:30070,0,RBL:205.166.94.20:@sdf.org:.lbl8.mailshell.net-64.201.201.201 62.8.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fn,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:28,LUA_SUMMARY:none X-HE-Tag: smile25_2806fb8e4e5f X-Filterd-Recvd-Size: 5723 Received: from mx.sdf.org (mx.sdf.org [205.166.94.20]) by imf10.hostedemail.com (Postfix) with ESMTP for ; Wed, 18 Mar 2020 20:40:17 +0000 (UTC) Received: from sdf.org (IDENT:lkml@otaku.sdf.org [205.166.94.8]) by mx.sdf.org (8.15.2/8.14.5) with ESMTPS id 02IKdHLZ002171 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256 bits) verified NO); Wed, 18 Mar 2020 20:39:17 GMT Received: (from lkml@localhost) by sdf.org (8.15.2/8.12.8/Submit) id 02IKdE7V015032; Wed, 18 Mar 2020 20:39:14 GMT Date: Wed, 18 Mar 2020 20:39:14 +0000 From: George Spelvin To: Kees Cook Cc: Dan Williams , linux-mm@kvack.org, Andrew Morton , Alexander Duyck , Randy Dunlap , lkml@sdf.org Subject: [PATCH v3] mm/shuffle.c: Fix races in add_to_free_area_random() Message-ID: <20200318203914.GA16083@SDF.ORG> References: <20200317135035.GA19442@SDF.ORG> <202003171435.41F7F0DF9@keescook> <20200317230612.GB19442@SDF.ORG> <202003171619.23210A7E0@keescook> <20200318014410.GA2281@SDF.ORG> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20200318014410.GA2281@SDF.ORG> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The old code had separate "rand" and "rand_count" variables, which could get out of sync with bad results. In the worst case, two threads would see rand_count = 1 and both decrement it, resulting in rand_count = 255 and rand being filled with zeros for the next 255 calls. Instead, pack them both into a single, atomically updateable, variable. This makes it a lot easier to reason about race conditions. They are still there - the code deliberately eschews locking - but basically harmless on the rare occasions that they happen. Second, use READ_ONCE and WRITE_ONCE. Without them, we are deep in the land of nasal demons. The compiler would be free to spill temporaries to the static variables in arbitrarily perverse ways and create hard-to-find bugs. (Alternatively, we could declare the static variable "volatile", one of the few places in the Linux kernel that would be correct, but it would probably annoy Linus.) Third, use long rather than u64. This not only keeps the state atomically updateable, it also speeds up the fast path on 32-bit machines. Saving at least three instructions on the fast path (one load, one add-with-carry, and one store) is worth a second call to get_random_u*() per 64 bits. The fast path of get_random_u* is less than the 3*64 = 192 instructions saved, and the slow path happens every 64 bytes so isn't affected by the change. Fourth, use left-shift rather than right. Testing the sign bit produces slightly smaller/faster code than testing the lsbit. I've tried shifting both ways, and copying the desired bit to a boolean before shifting rather than keeping separate full-width r and rshift variables, but both produce larger code: x86_64 i386 This code 94 95 Explicit bool 103 99 Lsbits 99 101 Both 96 100 In a perfect world, the x86-64 object code would be tiny, and dominated by the unavoidable unpredictable branch, but this function doesn't warrant arch-specific optimization. add_to_free_area_random: shlq rand(%rip) jz 3f 1: jnc 2f jmp add_to_free_area # tail call 2: jmp add_to_free_area_tail 3: pushq %rdx pushq %rsi pushq %rdi call get_random_u64 popq %rdi popq %rsi popq %rdx stc adcq %rax,%rax # not lea because we need carry out movq %rax, rand(%rip) jmp 1b Signed-off-by: George Spelvin Acked-by: Kees Cook Acked-by: Dan Williams Cc: Alexander Duyck Cc: Randy Dunlap Cc: Andrew Morton Cc: linux-mm@kvack.org --- v2: Rewrote commit message to explain existing races better. Made local variables unsigned to avoid (technically undefined) signed overflow. v3: Typos fixed, Acked-by, expanded commit message. mm/shuffle.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/mm/shuffle.c b/mm/shuffle.c index e0ed247f8d90..16c5fddc292f 100644 --- a/mm/shuffle.c +++ b/mm/shuffle.c @@ -186,22 +186,28 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat) void add_to_free_area_random(struct page *page, struct free_area *area, int migratetype) { - static u64 rand; - static u8 rand_bits; + static unsigned long rand; /* buffered random bits */ + unsigned long r = READ_ONCE(rand), rshift = r << 1; /* - * The lack of locking is deliberate. If 2 threads race to - * update the rand state it just adds to the entropy. + * rand holds 0..BITS_PER_LONG-1 random msbits, followed by a + * 1 bit, then zero-padding in the lsbits. This allows us to + * maintain the pre-generated bits and the count of bits in a + * single, atomically updatable, variable. + * + * The lack of locking is deliberate. If two threads race to + * update the rand state it just adds to the entropy. The + * worst that can happen is a random bit is used twice, or + * get_random_long is called redundantly. */ - if (rand_bits == 0) { - rand_bits = 64; - rand = get_random_u64(); + if (unlikely(rshift == 0)) { + r = get_random_long(); + rshift = r << 1 | 1; } + WRITE_ONCE(rand, rshift); - if (rand & 1) + if ((long)r < 0) add_to_free_area(page, area, migratetype); else add_to_free_area_tail(page, area, migratetype); - rand_bits--; - rand >>= 1; }