From patchwork Mon May 30 05:39:22 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Theodore Ts'o X-Patchwork-Id: 9140253 X-Patchwork-Delegate: herbert@gondor.apana.org.au Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B8B0760759 for ; Mon, 30 May 2016 05:41:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ADFC728210 for ; Mon, 30 May 2016 05:41:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2FC828213; Mon, 30 May 2016 05:41:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 37D0628210 for ; Mon, 30 May 2016 05:41:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1161021AbcE3FlB (ORCPT ); Mon, 30 May 2016 01:41:01 -0400 Received: from imap.thunk.org ([74.207.234.97]:45518 "EHLO imap.thunk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1161092AbcE3FkO (ORCPT ); Mon, 30 May 2016 01:40:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=thunk.org; s=ef5046eb; h=References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From; bh=FlY1SmXpl9RFmzzuxpuGoAggwaGXf7Pyxg2UqD8Ig3I=; b=bjo89vKgrl4369DfbfmU9DnJWqOr3Ni4dnKSfgwJlJ8wvbEcgrdho5vRzRRkFVGQ+jEdR8N5+v9I2dIgutxpEVWg7T1gDz5GGscEY2vZ6Wy5uQd1PtJY94WfurFFWZg0C3unuxRQpooeb0xAPPRnc2e/XtAo1xJoxiE4bnlcdE8=; Received: from root (helo=closure.thunk.org) by imap.thunk.org with local-esmtp (Exim 4.84_2) (envelope-from ) id 1b7Fvj-0003Wq-SF; Mon, 30 May 2016 05:39:59 +0000 Received: by closure.thunk.org (Postfix, from userid 15806) id 2499382EBA8; Mon, 30 May 2016 01:39:59 -0400 (EDT) From: Theodore Ts'o To: Linux Kernel Developers List Cc: smueller@chronox.de, herbert@gondor.apana.org.au, andi@firstfloor.org, sandyinchina@gmail.com, cryptography@lakedaemon.net, jsd@av8n.com, hpa@zytor.com, linux-crypto@vger.kernel.org, Theodore Ts'o Subject: [PATCH 2/5] random: make /dev/urandom scalable for silly userspace programs Date: Mon, 30 May 2016 01:39:22 -0400 Message-Id: <1464586765-14436-3-git-send-email-tytso@mit.edu> X-Mailer: git-send-email 2.5.0 In-Reply-To: <1464586765-14436-1-git-send-email-tytso@mit.edu> References: <1464586765-14436-1-git-send-email-tytso@mit.edu> X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: tytso@thunk.org X-SA-Exim-Scanned: No (on imap.thunk.org); SAEximRunCond expanded to false Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On a system with a 4 socket (NUMA) system where a large number of application threads were all trying to read from /dev/urandom, this can result in the system spending 80% of its time contending on the global urandom spinlock. The application should have used its own PRNG, but let's try to help it from running, lemming-like, straight over the locking cliff. Reported-by: Andi Kleen Signed-off-by: Theodore Ts'o --- drivers/char/random.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 53 insertions(+), 4 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index 3a4d865..1778c84 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -434,6 +434,8 @@ struct crng_state primary_crng = { */ static int crng_init = 0; #define crng_ready() (likely(crng_init >= 2)) +static void _extract_crng(struct crng_state *crng, + __u8 out[CHACHA20_BLOCK_SIZE]); static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE]); static void process_random_ready_list(void); @@ -754,6 +756,17 @@ static void credit_entropy_bits_safe(struct entropy_store *r, int nbits) static DECLARE_WAIT_QUEUE_HEAD(crng_init_wait); +#ifdef CONFIG_NUMA +/* + * Hack to deal with crazy userspace progams when they are all trying + * to access /dev/urandom in parallel. The programs are almost + * certainly doing something terribly wrong, but we'll work around + * their brain damage. + */ +static struct crng_state **crng_node_pool __read_mostly; +#endif + + static void _initialize_crng(struct crng_state *crng) { int i; @@ -774,11 +787,13 @@ static void _initialize_crng(struct crng_state *crng) crng->init_time = jiffies - CRNG_RESEED_INTERVAL - 1; } +#ifdef CONFIG_NUMA static void initialize_crng(struct crng_state *crng) { _initialize_crng(crng); spin_lock_init(&crng->lock); } +#endif static int crng_fast_load(__u32 pool[4]) { @@ -818,7 +833,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r) if (num == 0) return; } else - extract_crng(buf.block); + _extract_crng(&primary_crng, buf.block); spin_lock_irqsave(&primary_crng.lock, flags); for (i = 0; i < 8; i++) { unsigned long rv; @@ -838,19 +853,26 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r) spin_unlock_irqrestore(&primary_crng.lock, flags); } +static inline void maybe_reseed_primary_crng(void) +{ + if (crng_init > 2 && + time_after(jiffies, primary_crng.init_time + CRNG_RESEED_INTERVAL)) + crng_reseed(&primary_crng, &input_pool); +} + static inline void crng_wait_ready(void) { wait_event_interruptible(crng_init_wait, crng_ready()); } -static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE]) +static void _extract_crng(struct crng_state *crng, + __u8 out[CHACHA20_BLOCK_SIZE]) { unsigned long v, flags; - struct crng_state *crng = &primary_crng; if (crng_init > 2 && time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)) - crng_reseed(crng, &input_pool); + crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL); spin_lock_irqsave(&crng->lock, flags); if (arch_get_random_long(&v)) crng->state[14] ^= v; @@ -860,6 +882,17 @@ static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE]) spin_unlock_irqrestore(&crng->lock, flags); } +static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE]) +{ +#ifndef CONFIG_NUMA + struct crng_state *crng = &primary_crng; +#else + struct crng_state *crng = crng_node_pool[numa_node_id()]; +#endif + + _extract_crng(crng, out); +} + static ssize_t extract_crng_user(void __user *buf, size_t nbytes) { ssize_t ret = 0, i; @@ -1573,6 +1606,22 @@ static void init_std_data(struct entropy_store *r) */ static int rand_initialize(void) { +#ifdef CONFIG_NUMA + int i; + int num_nodes = num_possible_nodes(); + struct crng_state *crng; + + crng_node_pool = kmalloc(num_nodes * sizeof(void *), + GFP_KERNEL|__GFP_NOFAIL); + + for (i=0; i < num_nodes; i++) { + crng = kmalloc(sizeof(struct crng_state), + GFP_KERNEL | __GFP_NOFAIL); + initialize_crng(crng); + crng_node_pool[i] = crng; + + } +#endif init_std_data(&input_pool); init_std_data(&blocking_pool); _initialize_crng(&primary_crng);