[v10,3/4] random: introduce generic vDSO getrandom() implementation

Provide a generic C vDSO getrandom() implementation, which operates on
an opaque state returned by vgetrandom_alloc() and produces random bytes
the same way as getrandom(). This has a the API signature:

  ssize_t vgetrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state);

The return value and the first 3 arguments are the same as ordinary
getrandom(), while the last argument is a pointer to the opaque
allocated state. Were all four arguments passed to the getrandom()
syscall, nothing different would happen, and the functions would have
the exact same behavior.

The actual vDSO RNG algorithm implemented is the same one implemented by
drivers/char/random.c, using the same fast-erasure techniques as that.
Should the in-kernel implementation change, so too will the vDSO one.

It requires an implementation of ChaCha20 that does not use any stack,
in order to maintain forward secrecy if a multi-threaded program forks
(though this does not account for a similar issue with SA_SIGINFO
copying registers to the stack), so this is left as an
architecture-specific fill-in. Stack-less ChaCha20 is an easy algorithm
to implement on a variety of architectures, so this shouldn't be too
onerous.

Initially, the state is keyless, and so the first call makes a
getrandom() syscall to generate that key, and then uses it for
subsequent calls. By keeping track of a generation counter, it knows
when its key is invalidated and it should fetch a new one using the
syscall. Later, more than just a generation counter might be used.

Since MADV_WIPEONFORK is set on the opaque state, the key and related
state is wiped during a fork(), so secrets don't roll over into new
processes, and the same state doesn't accidentally generate the same
random stream. The generation counter, as well, is always >0, so that
the 0 counter is a useful indication of a fork() or otherwise
uninitialized state.

If the kernel RNG is not yet initialized, then the vDSO always calls the
syscall, because that behavior cannot be emulated in userspace, but
fortunately that state is short lived and only during early boot. If it
has been initialized, then there is no need to inspect the `flags`
argument, because the behavior does not change post-initialization
regardless of the `flags` value.

Since the opaque state passed to it is mutated, vDSO getrandom() is not
reentrant, when used with the same opaque state, which libc should be
mindful of.

vgetrandom_alloc() and vDSO getrandom() together provide the ability for
userspace to generate random bytes quickly and safely, and is intended
to be integrated into libc's thread management. As an illustrative
example, the following code might be used to do the same outside of
libc. All of the static functions are to be considered implementation
private, including the vgetrandom_alloc() syscall wrapper, which
generally shouldn't be exposed outside of libc, with the non-static
vgetrandom() function at the end being the exported interface. The
various pthread-isms are expected to be elided into libc internals. This
per-thread allocation scheme is very naive and does not shrink; other
implementations may choose to be more complex.

  static void *vgetrandom_alloc(unsigned int *num, unsigned int *size_per_each, unsigned int flags)
  {
    long ret = syscall(__NR_vgetrandom_alloc, &num, &size_per_each, flags);
    return ret == -1 ? NULL : (void *)ret;
  }

  static struct {
    pthread_mutex_t lock;
    void **states;
    size_t len, cap;
  } grnd_allocator = {
    .lock = PTHREAD_MUTEX_INITIALIZER
  };

  static void *vgetrandom_get_state(void)
  {
    void *state = NULL;

    pthread_mutex_lock(&grnd_allocator.lock);
    if (!grnd_allocator.len) {
      size_t new_cap;
      unsigned int size_per_each, num = 16; /* Just a hint. Could also be nr_cpus. */
      void *new_block = vgetrandom_alloc(&num, &size_per_each, 0), *new_states;

      if (!new_block)
        goto out;
      new_cap = grnd_allocator.cap + num;
      new_states = reallocarray(grnd_allocator.states, new_cap, sizeof(*grnd_allocator.states));
      if (!new_states) {
        munmap(new_block, num * size_per_each);
        goto out;
      }
      grnd_allocator.cap = new_cap;
      grnd_allocator.states = new_states;

      for (size_t i = 0; i < num; ++i) {
        grnd_allocator.states[i] = new_block;
        new_block += size_per_each;
      }
      grnd_allocator.len = num;
    }
    state = grnd_allocator.states[--grnd_allocator.len];

  out:
    pthread_mutex_unlock(&grnd_allocator.lock);
    return state;
  }

  static void vgetrandom_put_state(void *state)
  {
    if (!state)
      return;
    pthread_mutex_lock(&grnd_allocator.lock);
    grnd_allocator.states[grnd_allocator.len++] = state;
    pthread_mutex_unlock(&grnd_allocator.lock);
  }

  static struct {
    ssize_t(*fn)(void *buf, size_t len, unsigned long flags, void *state);
    pthread_key_t key;
    pthread_once_t initialized;
  } grnd_ctx = {
    .initialized = PTHREAD_ONCE_INIT
  };

  static void vgetrandom_init(void)
  {
    if (pthread_key_create(&grnd_ctx.key, vgetrandom_put_state) != 0)
      return;
    grnd_ctx.fn = __vdsosym("LINUX_2.6", "__vdso_getrandom");
  }

  ssize_t vgetrandom(void *buf, size_t len, unsigned long flags)
  {
    void *state;

    pthread_once(&grnd_ctx.initialized, vgetrandom_init);
    if (!grnd_ctx.fn)
      return getrandom(buf, len, flags);
    state = pthread_getspecific(grnd_ctx.key);
    if (!state) {
      state = vgetrandom_get_state();
      if (pthread_setspecific(grnd_ctx.key, state) != 0) {
        vgetrandom_put_state(state);
        state = NULL;
      }
      if (!state)
        return getrandom(buf, len, flags);
    }
    return grnd_ctx.fn(buf, len, flags, state);
  }

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 MAINTAINERS             |   1 +
 drivers/char/random.c   |   9 ++
 include/vdso/datapage.h |  11 +++
 lib/vdso/Kconfig        |   7 +-
 lib/vdso/getrandom.c    | 204 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 lib/vdso/getrandom.c

Message ID	20221129210639.42233-4-Jason@zx2c4.com (mailing list archive)
State	Not Applicable
Delegated to:	Herbert Xu
Headers	show Return-Path: <linux-crypto-owner@kernel.org> X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB7A2C4332F for <linux-crypto@archiver.kernel.org>; Tue, 29 Nov 2022 21:07:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236720AbiK2VHZ (ORCPT <rfc822;linux-crypto@archiver.kernel.org>); Tue, 29 Nov 2022 16:07:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236697AbiK2VHU (ORCPT <rfc822;linux-crypto@vger.kernel.org>); Tue, 29 Nov 2022 16:07:20 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8960358BDE; Tue, 29 Nov 2022 13:07:08 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 20BEC6191A; Tue, 29 Nov 2022 21:07:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 388FEC433B5; Tue, 29 Nov 2022 21:07:06 +0000 (UTC) Authentication-Results: smtp.kernel.org; dkim=pass (1024-bit key) header.d=zx2c4.com header.i=@zx2c4.com header.b="l98jJRu4" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1669756025; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kNiuwmSJOw6Ap+zxCQGqOrHjw6UOb8uPbeHwYkOyWQY=; b=l98jJRu4U/7IDK3cUik0a9wY+DuHsYykhQrPjpf+IipKTbRIEvnosQZyiBeJ3vLJMAW96D hqnSOVmH1QYBBpuMJ6d/E0AW5YbCNaJCP4FA4QEa6LuT2f9/ZuHvT7w2GMrDBre0czJyAA T/1SCQHE4Ev70vDDi2lribYrzc375+c= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 770b3365 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 29 Nov 2022 21:07:05 +0000 (UTC) From: "Jason A. Donenfeld" <Jason@zx2c4.com> To: linux-kernel@vger.kernel.org, patches@lists.linux.dev, tglx@linutronix.de Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>, linux-crypto@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>, Carlos O'Donell <carlos@redhat.com>, Florian Weimer <fweimer@redhat.com>, Arnd Bergmann <arnd@arndb.de>, Christian Brauner <brauner@kernel.org> Subject: [PATCH v10 3/4] random: introduce generic vDSO getrandom() implementation Date: Tue, 29 Nov 2022 22:06:38 +0100 Message-Id: <20221129210639.42233-4-Jason@zx2c4.com> In-Reply-To: <20221129210639.42233-1-Jason@zx2c4.com> References: <20221129210639.42233-1-Jason@zx2c4.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: <linux-crypto.vger.kernel.org> X-Mailing-List: linux-crypto@vger.kernel.org
Series	implement getrandom() in vDSO \| expand [v10,0/4] implement getrandom() in vDSO [v10,1/4] random: add vgetrandom_alloc() syscall [v10,2/4] arch: allocate vgetrandom_alloc() syscall number [v10,3/4] random: introduce generic vDSO getrandom() implementation [v10,4/4] x86: vdso: Wire up getrandom() vDSO implementation

[v10,3/4] random: introduce generic vDSO getrandom() implementation

Commit Message

Comments

Patch