From patchwork Mon Feb 1 17:24:52 2021
X-Patchwork-Id: 12059565
Subject: [PATCH v5 0/2] System Generation ID driver and VMGENID backend
Date: Mon, 1 Feb 2021 19:24:52 +0200
Message-ID: <1612200294-17561-1-git-send-email-acatan@amazon.com>
Cc: Jason@zx2c4.com, areber@redhat.com, mst@redhat.com,
 ghammer@redhat.com, vijaysun@ca.ibm.com, 0x7f454c46@gmail.com,
 mhocko@kernel.org, dgunigun@redhat.com, avagin@gmail.com, pavel@ucw.cz,
 ptikhomirov@virtuozzo.com, corbet@lwn.net, mpe@ellerman.id.au,
 rafael@kernel.org, ebiggers@kernel.org, borntraeger@de.ibm.com,
 sblbir@amazon.com, bonzini@gnu.org, arnd@arndb.de, jannh@google.com,
 raduweis@amazon.com, asmehra@redhat.com, Adrian Catangiu,
 graf@amazon.com, rppt@kernel.org, luto@kernel.org, gil@azul.com,
 oridgar@gmail.com, colmmacc@amazon.com, tytso@mit.edu,
 gregkh@linuxfoundation.org, rdunlap@infradead.org,
 ebiederm@xmission.com, ovzxemul@gmail.com, w@1wt.eu, dwmw@amazon.co.uk
Reply-to: Adrian Catangiu
X-Patchwork-Original-From: acatan--- via
From: "Zhijian Li (Fujitsu)\" via"

This feature is aimed at virtualized or containerized environments
where VM or container snapshotting duplicates memory state, which is a
challenge for applications that want to generate unique data such as
request IDs, UUIDs, and cryptographic nonces.

The patch set introduces a mechanism that provides a userspace
interface for applications and libraries to be made aware of
uniqueness-breaking events such as VM or container snapshotting, and
allows them to react and adapt to such events.

Solving the uniqueness problem strongly enough for cryptographic
purposes requires a mechanism which can deterministically reseed
userspace PRNGs with new entropy at restore time.
This mechanism must also support the high-throughput and low-latency
use-cases that led programmers to pick a userspace PRNG in the first
place; be usable by both application code and libraries; allow
transparent retrofitting behind existing popular PRNG interfaces
without changing application code; be efficient, especially on
snapshot restore; and be simple enough for wide adoption.

The first patch in the set implements a device driver which exposes a
read-only device /dev/sysgenid to userspace, which contains a
monotonically increasing u32 generation counter. Libraries and
applications are expected to open() the device, and then call read()
which blocks until the SysGenId changes. Following an update, read()
calls no longer block until the application acknowledges the new
SysGenId by write()ing it back to the device. Non-blocking read() calls
return EAGAIN when there is no new SysGenId available. Alternatively,
libraries can mmap() the device to get a single shared page which
contains the latest SysGenId at offset 0.

SysGenId also supports a waiting mechanism exposed through an ioctl on
the device. The SYSGENID_WAIT_WATCHERS ioctl blocks until there are no
open file handles on the device which haven’t acknowledged the new id.
This waiting mechanism is intended for serverless and container control
planes, which want to confirm that all application code has detected
and reacted to the new SysGenId before sending an invoke to the
newly-restored sandbox.

The second patch in the set adds a VmGenId driver which makes use of
the ACPI vmgenid device to drive SysGenId and to reseed kernel entropy
on VM snapshots.

---

v4 -> v5:

  - sysgenid: generation changes are also exported through uevents
  - remove SYSGENID_GET_OUTDATED_WATCHERS ioctl
  - document sysgenid ioctl major/minor numbers
  - rebase on linus latest + various nits

v3 -> v4:

  - split functionality in two separate kernel modules:
    1. drivers/misc/sysgenid.c which provides the generic userspace
       interface and mechanisms
    2. drivers/virt/vmgenid.c as VMGENID acpi device driver that seeds
       kernel entropy and acts as a driving backend for the generic
       sysgenid
  - renamed /dev/vmgenid -> /dev/sysgenid
  - renamed uapi header file vmgenid.h -> sysgenid.h
  - renamed ioctls VMGENID_* -> SYSGENID_*
  - added ‘min_gen’ parameter to SYSGENID_FORCE_GEN_UPDATE ioctl
  - fixed races in documentation examples
  - various style nits
  - rebased on top of linus latest

v2 -> v3:

  - separate the core driver logic and interface from the ACPI device.
    The ACPI vmgenid device is now one possible backend.
  - fix issue when timeout=0 in VMGENID_WAIT_WATCHERS
  - add locking to avoid races between fs ops handlers and hw irq
    driven generation updates
  - change VMGENID_WAIT_WATCHERS ioctl so if the current caller is
    outdated or a generation change happens while waiting (thus making
    current caller outdated), the ioctl returns -EINTR to signal the
    user to handle event and retry. Fixes blocking on oneself.
  - add VMGENID_FORCE_GEN_UPDATE ioctl conditioned by
    CAP_CHECKPOINT_RESTORE capability, through which software can force
    generation bump.

v1 -> v2:

  - expose to userspace a monotonically increasing u32 Vm Gen Counter
    instead of the hw VmGen UUID
  - since the hw/hypervisor-provided 128-bit UUID is not public
    anymore, add it to the kernel RNG as device randomness
  - insert driver page containing Vm Gen Counter in the user vma in the
    driver's mmap handler instead of using a fault handler
  - turn driver into a misc device driver to auto-create /dev/vmgenid
  - change ioctl arg to avoid leaking kernel structs to userspace
  - update documentation
  - various nits
  - rebase on top of linus latest

Adrian Catangiu (2):
  drivers/misc: sysgenid: add system generation id driver
  drivers/virt: vmgenid: add vm generation id driver

 Documentation/misc-devices/sysgenid.rst            | 236 ++++++++++++++++
 Documentation/userspace-api/ioctl/ioctl-number.rst |   1 +
 Documentation/virt/vmgenid.rst                     |  34 +++
 MAINTAINERS                                        |  15 +
 drivers/misc/Kconfig                               |  16 ++
 drivers/misc/Makefile                              |   1 +
 drivers/misc/sysgenid.c                            | 307 +++++++++++++++++++++
 drivers/virt/Kconfig                               |  14 +
 drivers/virt/Makefile                              |   1 +
 drivers/virt/vmgenid.c                             | 153 ++++++++++
 include/uapi/linux/sysgenid.h                      |  17 ++
 11 files changed, 795 insertions(+)
 create mode 100644 Documentation/misc-devices/sysgenid.rst
 create mode 100644 Documentation/virt/vmgenid.rst
 create mode 100644 drivers/misc/sysgenid.c
 create mode 100644 drivers/virt/vmgenid.c
 create mode 100644 include/uapi/linux/sysgenid.h
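
For illustration only, here is a rough userspace sketch of how a library
might use the mmap() interface described in the cover letter to reseed a
PRNG after a generation change. It is not taken from the patches. The
names guarded_prng, map_sysgenid(), fresh_seed(), guarded_prng_init()
and guarded_prng_next() are hypothetical, and so is the use of
MAP_SHARED with PROT_READ. The xorshift64 generator is a stand-in that
is not cryptographically secure, and the plain volatile loads ignore the
memory-ordering questions a real implementation would have to address.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical wrapper: a userspace PRNG tied to a generation counter. */
struct guarded_prng {
	const volatile uint32_t *gen;	/* SysGenId, e.g. offset 0 of the page */
	uint32_t cached_gen;		/* generation the current seed belongs to */
	uint64_t state;			/* xorshift64 state, NOT crypto-grade */
	unsigned int reseeds;		/* number of times the state was seeded */
};

/* Map the shared page (assumed flags); NULL if the device is unavailable. */
const volatile uint32_t *map_sysgenid(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd = open("/dev/sysgenid", O_RDONLY);
	void *addr;

	if (fd < 0)
		return NULL;
	if (page <= 0) {
		close(fd);
		return NULL;
	}
	addr = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* a mapping stays valid after close() */
	return addr == MAP_FAILED ? NULL : addr;
}

/* Fetch 64 bits from /dev/urandom; abort rather than run with a weak seed. */
uint64_t fresh_seed(void)
{
	uint64_t seed;
	FILE *f = fopen("/dev/urandom", "rb");

	if (!f || fread(&seed, sizeof(seed), 1, f) != 1)
		abort();
	fclose(f);
	return seed | 1;	/* xorshift64 must never be seeded with zero */
}

void guarded_prng_init(struct guarded_prng *p, const volatile uint32_t *gen)
{
	assert(p && gen);
	p->gen = gen;
	p->cached_gen = *gen;
	p->state = fresh_seed();
	p->reseeds = 1;
}

uint64_t guarded_prng_next(struct guarded_prng *p)
{
	for (;;) {
		uint32_t seen = *p->gen;
		uint64_t x;

		if (seen != p->cached_gen) {
			/* Generation changed (e.g. snapshot restore): reseed. */
			p->state = fresh_seed();
			p->cached_gen = seen;
			p->reseeds++;
		}
		x = p->state;
		x ^= x << 13;
		x ^= x >> 7;
		x ^= x << 17;
		p->state = x;
		/* Discard the value if the generation moved meanwhile. */
		if (*p->gen == seen)
			return x;
	}
}
```

A caller would pass the pointer returned by map_sysgenid() to
guarded_prng_init() and fall back to some other strategy when it returns
NULL. Re-checking the counter after producing a value means a number
computed across a generation change is dropped rather than returned.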