From patchwork Mon Oct 3 22:21:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ali Raza
X-Patchwork-Id: 12997843
From: Ali Raza <aliraza@bu.edu>
To: linux-kernel@vger.kernel.org
Cc: corbet@lwn.net, masahiroy@kernel.org, michal.lkml@markovi.net,
	ndesaulniers@google.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	luto@kernel.org, ebiederm@xmission.com, keescook@chromium.org,
	peterz@infradead.org, viro@zeniv.linux.org.uk, arnd@arndb.de,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
	pbonzini@redhat.com, jpoimboe@kernel.org, linux-doc@vger.kernel.org,
	linux-kbuild@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
	x86@kernel.org, rjones@redhat.com, munsoner@bu.edu, tommyu@bu.edu,
	drepper@redhat.com, lwoodman@redhat.com, mboydmcse@gmail.com,
	okrieg@bu.edu, rmancuso@bu.edu, Ali Raza <aliraza@bu.edu>
Subject: [RFC UKL 10/10] Kconfig: Add config option for enabling and sample
 for testing UKL
Date: Mon, 3 Oct 2022 18:21:33 -0400
Message-Id: <20221003222133.20948-11-aliraza@bu.edu>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221003222133.20948-1-aliraza@bu.edu>
References: <20221003222133.20948-1-aliraza@bu.edu>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org

Add the KConfig file that will enable building UKL. Documentation
introduces the technical details for how UKL works and the motivations
behind why it is useful. Sample provides a simple program that still uses
the standard system call interface, but does not require a modified C
library.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Co-developed-by: Eric B Munson <munsoner@bu.edu>
Signed-off-by: Eric B Munson <munsoner@bu.edu>
Co-developed-by: Ali Raza <aliraza@bu.edu>
Signed-off-by: Ali Raza <aliraza@bu.edu>
---
 Documentation/index.rst   |   1 +
 Documentation/ukl/ukl.rst | 104 ++++++++++++++++++++++++++++++++++++++
 Kconfig                   |   2 +
 kernel/Kconfig.ukl        |  41 +++++++++++++++
 samples/ukl/Makefile      |  16 ++++++
 samples/ukl/README        |  17 +++++++
 samples/ukl/syscall.S     |  28 ++++++++++
 samples/ukl/tcp_server.c  |  99 ++++++++++++++++++++++++++++++++++++
 8 files changed, 308 insertions(+)
 create mode 100644 Documentation/ukl/ukl.rst
 create mode 100644 kernel/Kconfig.ukl
 create mode 100644 samples/ukl/Makefile
 create mode 100644 samples/ukl/README
 create mode 100644 samples/ukl/syscall.S
 create mode 100644 samples/ukl/tcp_server.c

diff --git a/Documentation/index.rst b/Documentation/index.rst
index 4737c18c97ff..42f8cb7d4cae 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -167,6 +167,7 @@ to ReStructured Text format, or are simply too old.
 
    tools/index
    staging/index
+   ukl/ukl.rst
 
 
 Translations
diff --git a/Documentation/ukl/ukl.rst b/Documentation/ukl/ukl.rst
new file mode 100644
index 000000000000..a07ebb51169e
--- /dev/null
+++ b/Documentation/ukl/ukl.rst
@@ -0,0 +1,104 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Unikernel Linux (UKL)
+=====================
+
+Unikernel Linux (UKL) is a research project aimed at integrating
+application specific optimizations to the Linux kernel. This RFC aims to
+introduce this research to the community. Any feedback regarding the idea,
+goals, implementation and research is highly appreciated.
+
+Unikernels are specialized operating systems where an application is linked
+directly with the kernel and runs in supervisor mode. This allows the
+developers to implement application specific optimizations to the kernel,
+which can be directly invoked by the application (without going through the
+syscall path). An application can control scheduling and resource
+management and directly access the hardware. The application and kernel can
+be co-optimized, e.g., through LTO, PGO, etc. All of these optimizations,
+and others, provide applications with huge performance benefits over
+general purpose operating systems.
+
+Linux is the de facto operating system of today. Applications depend on its
+battle-tested code base, large developer community, support for legacy
+code, a huge ecosystem of tools and utilities, and a wide range of
+compatible hardware and device drivers. Linux also allows some degree of
+application specific optimizations through build time config options,
+runtime configuration, and recently through eBPF. But still, there is a
+need for even more fine-grained application specific optimizations, and
+some developers resort to kernel bypass techniques.
+
+Unikernel Linux (UKL) aims to get the best of both worlds by bringing
+application specific optimizations to the Linux ecosystem. This way,
+unmodified applications can keep getting the benefits of Linux while taking
+advantage of the unikernel-style optimizations. Optionally, applications
+can be modified to invoke deeper optimizations.
+
+There are two steps to unikernel-izing Linux, i.e., first, equip Linux with
+a unikernel model, and second, actually use that model to implement
+application specific optimizations. This patch focuses on the first part.
+Through this patch, unmodified applications can be built as Linux
+unikernels, albeit with only modest performance advantages. Like
+unikernels, UKL would allow an application to be statically linked into the
+kernel and executed in supervisor mode. However, UKL preserves most of the
+invariants and design of Linux, including a separate page-able application
+portion of the address space and a pinned kernel portion, the ability to
+run multiple processes, and distinct execution modes for application and
+kernel code. Kernel execution mode and application execution mode are
+different, e.g., the application execution mode allows application threads
+to be scheduled, handle signals, etc., which do not apply to kernel
+threads. An application built as a Linux unikernel will have its text and
+data loaded with the kernel at boot time, while the rest of the address
+space would remain unchanged. These applications invoke the system call
+functionality through a function call into the kernel system call entry
+point instead of through the syscall assembly instruction. UKL would
+support a normal userspace so the UKL application can be started, managed,
+profiled, etc., using normal command line utilities.
+
+Once Linux has a unikernel model, different application specific
+optimizations are possible. We have tried a few, e.g., fast system call
+transitions, shared stacks to allow LTO, invoking kernel functions
+directly, etc. We have seen huge performance benefits, details of which are
+not relevant to this patch and can be found in our paper.
+(https://arxiv.org/pdf/2206.00789.pdf)
+
+UKL differs significantly from previous projects, e.g., UML, KML and LKL.
+User Mode Linux (UML) is a virtual machine monitor implemented on the syscall
+interface, a very different goal from UKL. Kernel Mode Linux (KML) allows
+applications to run in kernel mode and replaces syscalls with function
+calls. While KML stops there, UKL goes further. UKL links applications and
+kernel together, which allows further optimizations, e.g., fast system call
+transitions, shared stacks to allow LTO, invoking kernel functions directly,
+etc. Details can be found in the paper linked above. Linux Kernel Library
+(LKL) harvests arch independent code from Linux, takes it to userspace as a
+library to be linked with applications. A host needs to provide arch
+dependent functionality. This model is very different from UKL. A detailed
+discussion of related work is present in the paper linked above.
+
+See samples/ukl for a simple TCP echo server example which can be built as
+a normal user space application and also as a UKL application. In the Linux
+config options, a path to the compiled and partially linked application
+binary can be specified. A kernel built with UKL enabled will search this
+location for the binary and link with the kernel. Applications and required
+libraries need to be compiled with -mno-red-zone -mcmodel=kernel flags
+because kernel mode execution can trample on application red zones and in
+order to link with the kernel and be loaded in the high end of the address
+space, the application should have the correct memory model. Examples of
+other applications like Redis, Memcached, etc., along with glibc and libgcc,
+can be found at https://github.com/unikernelLinux/ukl
+
+List of authors and contributors:
+=================================
+
+Ali Raza - aliraza@bu.edu
+Thomas Unger - tommyu@bu.edu
+Matthew Boyd - mboydmcse@gmail.com
+Eric Munson - munsoner@bu.edu
+Parul Sohal - psohal@bu.edu
+Ulrich Drepper - drepper@redhat.com
+Richard Jones - rjones@redhat.com
+Daniel Bristot de Oliveira - bristot@kernel.org
+Larry Woodman - lwoodman@redhat.com
+Renato Mancuso - rmancuso@bu.edu
+Jonathan Appavoo - jappavoo@bu.edu
+Orran Krieger - okrieg@bu.edu
+
diff --git a/Kconfig b/Kconfig
index 745bc773f567..2a4594ae472c 100644
--- a/Kconfig
+++ b/Kconfig
@@ -29,4 +29,6 @@ source "lib/Kconfig"
 
 source "lib/Kconfig.debug"
 
+source "kernel/Kconfig.ukl"
+
 source "Documentation/Kconfig"
diff --git a/kernel/Kconfig.ukl b/kernel/Kconfig.ukl
new file mode 100644
index 000000000000..c2c5e1003605
--- /dev/null
+++ b/kernel/Kconfig.ukl
@@ -0,0 +1,41 @@
+menuconfig UNIKERNEL_LINUX
+	bool "Unikernel Linux"
+	depends on X86_64 && !RANDOMIZE_BASE && !PAGE_TABLE_ISOLATION
+	help
+	  Unikernel Linux allows for a single, privileged process to be
+	  linked with the kernel binary and be executed in place of or
+	  alongside a more traditional user space.
+
+	  If you don't know what this is, say N.
+
+config UKL_TLS
+	bool "Enable TLS for UKL application"
+	depends on UNIKERNEL_LINUX
+	default y
+	help
+	  Not all applications will make use of thread local storage,
+	  but we need to account for it in the linker script if used.
+	  For the application in samples/ this should be disabled, but
+	  if you are working with glibc this should be 'Y'.
+
+	  If unsure, say 'Y' here.
+
+config UKL_NAME
+	string "UKL Exec target"
+	depends on UNIKERNEL_LINUX
+	default "/UKL"
+	help
+	  We need a way to trigger the start of the UKL application,
+	  either by the kernel in place of init or userspace when setup
+	  is finished. The value given here is compared against the
+	  filename passed to exec and if they match UKL is started.
+	  For a more 'traditional' unikernel model, the value set here
+	  should be given to the init= boot parameter.
+
+config UKL_ARCHIVE_PATH
+	string "Path to static application archive"
+	depends on UNIKERNEL_LINUX
+	default "../UKL.a"
+	help
+	  Where the linker should look for the statically linked application
+	  and dependency archive.
diff --git a/samples/ukl/Makefile b/samples/ukl/Makefile
new file mode 100644
index 000000000000..93beb7750d4b
--- /dev/null
+++ b/samples/ukl/Makefile
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+
+CFLAGS += -I usr/include -fno-PIC -mno-red-zone -mcmodel=kernel
+
+UKL.a: tcp_server.o syscall.o userspace
+	$(AR) cr UKL.a tcp_server.o syscall.o
+	objcopy --prefix-symbols=ukl_ UKL.a
+
+tcp_server.o: tcp_server.c
+syscall.o: syscall.S
+
+userspace:
+	gcc -o tcp_server tcp_server.c
+
+clean:
+	rm -f UKL.a tcp_server.o syscall.o tcp_server
diff --git a/samples/ukl/README b/samples/ukl/README
new file mode 100644
index 000000000000..fbb771da033a
--- /dev/null
+++ b/samples/ukl/README
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+UKL test program
+================
+
+tcp_server.c is an epoll-based TCP echo server written in C which uses port
+no. 5555 by default. syscall.S translates the syscall() function to a call
+instruction in assembly. Normally, C libraries provide a syscall() function
+that translates into the syscall assembly instruction. Run `make` and it
+will create a UKL.a and a tcp_server. UKL.a can then be copied to where the
+UKL Linux build expects it to be present. This can be changed through the
+Linux config options (by running `make menuconfig` etc.). The resulting Linux
+kernel can be run, and once the userspace comes up, the echo server can be
+started by running the UKL exec command, again chosen through the Linux
+config options. tcp_server is a userspace binary of the same echo server
+which can be run normally. This is meant to show that UKL can run code
+which can also be run as a userspace binary without modification.
diff --git a/samples/ukl/syscall.S b/samples/ukl/syscall.S
new file mode 100644
index 000000000000..95d1c177fb05
--- /dev/null
+++ b/samples/ukl/syscall.S
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+	.global _start
+_start:
+	jmp main
+
+	.global syscall
+
+/* Usage: long syscall (syscall_number, arg1, arg2, arg3, arg4, arg5, arg6)
+   We need to do some arg shifting, the syscall_number will be in
+   rax.  */
+
+	.text
+syscall:
+	movq %rdi, %rax		/* Syscall number -> rax.  */
+	movq %rsi, %rdi		/* shift arg1 - arg5.  */
+	movq %rdx, %rsi
+	movq %rcx, %rdx
+	movq %r8, %r10
+	movq %r9, %r8
+	movq 8(%rsp),%r9	/* arg6 is on the stack.  */
+	call entry_SYSCALL_64	/* Do the system call.  */
+	cmpq $-4095, %rax	/* Check %rax for error.  */
+	jae loop		/* Jump to error handler if error.  */
+	ret			/* Return to caller.  */
+
+loop:
+	jmp loop
diff --git a/samples/ukl/tcp_server.c b/samples/ukl/tcp_server.c
new file mode 100644
index 000000000000..abf1a0e2bb79
--- /dev/null
+++ b/samples/ukl/tcp_server.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define _GNU_SOURCE
+#include <stddef.h>		// NULL
+#include <sys/epoll.h>		// struct epoll_event, EPOLL*
+#include <netinet/in.h>		// struct sockaddr_in, socket constants
+#include <netinet/tcp.h>	// TCP_NODELAY
+
+#define BACKLOG 512
+#define MAX_EVENTS 128
+#define MAX_MESSAGE_LEN 2048
+
+void error(char *msg);
+extern long syscall(long number, ...);
+
+int main(void)
+{
+	// some variables we need
+	struct sockaddr_in server_addr, client_addr;
+	socklen_t client_len = sizeof(client_addr);
+	int bytes_received;
+	char buffer[MAX_MESSAGE_LEN];
+	int on;
+	int result;
+	int sock_listen_fd, newsockfd;
+
+	// setup socket
+	sock_listen_fd = syscall(41, AF_INET, SOCK_STREAM, 0); // socket
+	if (sock_listen_fd < 0)
+		error("Error creating socket..\n");
+
+	server_addr.sin_family = AF_INET;
+	server_addr.sin_port = 45845; // htons(5555) on little-endian x86-64
+	server_addr.sin_addr.s_addr = INADDR_ANY;
+
+	// set TCP_NODELAY (54 = setsockopt)
+	on = 1;
+	result = syscall(54, sock_listen_fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
+	if (result < 0)
+		error("Can't set TCP_NODELAY to on");
+
+	// bind (49) the socket and listen (50) for connections
+	if (syscall(49, sock_listen_fd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0)
+		error("Error binding socket..\n");
+
+	if (syscall(50, sock_listen_fd, BACKLOG) < 0)
+		error("Error listening..\n");
+
+	struct epoll_event ev, events[MAX_EVENTS];
+	int new_events, sock_conn_fd, epollfd;
+
+	epollfd = syscall(213, MAX_EVENTS); // epoll_create
+	if (epollfd < 0)
+		error("Error creating epoll..\n");
+
+	ev.events = EPOLLIN;
+	ev.data.fd = sock_listen_fd;
+
+	if (syscall(233, epollfd, EPOLL_CTL_ADD, sock_listen_fd, &ev) == -1) // epoll_ctl
+		error("Error adding new listening socket to epoll..\n");
+
+	while (1) {
+		new_events = syscall(232, epollfd, events, MAX_EVENTS, -1); // epoll_wait
+
+		if (new_events == -1)
+			error("Error in epoll_wait..\n");
+
+		for (int i = 0; i < new_events; ++i) {
+			if (events[i].data.fd == sock_listen_fd) {
+				sock_conn_fd = syscall(288, sock_listen_fd, // accept4
+						       (struct sockaddr *)&client_addr,
+						       &client_len, SOCK_NONBLOCK);
+				if (sock_conn_fd == -1)
+					error("Error accepting new connection..\n");
+
+				ev.events = EPOLLIN | EPOLLET;
+				ev.data.fd = sock_conn_fd;
+				if (syscall(233, epollfd, EPOLL_CTL_ADD, sock_conn_fd, &ev) == -1)
+					error("Error adding new event to epoll..\n");
+			} else {
+				newsockfd = events[i].data.fd;
+				bytes_received = syscall(45, newsockfd, buffer, MAX_MESSAGE_LEN,
+							 0, NULL, NULL); // recvfrom
+				if (bytes_received <= 0) {
+					syscall(233, epollfd, EPOLL_CTL_DEL, newsockfd, NULL);
+					syscall(48, newsockfd, SHUT_RDWR); // shutdown
+				} else {
+					syscall(44, newsockfd, buffer, bytes_received, 0, NULL, 0); // sendto
+				}
+			}
+		}
+	}
+}
+
+void error(char *msg)
+{
+	syscall(1, 1, msg, 15); // write: first 15 bytes of msg to stdout
+	syscall(60, 1); // exit(1)
+}
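
Note on two literals in samples/ukl/tcp_server.c: the sample sets sin_port to 45845 where the README promises port 5555, and error() always writes 15 bytes regardless of the message length. Both follow from the sample avoiding a C library. The sketch below is not part of the patch; it only illustrates how those two values could be derived without libc. The helper names (ukl_swap16, ukl_htons, ukl_strlen, ukl_check_constants) are hypothetical, and the endianness test assumes a GCC/Clang-style compiler that predefines __BYTE_ORDER__.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: swap the two bytes of a
 * 16-bit value. */
uint16_t ukl_swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Hypothetical helper: host-to-network byte order for a port number,
 * without calling into a C library. Assumes the compiler predefines
 * __BYTE_ORDER__ (GCC and Clang do). */
uint16_t ukl_htons(uint16_t host)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	return ukl_swap16(host);
#else
	return host;
#endif
}

/* Hypothetical helper: length of a NUL-terminated message, so that a
 * write could pass the real length instead of a fixed 15. */
unsigned long ukl_strlen(const char *s)
{
	unsigned long n = 0;

	while (s[n] != '\0')
		n++;
	return n;
}

/* Self-check: 5555 is 0x15B3, and 0xB315 is 45845. */
void ukl_check_constants(void)
{
	assert(ukl_swap16(5555) == 45845);
	assert(ukl_swap16(45845) == 5555);
}
```

With helpers along these lines, the assignment could read server_addr.sin_port = ukl_htons(5555), and error() could pass ukl_strlen(msg) as the write length. Whether that is worth the extra code in a minimal sample is a judgment call for the authors.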