From patchwork Fri Oct 6 20:09:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Clash
X-Patchwork-Id: 13411956
From: Dan Clash
To: "audit@vger.kernel.org", "io-uring@vger.kernel.org"
CC: "paul@paul-moore.com", "axboe@kernel.dk", "linux-kernel@vger.kernel.org"
Subject: audit: io_uring openat triggers audit reference count underflow in worker thread
Date: Fri, 6 Oct 2023 20:09:30 +0000
X-Mailing-List: io-uring@vger.kernel.org

This discussion is about how to fix an audit reference count decrement race
between two io_uring threads.

Original discussion link:
https://github.com/axboe/liburing/issues/958

Details:

The test program below hangs indefinitely waiting for an openat cqe.

The reproduction is with a distro kernel Ubuntu-azure-6.2-6.2.0-1012.12_22.04.1.
However, the bug seems possible with an upstream kernel.

An experiment of changing the reference count in struct filename from int to
refcount_t allows the test program to complete.

The bug did not occur with this test program until a kernel containing
commit 5bd2182d58e9 was used.

I have not found a matching reported issue or upstream commit yet.

The dmesg log shows an audit related path:

[27883.992550] kernel BUG at fs/namei.c:262!
[27883.994051] invalid opcode: 0000 [#15] SMP PTI
[27883.995719] CPU: 3 PID: 84988 Comm: iou-wrk-84835 Tainted: G D 6.2.0-1012-azure #12~22.04.1-Ubuntu
[27883.999064] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/09/2022
[27884.002734] RIP: 0010:putname+0x68/0x70
...
[27884.032893] Call Trace:
[27884.034032]  <TASK>
[27884.035117]  ? show_regs+0x6a/0x80
[27884.036763]  ? die+0x38/0xa0
[27884.038023]  ? do_trap+0xd0/0xf0
[27884.039359]  ? do_error_trap+0x70/0x90
[27884.040861]  ? putname+0x68/0x70
[27884.042201]  ? exc_invalid_op+0x53/0x70
[27884.043698]  ? putname+0x68/0x70
[27884.045076]  ? asm_exc_invalid_op+0x1b/0x20
[27884.047051]  ? putname+0x68/0x70
[27884.048415]  audit_reset_context.part.0.constprop.0+0xe1/0x300
[27884.050511]  __audit_uring_exit+0xda/0x1c0
[27884.052100]  io_issue_sqe+0x1f3/0x450
[27884.053702]  ? lock_timer_base+0x3b/0xd0
[27884.055283]  io_wq_submit_work+0x8d/0x2b0
[27884.056848]  ? __try_to_del_timer_sync+0x67/0xa0
[27884.058577]  io_worker_handle_work+0x17c/0x2b0
[27884.060267]  io_wqe_worker+0x10a/0x350
[27884.061714]  ? raw_spin_rq_unlock+0x10/0x30
[27884.063295]  ? finish_task_switch.isra.0+0x8b/0x2c0
[27884.065537]  ? __pfx_io_wqe_worker+0x10/0x10
[27884.067215]  ret_from_fork+0x2c/0x50
[27884.068733] RIP: 0033:0x0
...

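For readers less familiar with why a plain int counter can underflow, here
is a minimal user-space sketch, not kernel code. It replays by hand one
possible interleaving of an unsynchronized refcnt++ and --refcnt on the same
counter. FakeFilename, Outcome and replayLostIncrement are hypothetical
names, and the interleaving is an assumption about how the counter could go
wrong, not a confirmed trace of what the kernel does.

```cpp
#include <cassert>

// Hypothetical stand-in for the pre-patch `int refcnt` in struct filename.
struct FakeFilename {
    int refcnt{1};  // one reference held by the creator, as after getname()
};

// Result of the replay: the counter value right after the race, and how
// many later put operations found the counter already at zero.
struct Outcome {
    int counterAfterRace;
    int underflows;
};

Outcome replayLostIncrement() {
    FakeFilename name;
    name.refcnt++;  // a second holder takes a reference
    assert(name.refcnt == 2);

    // A third holder increments while an existing holder decrements.
    // Each non-atomic update is a load, a modify and a store; the steps
    // of the two "threads" are interleaved by hand here.
    int loadedByPut = name.refcnt;  // thread B (decrement) loads 2
    int loadedByGet = name.refcnt;  // thread A (increment) loads 2
    name.refcnt = loadedByGet + 1;  // thread A stores 3
    name.refcnt = loadedByPut - 1;  // thread B stores 1; A's increment is lost

    Outcome out{name.refcnt, 0};

    // 2 + 1 - 1 = 2 references are still held, but the counter reads 1.
    const int referencesStillHeld = 2;
    for (int i = 0; i < referencesStillHeld; ++i) {
        if (name.refcnt <= 0) {
            ++out.underflows;  // the condition BUG_ON(name->refcnt <= 0) checks
            continue;
        }
        --name.refcnt;
    }
    return out;
}
```

With this interleaving the increment is lost, so the counter reads 1 while
two references are still held. The second put then finds the counter at
zero, which is the condition that BUG_ON(name->refcnt <= 0) in putname()
checks.
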
Test program usage:

./io_uring_open_close_audit_hang --directory /tmp/deleteme --count 10000

Test program source:

// Note: The test program is C++ but could be converted to C.

#include <cassert>
#include <cstdlib>
#include <fcntl.h>
#include <filesystem>
#include <getopt.h>
#include <iostream>
#include <liburing.h>
#include <string>

// open and close a file.  the file is created if it does not exist.
void
openClose(struct io_uring& ring, std::string fileName)
{
    int ret;
    struct io_uring_cqe* cqe {};
    struct io_uring_sqe* sqe {};
    int fd {};
    int flags {O_RDWR | O_CREAT};
    mode_t mode {0666};

    // openat
    sqe = io_uring_get_sqe(&ring);
    assert(sqe != nullptr);

    io_uring_prep_openat(sqe, AT_FDCWD, fileName.data(), flags, mode);
    io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);

    ret = io_uring_submit(&ring);
    assert(ret == 1);

    ret = io_uring_wait_cqe(&ring, &cqe);
    assert(ret == 0);

    fd = cqe->res;
    assert(fd > 0);

    io_uring_cqe_seen(&ring, cqe);

    // close
    sqe = io_uring_get_sqe(&ring);
    assert(sqe != nullptr);

    io_uring_prep_close(sqe, fd);
    io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);

    ret = io_uring_submit(&ring);
    assert(ret == 1);

    // wait for the close to complete.
    ret = io_uring_wait_cqe(&ring, &cqe);
    assert(ret == 0);

    // verify that close succeeded.
    assert(cqe->res == 0);

    io_uring_cqe_seen(&ring, cqe);
}

// create 100 files and then open each file twice.
void
openCloseHang(std::string filePath)
{
    int ret;
    struct io_uring ring;

    ret = io_uring_queue_init(8, &ring, 0);
    assert(0 == ret);

    int repeat {3};
    int numFiles {100};

    std::filesystem::create_directory(filePath);

    // files of length 0 are created in the j==0 iteration below.
    // those files are opened and closed during the j>0 iterations.
    // a repeat of 3 results in a fairly reliable reproduction.
    for (int j = 0; j < repeat; j += 1) {
        for (int i = 0; i < numFiles; i += 1) {
            std::string fileName(filePath + "/file" + std::to_string(i));
            openClose(ring, fileName);
        }
    }

    std::filesystem::remove_all(filePath);

    io_uring_queue_exit(&ring);
}

int
main(int argc, char** argv)
{
    std::string filePath {};
    int iterations {};

    struct option options[]
    {
        {"help", no_argument, 0, 'h'},
        {"directory", required_argument, 0, 'd'},
        {"count", required_argument, 0, 'c'},
        { 0, 0, 0, 0 }
    };

    bool printUsage {false};
    int val {};

    while ((val = getopt_long_only(argc, argv, "", options, nullptr)) != -1) {
        if (val == 'h') {
            printUsage = true;
        } else if (val == 'd') {
            filePath = optarg;
            if (std::filesystem::exists(filePath)) {
                printUsage = true;
                std::cerr << "directory must not exist" << std::endl;
            }
        } else if (val == 'c') {
            iterations = atoi(optarg);
            if (0 == iterations) {
                printUsage = true;
            }
        } else {
            printUsage = true;
        }
    }

    if ((0 == iterations) || (filePath.empty())) {
        printUsage = true;
    }

    if (printUsage || (optind < argc)) {
        std::cerr << "io_uring_open_close_audit_hang.cc --directory DIR --count COUNT" << std::endl;
        exit(1);
    }

    for (int i = 0; i < iterations; i += 1) {
        if (0 == (i % 100)) {
            std::cout << "i=" << std::to_string(i) << std::endl;
        }
        openCloseHang(filePath);
    }

    return 0;
}

Changing the reference count from int to refcount_t allows the test program
to complete using the v6.2 distro kernel. The patch applies and builds on
the upstream v6.1.55 kernel.

Signed-off-by: Dan Clash

diff --git a/fs/namei.c b/fs/namei.c
index 2a8baa6ce3e8..4f7ac131c9d1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -187,7 +187,7 @@ getname_flags(const char __user *filename, int flags, int *empty)
 		}
 	}
 
-	result->refcnt = 1;
+	refcount_set(&result->refcnt, 1);
 	/* The empty path is special. */
 	if (unlikely(!len)) {
 		if (empty)
@@ -248,7 +248,7 @@ getname_kernel(const char * filename)
 	memcpy((char *)result->name, filename, len);
 	result->uptr = NULL;
 	result->aname = NULL;
-	result->refcnt = 1;
+	refcount_set(&result->refcnt, 1);
 	audit_getname(result);
 
 	return result;
@@ -259,9 +259,10 @@ void putname(struct filename *name)
 	if (IS_ERR(name))
 		return;
 
-	BUG_ON(name->refcnt <= 0);
+	BUG_ON(refcount_read(&name->refcnt) == 0);
+	BUG_ON(refcount_read(&name->refcnt) == REFCOUNT_SATURATED);
 
-	if (--name->refcnt > 0)
+	if (!refcount_dec_and_test(&name->refcnt))
 		return;
 
 	if (name->name != name->iname) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d0a54e9aac7a..8217e07726d4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2719,7 +2719,7 @@ struct audit_names;
 struct filename {
 	const char		*name;	/* pointer to actual string */
 	const __user char	*uptr;	/* original userland pointer */
-	int			refcnt;
+	refcount_t		refcnt;
 	struct audit_names	*aname;
 	const char		iname[];
 };
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 37cded22497e..232e0be9f6d9 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -2188,7 +2188,7 @@ __audit_reusename(const __user char *uptr)
 		if (!n->name)
 			continue;
 		if (n->name->uptr == uptr) {
-			n->name->refcnt++;
+			refcount_inc(&n->name->refcnt);
 			return n->name;
 		}
 	}
@@ -2217,7 +2217,7 @@ void __audit_getname(struct filename *name)
 	n->name = name;
 	n->name_len = AUDIT_NAME_FULL;
 	name->aname = n;
-	name->refcnt++;
+	refcount_inc(&name->refcnt);
 }
 
 static inline int audit_copy_fcaps(struct audit_names *name,
@@ -2349,7 +2349,7 @@ void __audit_inode(struct filename *name, const struct dentry *dentry,
 		return;
 	if (name) {
 		n->name = name;
-		name->refcnt++;
+		refcount_inc(&name->refcnt);
 	}
 
 out:
@@ -2474,7 +2474,7 @@ void __audit_inode_child(struct inode *parent,
 		if (found_parent) {
 			found_child->name = found_parent->name;
 			found_child->name_len = AUDIT_NAME_FULL;
-			found_child->name->refcnt++;
+			refcount_inc(&found_child->name->refcnt);
 		}
 	}
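
As a rough user-space analogue of the refcount_inc() and
refcount_dec_and_test() calls in the patch above, the sketch below uses
std::atomic so that concurrent puts from several threads each see a distinct
counter value and exactly one of them observes the final drop. SharedName,
getName, putName and countFinalPuts are hypothetical names for illustration.
The kernel's refcount_t additionally saturates and warns on overflow and
underflow, which this sketch does not model.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical user-space analogue of an object guarded by refcount_t.
struct SharedName {
    std::atomic<int> refcnt{1};  // creator's reference
};

// Analogue of refcount_inc(): take one more reference.
void getName(SharedName& n) {
    n.refcnt.fetch_add(1, std::memory_order_relaxed);
}

// Analogue of refcount_dec_and_test(): drop one reference and report
// whether this caller dropped the last one.
bool putName(SharedName& n) {
    return n.refcnt.fetch_sub(1, std::memory_order_acq_rel) == 1;
}

// Starts `holders` threads (holders >= 1) that each drop one reference,
// and returns how many of them observed the final drop.
int countFinalPuts(int holders) {
    assert(holders >= 1);

    SharedName name;
    for (int i = 1; i < holders; ++i) {
        getName(name);  // counter ends up equal to `holders`
    }

    std::atomic<int> finalPuts{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < holders; ++i) {
        threads.emplace_back([&name, &finalPuts] {
            if (putName(name)) {
                finalPuts.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }

    assert(name.refcnt.load() == 0);  // every reference was dropped once
    return finalPuts.load();
}
```

Calling countFinalPuts(2) models two holders, such as a submitting thread
and a worker thread, releasing the same name. Because the decrement is a
single atomic read-modify-write, only one caller can see the counter go from
1 to 0, so the object would be freed once.
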