From patchwork Tue Sep 10 18:21:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13799212
Message-ID: <02ffa542-ce49-4755-9d2b-29841f9973e0@kernel.dk>
Date: Tue, 10 Sep 2024 12:21:42 -0600
To: Linux-MM
Cc: Yu Zhao, Johannes Weiner, Andrew Morton, Muchun Song
From: Jens Axboe
Subject: Hugepage program taking forever to exit

Hi,

Investigating another issue, I wrote the following simple program that
allocates and faults in 500 1GB huge pages, and then registers them
with io_uring. Each step is timed:

Got 500 huge pages (each 1024MB) in 0 msec
Faulted in 500 huge pages in 38632 msec
Registered 500 pages in 867 msec

and as expected, faulting in the pages takes (by far) the longest.
From the above, you'd also expect the total runtime to be around 39
seconds. But it is not... In fact it takes 82 seconds in total for this
program to have exited. Looking at why, I see:

[<0>] __wait_rcu_gp+0x12b/0x160
[<0>] synchronize_rcu_normal.part.0+0x2a/0x30
[<0>] hugetlb_vmemmap_restore_folios+0x22/0xe0
[<0>] update_and_free_pages_bulk+0x4c/0x220
[<0>] return_unused_surplus_pages+0x80/0xa0
[<0>] hugetlb_acct_memory.part.0+0x2dd/0x3b0
[<0>] hugetlb_vm_op_close+0x160/0x180
[<0>] remove_vma+0x20/0x60
[<0>] exit_mmap+0x199/0x340
[<0>] mmput+0x49/0x110
[<0>] do_exit+0x261/0x9b0
[<0>] do_group_exit+0x2c/0x80
[<0>] __x64_sys_exit_group+0x14/0x20
[<0>] x64_sys_call+0x714/0x720
[<0>] do_syscall_64+0x5b/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53

and yes, it does look like the program is idle for most of the time
while returning these huge pages. It's also telling us exactly why
we're just sitting idle: we're waiting on an RCU grace period.

The below quick change means the runtime of the program is pretty much
just the time it takes to execute the parts of it, as you can see from
the full output after the change:

axboe@r7525 ~> time sudo ./reg-huge
Got 500 huge pages (each 1024MB) in 0 msec
Faulted in 500 huge pages in 38632 msec
Registered 500 pages in 867 msec

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 0c3f56b3578e..95f6ad8f8232 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -517,7 +517,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 	long ret = 0;
 
 	/* avoid writes from page_ref_add_unless() while unfolding vmemmap */
-	synchronize_rcu();
+	synchronize_rcu_expedited();
 
 	list_for_each_entry_safe(folio, t_folio, folio_list, lru) {
 		if (folio_test_hugetlb_vmemmap_optimized(folio)) {
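
For reference, the gap between the timed steps and the wall-clock
runtime before the change can be worked out with a small helper. This
is only an illustrative sketch: the function names sum_msec() and
unaccounted_msec() are made up here and are not part of the test
program, and the 82000 msec figure used in the note below is the
rounded "82 seconds" quoted earlier, so the result is approximate.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-step timings (in msec) reported by the test program. */
long sum_msec(const long *steps, size_t n)
{
	long total = 0;

	/* A NULL array is only acceptable when there is nothing to sum. */
	assert(steps != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		total += steps[i];

	return total;
}

/*
 * Wall-clock time (in msec) that is not covered by any timed step.
 * wall_msec is an assumption supplied by the caller, for example the
 * total runtime as measured with time(1).
 */
long unaccounted_msec(long wall_msec, const long *steps, size_t n)
{
	return wall_msec - sum_msec(steps, n);
}
```

With the timings above, { 0, 38632, 867 }, sum_msec() gives 39499 msec
for the three timed steps, and unaccounted_msec(82000, ...) gives
roughly 42501 msec. In other words, a bit more than half of the total
runtime is spent after the program's own work is done, which matches
the exit path shown in the stack trace.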