From patchwork Wed Nov 20 10:34:52 2024
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 13881007
From: Yunsheng Lin
CC: Yunsheng Lin, Alexander Lobakin, Robin Murphy, Alexander Duyck, Andrew Morton, IOMMU, MM
Subject: [PATCH RFC v4 0/3] fix two bugs related to page_pool
Date: Wed, 20 Nov 2024 18:34:52 +0800
Message-ID: <20241120103456.396577-1-linyunsheng@huawei.com>

Patch 1 fixes a possible time window problem for page_pool.
Patch 2 fixes the kernel crash problem at iommu_get_dma_domain reported
in [1], using scanning to do the unmapping.
Patch 3 avoids calling the DMA sync API after the driver has already
unbound.

From the performance data below, there seems to be no noticeable
performance impact.

Before this patchset:
root@(none)$ taskset -c 0 insmod bench_page_pool_simple.ko
[ 165.357058] bench_page_pool_simple: Loaded
[ 165.438159] time_bench: Type:for_loop Per elem: 0 cycles(tsc) 0.769 ns (step:0) - (measurement period time:0.076973110 sec time_interval:76973110) - (invoke count:100000000 tsc_interval:7697296)
[ 167.423811] time_bench: Type:atomic_inc Per elem: 1 cycles(tsc) 19.683 ns (step:0) - (measurement period time:1.968369340 sec time_interval:1968369340) - (invoke count:100000000 tsc_interval:196836926)
[ 167.591773] time_bench: Type:lock Per elem: 1 cycles(tsc) 15.006 ns (step:0) - (measurement period time:0.150069980 sec time_interval:150069980) - (invoke count:10000000 tsc_interval:15006991)
[ 168.265447] time_bench: Type:rcu Per elem: 0 cycles(tsc) 6.565 ns (step:0) - (measurement period time:0.656564890 sec time_interval:656564890) - (invoke count:100000000 tsc_interval:65656477)
[ 168.282469] bench_page_pool_simple: time_bench_page_pool01_fast_path(): Cannot use page_pool fast-path
[ 168.572734] time_bench: Type:no-softirq-page_pool01 Per elem: 2 cycles(tsc) 28.097 ns (step:0) - (measurement period time:0.280971960 sec time_interval:280971960) - (invoke count:10000000 tsc_interval:28097187)
[ 168.591404] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): Cannot use page_pool fast-path
[ 169.178662] time_bench: Type:no-softirq-page_pool02 Per elem: 5 cycles(tsc) 57.805 ns (step:0) - (measurement period time:0.578052550 sec time_interval:578052550) - (invoke count:10000000 tsc_interval:57805246)
[ 169.197331] bench_page_pool_simple: time_bench_page_pool03_slow(): Cannot use page_pool fast-path
[ 171.033303] time_bench: Type:no-softirq-page_pool03 Per elem: 18 cycles(tsc) 182.711 ns (step:0) - (measurement period time:1.827113580 sec time_interval:1827113580) - (invoke count:10000000 tsc_interval:182711348)
[ 171.052324] bench_page_pool_simple: pp_tasklet_handler(): in_serving_softirq fast-path
[ 171.060227] bench_page_pool_simple: time_bench_page_pool01_fast_path(): in_serving_softirq fast-path
[ 171.350242] time_bench: Type:tasklet_page_pool01_fast_path Per elem: 2 cycles(tsc) 28.089 ns (step:0) - (measurement period time:0.280896430 sec time_interval:280896430) - (invoke count:10000000 tsc_interval:28089636)
[ 171.369517] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): in_serving_softirq fast-path
[ 171.903169] time_bench: Type:tasklet_page_pool02_ptr_ring Per elem: 5 cycles(tsc) 52.461 ns (step:0) - (measurement period time:0.524619700 sec time_interval:524619700) - (invoke count:10000000 tsc_interval:52461966)
[ 171.922357] bench_page_pool_simple: time_bench_page_pool03_slow(): in_serving_softirq fast-path
[ 173.851219] time_bench: Type:tasklet_page_pool03_slow Per elem: 19 cycles(tsc) 192.017 ns (step:0) - (measurement period time:1.920178560 sec time_interval:1920178560) - (invoke count:10000000 tsc_interval:192017848)

After this patchset:
root@(none)$ taskset -c 0 insmod bench_page_pool_simple.ko
[ 394.337302] bench_page_pool_simple: Loaded
[ 394.418402] time_bench: Type:for_loop Per elem: 0 cycles(tsc) 0.769 ns (step:0) - (measurement period time:0.076976830 sec time_interval:76976830) - (invoke count:100000000 tsc_interval:7697673)
[ 396.168990] time_bench: Type:atomic_inc Per elem: 1 cycles(tsc) 17.333 ns (step:0) - (measurement period time:1.733304770 sec time_interval:1733304770) - (invoke count:100000000 tsc_interval:173330470)
[ 396.336932] time_bench: Type:lock Per elem: 1 cycles(tsc) 15.005 ns (step:0) - (measurement period time:0.150052930 sec time_interval:150052930) - (invoke count:10000000 tsc_interval:15005288)
[ 397.008173] time_bench: Type:rcu Per elem: 0 cycles(tsc) 6.541 ns (step:0) - (measurement period time:0.654135460 sec time_interval:654135460) - (invoke count:100000000 tsc_interval:65413540)
[ 397.025193] bench_page_pool_simple: time_bench_page_pool01_fast_path(): Cannot use page_pool fast-path
[ 397.295761] time_bench: Type:no-softirq-page_pool01 Per elem: 2 cycles(tsc) 26.127 ns (step:0) - (measurement period time:0.261275610 sec time_interval:261275610) - (invoke count:10000000 tsc_interval:26127555)
[ 397.314429] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): Cannot use page_pool fast-path
[ 397.852216] time_bench: Type:no-softirq-page_pool02 Per elem: 5 cycles(tsc) 52.858 ns (step:0) - (measurement period time:0.528581530 sec time_interval:528581530) - (invoke count:10000000 tsc_interval:52858146)
[ 397.870887] bench_page_pool_simple: time_bench_page_pool03_slow(): Cannot use page_pool fast-path
[ 399.701260] time_bench: Type:no-softirq-page_pool03 Per elem: 18 cycles(tsc) 182.151 ns (step:0) - (measurement period time:1.821514450 sec time_interval:1821514450) - (invoke count:10000000 tsc_interval:182151437)
[ 399.720282] bench_page_pool_simple: pp_tasklet_handler(): in_serving_softirq fast-path
[ 399.728186] bench_page_pool_simple: time_bench_page_pool01_fast_path(): in_serving_softirq fast-path
[ 399.998947] time_bench: Type:tasklet_page_pool01_fast_path Per elem: 2 cycles(tsc) 26.164 ns (step:0) - (measurement period time:0.261642940 sec time_interval:261642940) - (invoke count:10000000 tsc_interval:26164289)
[ 400.018223] bench_page_pool_simple: time_bench_page_pool02_ptr_ring(): in_serving_softirq fast-path
[ 400.621035] time_bench: Type:tasklet_page_pool02_ptr_ring Per elem: 5 cycles(tsc) 59.377 ns (step:0) - (measurement period time:0.593779950 sec time_interval:593779950) - (invoke count:10000000 tsc_interval:59377988)
[ 400.640223] bench_page_pool_simple: time_bench_page_pool03_slow(): in_serving_softirq fast-path
[ 402.524760] time_bench: Type:tasklet_page_pool03_slow Per elem: 18 cycles(tsc) 187.585 ns (step:0) - (measurement period time:1.875853550 sec time_interval:1875853550) - (invoke count:10000000 tsc_interval:187585349)

1. https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/

CC: Alexander Lobakin
CC: Robin Murphy
CC: Alexander Duyck
CC: Andrew Morton
CC: IOMMU
CC: MM

Change log:
V4:
  1. Use scanning to do the unmapping.
  2. Split dma sync skipping into a separate patch.

V3:
  1. Target net-next tree instead of net tree.
  2. Narrow the rcu lock as discussed in v2.
  3. Check the unmapping cnt against the inflight cnt.

V2:
  1. Add an item_full stat.
  2. Use container_of() for page_pool_to_pp().

Yunsheng Lin (3):
  page_pool: fix timing for checking and disabling napi_local
  page_pool: fix IOMMU crash when driver has already unbound
  page_pool: skip dma sync operation for inflight pages

 include/net/page_pool/types.h |   6 +-
 net/core/page_pool.c          | 135 ++++++++++++++++++++++++++++------
 2 files changed, 119 insertions(+), 22 deletions(-)
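
For readers who want to compare the two bench_page_pool_simple runs
quoted in this cover letter, the following is a minimal sketch (not part
of the series) that computes the relative change of the per-element
latency for each page_pool test. The ns values are copied from the
time_bench output in this mail; the helper names relative_change() and
compare() are hypothetical and exist only for this illustration.

```python
# Per-element latencies in ns, copied from the time_bench lines quoted
# in this cover letter (one run before, one run after the patchset).
before = {
    "no-softirq-page_pool01": 28.097,
    "no-softirq-page_pool02": 57.805,
    "no-softirq-page_pool03": 182.711,
    "tasklet_page_pool01_fast_path": 28.089,
    "tasklet_page_pool02_ptr_ring": 52.461,
    "tasklet_page_pool03_slow": 192.017,
}
after = {
    "no-softirq-page_pool01": 26.127,
    "no-softirq-page_pool02": 52.858,
    "no-softirq-page_pool03": 182.151,
    "tasklet_page_pool01_fast_path": 26.164,
    "tasklet_page_pool02_ptr_ring": 59.377,
    "tasklet_page_pool03_slow": 187.585,
}


def relative_change(before_ns, after_ns):
    """Return the change from before to after as a percentage of before."""
    return (after_ns - before_ns) / before_ns * 100.0


def compare(before_run, after_run):
    """Map each test name to its relative latency change in percent."""
    return {
        name: relative_change(before_run[name], after_run[name])
        for name in before_run
    }


if __name__ == "__main__":
    for name, pct in compare(before, after).items():
        print(f"{name:32s} {before[name]:8.3f} -> {after[name]:8.3f} ns ({pct:+.1f}%)")
```

A negative percentage means the test was faster after the patchset. The
differences go in both directions across the six tests, and each figure
comes from a single run, so the script says nothing about run-to-run
variation.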