From patchwork Fri Aug 9 10:31:27 2024
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13758679
From: Dev Jain
To: akpm@linux-foundation.org, shuah@kernel.org, david@redhat.com,
 willy@infradead.org
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, osalvador@suse.de, baolin.wang@linux.alibaba.com,
 dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
 ioworker0@gmail.com, gshan@redhat.com, mark.rutland@arm.com,
 kirill.shutemov@linux.intel.com, hughd@google.com,
 aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com,
 broonie@kernel.org, mgorman@techsingularity.net,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Dev Jain
Subject: [PATCH 0/2] Improve migration by backing off earlier
Date: Fri, 9 Aug 2024 16:01:27 +0530
Message-Id: <20240809103129.365029-1-dev.jain@arm.com>
X-Mailer: git-send-email 2.34.1

It was recently observed at [1] that during the folio unmapping stage of
migration, when the PTEs are cleared, a racing thread faulting on that
folio may increase the refcount of the folio and then sleep on the folio
lock (the migration path holds the lock). Migration ultimately fails when
it checks the actual refcount against the expected one.

Migration is a best-effort service; the unmap and move phases are each
wrapped in a retry loop. The refcount of the folio is currently checked
only during the move stage; if the check fails, we retry. But if a racing
thread has raised the refcount and ends up sleeping on the folio lock
(which is mostly the case), the refcount cannot drop while migration holds
the lock; as a result, retrying is useless.

In the first patch, we also check the refcount during the unmap stage; if
the check fails, we restore the original state of the PTE, drop the folio
lock, let the system make progress, and retry unmapping. This improves the
probability of migration winning the race.

Given that migration is a best-effort service, it is wrong to fail the
selftest for just a single failure; hence, the second patch fails the test
only after 100 consecutive failures (where 100 is still a subjective
choice).

[1] https://lore.kernel.org/all/20240801081657.1386743-1-dev.jain@arm.com/

Dev Jain (2):
  mm: Retry migration earlier upon refcount mismatch
  selftests/mm: Do not fail test for a single migration failure

 mm/migrate.c                           |  9 +++++++++
 tools/testing/selftests/mm/migration.c | 17 +++++++++++------
 2 files changed, 20 insertions(+), 6 deletions(-)
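
For illustration only, the "fail after 100 consecutive failures" policy
described above could be modelled in plain userspace C roughly as follows.
This is a sketch under stated assumptions, not the code in the selftest
patch: struct failure_tracker, tracker_init() and tracker_record() are
hypothetical names, and the limit is passed in by the caller rather than
hard-coded.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): counts consecutive
 * migration failures and reports when a caller-chosen limit is hit.
 */
struct failure_tracker {
	unsigned int consecutive;	/* failures since the last success */
	unsigned int limit;		/* e.g. 100, a subjective choice */
};

static void tracker_init(struct failure_tracker *t, unsigned int limit)
{
	assert(limit > 0);
	t->consecutive = 0;
	t->limit = limit;
}

/*
 * Record the outcome of one migration attempt.
 * Returns true while the test should keep going, and false once
 * `limit` failures have happened back to back.
 */
static bool tracker_record(struct failure_tracker *t, bool migrated)
{
	if (migrated) {
		/* Any success resets the run of failures. */
		t->consecutive = 0;
		return true;
	}
	return ++t->consecutive < t->limit;
}
```

A test loop would call tracker_record() after every migration attempt and
declare failure only when it returns false, so an isolated lost race does
not fail the whole run.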