From patchwork Tue Oct 3 07:44:45 2023
X-Patchwork-Submitter: "Kasireddy, Vivek"
X-Patchwork-Id: 13406980
From: Vivek Kasireddy <vivek.kasireddy@intel.com>
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: David Hildenbrand, Daniel Vetter, Mike Kravetz, Hugh Dickins, Peter Xu,
    Gerd Hoffmann, Dongwon Kim, Junxiao Chang, Jason Gunthorpe
Subject: [PATCH v1 1/3] mm/gup: Introduce pin_user_pages_fd() for pinning shmem/hugetlbfs file pages
Date: Tue, 3 Oct 2023 00:44:45 -0700
Message-Id: <20231003074447.3245729-2-vivek.kasireddy@intel.com>
In-Reply-To: <20231003074447.3245729-1-vivek.kasireddy@intel.com>
References: <20231003074447.3245729-1-vivek.kasireddy@intel.com>

For drivers that would like to longterm-pin the pages associated with a file,
the pin_user_pages_fd() API provides an option to not only FOLL_PIN the pages
but also to check and migrate them if they reside in the movable zone or a CMA
block. For now, this API only works with files belonging to shmem or hugetlbfs,
given that the udmabuf driver is the only user.

Note that the pages associated with hugetlbfs files are expected to be found in
the page cache; an error is returned if they are not. Shmem pages, however, are
swapped in or allocated if they are not already present in the page cache.
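As an illustration of how a driver might consume this API (this sketch is not
part of the patch; the function name, allocation strategy and error handling
are made up), a longterm pin of the first nr_pages pages of a shmem or
hugetlbfs backed fd could look roughly like this:

/* Hypothetical driver-side sketch, not part of this patch: longterm-pin the
 * first nr_pages pages of a shmem/hugetlbfs fd handed in by userspace and
 * release them again with unpin_user_page().
 */
#include <linux/mm.h>
#include <linux/slab.h>

static long example_pin_memfd_pages(int fd, unsigned long nr_pages)
{
	struct page **pages;
	long pinned, i;

	pages = kvmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* FOLL_PIN is applied internally; per the description above, pages in
	 * the movable zone or a CMA block are migrated before being pinned.
	 */
	pinned = pin_user_pages_fd(fd, 0, nr_pages, FOLL_LONGTERM, pages);
	if (pinned < 0)
		goto out;

	/* ... map the pinned pages for DMA and use them here ... */

	for (i = 0; i < pinned; i++)
		unpin_user_page(pages[i]);
	pinned = 0;
out:
	kvfree(pages);
	return pinned;
}

On success, pin_user_pages_fd() returns the number of pages requested, so the
unpin loop above releases exactly what was pinned.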
Cc: David Hildenbrand
Cc: Daniel Vetter
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang
Suggested-by: Jason Gunthorpe
Signed-off-by: Vivek Kasireddy
---
 include/linux/mm.h |  2 ++
 mm/gup.c           | 87 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 89 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bf5d0b1b16f4..af2121fb8101 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2457,6 +2457,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 		    struct page **pages, unsigned int gup_flags);
 long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 		    struct page **pages, unsigned int gup_flags);
+long pin_user_pages_fd(int fd, pgoff_t start, unsigned long nr_pages,
+		    unsigned int gup_flags, struct page **pages);
 
 int get_user_pages_fast(unsigned long start, int nr_pages,
 		    unsigned int gup_flags, struct page **pages);
diff --git a/mm/gup.c b/mm/gup.c
index 2f8a2d89fde1..e34b77a15fa8 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3400,3 +3400,90 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 			       &locked, gup_flags);
 }
 EXPORT_SYMBOL(pin_user_pages_unlocked);
+
+/**
+ * pin_user_pages_fd() - pin user pages associated with a file
+ * @fd: the fd whose pages are to be pinned
+ * @start: starting file offset
+ * @nr_pages: number of pages from start to pin
+ * @gup_flags: flags modifying pin behaviour
+ * @pages: array that receives pointers to the pages pinned.
+ *         Should be at least nr_pages long.
+ *
+ * Attempt to pin (and migrate) pages associated with a file belonging to
+ * either shmem or hugetlbfs. An error is returned if pages associated with
+ * hugetlbfs files are not present in the page cache. However, shmem pages
+ * are swapped in or allocated if they are not present in the page cache.
+ *
+ * Returns number of pages pinned. This would be equal to the number of
+ * pages requested.
+ * If nr_pages is 0 or negative, returns 0. If no pages were pinned, returns
+ * -errno.
+ */
+long pin_user_pages_fd(int fd, pgoff_t start, unsigned long nr_pages,
+		       unsigned int gup_flags, struct page **pages)
+{
+	struct page *page;
+	struct file *filep;
+	unsigned int flags, i;
+	long ret;
+
+	if (nr_pages <= 0)
+		return 0;
+	if (!is_valid_gup_args(pages, NULL, &gup_flags, FOLL_PIN))
+		return 0;
+
+	if (start < 0)
+		return -EINVAL;
+
+	filep = fget(fd);
+	if (!filep)
+		return -EINVAL;
+
+	if (!shmem_file(filep) && !is_file_hugepages(filep))
+		return -EINVAL;
+
+	flags = memalloc_pin_save();
+	do {
+		for (i = 0; i < nr_pages; i++) {
+			if (shmem_mapping(filep->f_mapping)) {
+				page = shmem_read_mapping_page(filep->f_mapping,
+							       start + i);
+				if (IS_ERR(page)) {
+					ret = PTR_ERR(page);
+					goto err;
+				}
+			} else {
+				page = find_get_page_flags(filep->f_mapping,
+							   start + i,
+							   FGP_ACCESSED);
+				if (!page) {
+					ret = -EINVAL;
+					goto err;
+				}
+			}
+			ret = try_grab_page(page, FOLL_PIN);
+			if (unlikely(ret))
+				goto err;
+
+			pages[i] = page;
+			put_page(pages[i]);
+		}
+
+		ret = check_and_migrate_movable_pages(nr_pages, pages);
+	} while (ret == -EAGAIN);
+
+err:
+	memalloc_pin_restore(flags);
+	fput(filep);
+	if (!ret)
+		return nr_pages;
+
+	while (i > 0 && pages[--i]) {
+		unpin_user_page(pages[i]);
+		pages[i] = NULL;
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pin_user_pages_fd);
+

From patchwork Tue Oct 3 07:44:46 2023
X-Patchwork-Submitter: "Kasireddy, Vivek"
X-Patchwork-Id: 13406982
From: Vivek Kasireddy <vivek.kasireddy@intel.com>
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: David Hildenbrand, Daniel Vetter, Mike Kravetz, Hugh Dickins, Peter Xu,
    Jason Gunthorpe, Gerd Hoffmann, Dongwon Kim, Junxiao Chang
Subject: [PATCH v1 2/3] udmabuf: Pin the pages using pin_user_pages_fd() API
Date: Tue, 3 Oct 2023 00:44:46 -0700
Message-Id: <20231003074447.3245729-3-vivek.kasireddy@intel.com>
In-Reply-To: <20231003074447.3245729-1-vivek.kasireddy@intel.com>
References: <20231003074447.3245729-1-vivek.kasireddy@intel.com>
Using pin_user_pages_fd() ensures that the pages are pinned correctly with
FOLL_PIN. This also ensures that we do not accidentally break features such as
memory hotunplug, as it would not allow pinning pages in the movable zone.

This patch also adds back support for mapping hugetlbfs pages by noting the
subpage offsets within the huge pages and using this information while
populating the scatterlist.
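To illustrate the subpage bookkeeping mentioned above, the sketch below (not
part of this patch; the helper name and signature are invented, and kernel
types such as pgoff_t and PAGE_SHIFT are assumed) computes, for the k-th
PAGE_SIZE chunk of an attachment starting at byte 'offset' in a hugetlbfs
memfd, the index of the huge page that gets pinned and the byte offset inside
that huge page that is later handed to sg_set_page():

/* Illustrative helper, not part of this patch: mirrors the pgoff/subpgoff
 * arithmetic udmabuf_create() performs for hugetlbfs memfds. hpage_shift is
 * huge_page_shift(hstate_file(memfd)), e.g. 21 for 2 MB huge pages.
 */
static void example_hugetlb_chunk(loff_t offset, unsigned long k,
				  unsigned int hpage_shift,
				  pgoff_t *hpage_idx, unsigned long *sg_offset)
{
	unsigned long chunks_per_hpage = 1UL << (hpage_shift - PAGE_SHIFT);
	/* PAGE_SIZE-granular position of 'offset' within its huge page */
	unsigned long first = (offset & ((1UL << hpage_shift) - 1)) >> PAGE_SHIFT;

	/* which huge page the k-th chunk lives in ... */
	*hpage_idx = (offset >> hpage_shift) + (first + k) / chunks_per_hpage;
	/* ... and its byte offset inside that huge page (stored in subpgoff[]) */
	*sg_offset = ((first + k) % chunks_per_hpage) << PAGE_SHIFT;
}

For a 2 MB huge page (hpage_shift = 21) and offset = 0, chunks 0..511 map to
huge page 0 at byte offsets 0, 4096, ..., and chunk 512 wraps to huge page 1
at offset 0, which is exactly the wrap-around the do/while loop in the diff
below performs with subpgoff and maxsubpgs.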
Cc: David Hildenbrand
Cc: Daniel Vetter
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Jason Gunthorpe
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang
Signed-off-by: Vivek Kasireddy
---
 drivers/dma-buf/udmabuf.c | 82 +++++++++++++++++++++++++++++----------
 1 file changed, 61 insertions(+), 21 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 820c993c8659..9ef1eaf4df4b 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -28,6 +29,7 @@ struct udmabuf {
 	struct page **pages;
 	struct sg_table *sg;
 	struct miscdevice *device;
+	pgoff_t *subpgoff;
 };
 
 static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
@@ -90,23 +92,31 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
 {
 	struct udmabuf *ubuf = buf->priv;
 	struct sg_table *sg;
+	struct scatterlist *sgl;
+	pgoff_t offset;
+	unsigned long i = 0;
 	int ret;
 
 	sg = kzalloc(sizeof(*sg), GFP_KERNEL);
 	if (!sg)
 		return ERR_PTR(-ENOMEM);
-	ret = sg_alloc_table_from_pages(sg, ubuf->pages, ubuf->pagecount,
-					0, ubuf->pagecount << PAGE_SHIFT,
-					GFP_KERNEL);
+
+	ret = sg_alloc_table(sg, ubuf->pagecount, GFP_KERNEL);
 	if (ret < 0)
-		goto err;
+		goto err_alloc;
+
+	for_each_sg(sg->sgl, sgl, ubuf->pagecount, i) {
+		offset = ubuf->subpgoff ? ubuf->subpgoff[i] : 0;
+		sg_set_page(sgl, ubuf->pages[i], PAGE_SIZE, offset);
+	}
 	ret = dma_map_sgtable(dev, sg, direction, 0);
 	if (ret < 0)
-		goto err;
+		goto err_map;
 	return sg;
 
-err:
+err_map:
 	sg_free_table(sg);
+err_alloc:
 	kfree(sg);
 	return ERR_PTR(ret);
 }
@@ -142,7 +152,9 @@ static void release_udmabuf(struct dma_buf *buf)
 		put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
 
 	for (pg = 0; pg < ubuf->pagecount; pg++)
-		put_page(ubuf->pages[pg]);
+		unpin_user_page(ubuf->pages[pg]);
+
+	kfree(ubuf->subpgoff);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 }
@@ -202,12 +214,13 @@ static long udmabuf_create(struct miscdevice *device,
 {
 	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 	struct file *memfd = NULL;
-	struct address_space *mapping = NULL;
 	struct udmabuf *ubuf;
 	struct dma_buf *buf;
-	pgoff_t pgoff, pgcnt, pgidx, pgbuf = 0, pglimit;
-	struct page *page;
-	int seals, ret = -EINVAL;
+	pgoff_t pgoff, pgcnt, pgbuf = 0, pglimit, nr_pages;
+	pgoff_t subpgoff, maxsubpgs;
+	struct hstate *hpstate;
+	long ret = -EINVAL;
+	int seals;
 	u32 i, flags;
 
 	ubuf = kzalloc(sizeof(*ubuf), GFP_KERNEL);
@@ -241,8 +254,7 @@ static long udmabuf_create(struct miscdevice *device,
 		memfd = fget(list[i].memfd);
 		if (!memfd)
 			goto err;
-		mapping = memfd->f_mapping;
-		if (!shmem_mapping(mapping))
+		if (!shmem_file(memfd) && !is_file_hugepages(memfd))
 			goto err;
 		seals = memfd_fcntl(memfd, F_GET_SEALS, 0);
 		if (seals == -EINVAL)
@@ -253,14 +265,41 @@ static long udmabuf_create(struct miscdevice *device,
 			goto err;
 		pgoff = list[i].offset >> PAGE_SHIFT;
 		pgcnt = list[i].size >> PAGE_SHIFT;
-		for (pgidx = 0; pgidx < pgcnt; pgidx++) {
-			page = shmem_read_mapping_page(mapping, pgoff + pgidx);
-			if (IS_ERR(page)) {
-				ret = PTR_ERR(page);
+		if (is_file_hugepages(memfd)) {
+			if (!ubuf->subpgoff) {
+				ubuf->subpgoff = kmalloc_array(ubuf->pagecount,
+							       sizeof(*ubuf->subpgoff),
+							       GFP_KERNEL);
+				if (!ubuf->subpgoff) {
+					ret = -ENOMEM;
+					goto err;
+				}
+			}
+			hpstate = hstate_file(memfd);
+			pgoff = list[i].offset >> huge_page_shift(hpstate);
+			subpgoff = (list[i].offset &
+				    ~huge_page_mask(hpstate)) >> PAGE_SHIFT;
+			maxsubpgs = huge_page_size(hpstate) >> PAGE_SHIFT;
+		}
+
+		do {
+			nr_pages = shmem_file(memfd) ? pgcnt : 1;
+			ret = pin_user_pages_fd(list[i].memfd, pgoff,
+						nr_pages, FOLL_LONGTERM,
+						ubuf->pages + pgbuf);
+			if (ret < 0)
 				goto err;
+
+			if (is_file_hugepages(memfd)) {
+				ubuf->subpgoff[pgbuf] = subpgoff << PAGE_SHIFT;
+				if (++subpgoff == maxsubpgs) {
+					subpgoff = 0;
+					pgoff++;
+				}
 			}
-			ubuf->pages[pgbuf++] = page;
-		}
+			pgbuf += nr_pages;
+			pgcnt -= nr_pages;
+		} while (pgcnt > 0);
 		fput(memfd);
 		memfd = NULL;
 	}
@@ -283,10 +322,11 @@ static long udmabuf_create(struct miscdevice *device,
 	return dma_buf_fd(buf, flags);
 
 err:
-	while (pgbuf > 0)
-		put_page(ubuf->pages[--pgbuf]);
+	while (pgbuf > 0 && ubuf->pages[--pgbuf])
+		unpin_user_page(ubuf->pages[pgbuf]);
 	if (memfd)
 		fput(memfd);
+	kfree(ubuf->subpgoff);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 	return ret;

From patchwork Tue Oct 3 07:44:47 2023
X-Patchwork-Submitter: "Kasireddy, Vivek"
X-Patchwork-Id: 13406983
From: Vivek Kasireddy <vivek.kasireddy@intel.com>
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: Shuah Khan, David Hildenbrand, Daniel Vetter, Mike Kravetz, Hugh Dickins,
    Peter Xu, Jason Gunthorpe, Gerd Hoffmann, Dongwon Kim, Junxiao Chang
Subject: [PATCH v1 3/3] selftests/dma-buf/udmabuf: Add tests to verify data after page migration
Date: Tue, 3 Oct 2023 00:44:47 -0700
Message-Id: <20231003074447.3245729-4-vivek.kasireddy@intel.com>
In-Reply-To: <20231003074447.3245729-1-vivek.kasireddy@intel.com>
References: <20231003074447.3245729-1-vivek.kasireddy@intel.com>
Since the memfd pages associated with a udmabuf may be migrated as part of
udmabuf create, we need to verify data coherency after a successful migration.
The new tests added in this patch do just that, using both 4K pages and 2 MB
huge pages for the memfd. Successful completion of the tests means that there
is no disconnect between the memfd pages and the ones associated with the
udmabuf. These tests can also be augmented in the future to exercise newer
udmabuf features (such as handling memfd hole punch).
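For illustration only (not part of the patch), the flow the new tests exercise
boils down to the following single-entry UDMABUF_CREATE sequence; the function
name is invented, error handling is omitted, devfd is assumed to be an open
/dev/udmabuf fd, and a reasonably recent glibc is assumed for memfd_create().
The real tests below use UDMABUF_CREATE_LIST with NUM_ENTRIES chunks and run
the check with both 4K and 2 MB pages:

/* Condensed sketch of the coherency check, error handling omitted. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>

static int example_check_coherency(int devfd)
{
	struct udmabuf_create create = { 0 };
	size_t size = getpagesize() * 4;
	int memfd, buf;
	char *src, *dst;

	memfd = memfd_create("udmabuf-example", MFD_ALLOW_SEALING);
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);
	ftruncate(memfd, size);

	src = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, memfd, 0);
	memset(src, 'a', size);		/* populate the memfd pages first */

	create.memfd = memfd;
	create.offset = 0;
	create.size = size;
	buf = ioctl(devfd, UDMABUF_CREATE, &create);	/* may migrate pages */

	dst = mmap(NULL, size, PROT_READ, MAP_SHARED, buf, 0);
	memset(src, 'b', size);		/* write again through the memfd mapping */

	/* if migration broke the association, dst would still read 'a' */
	return memcmp(src, dst, size) ? -1 : 0;
}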
Cc: Shuah Khan
Cc: David Hildenbrand
Cc: Daniel Vetter
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Jason Gunthorpe
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang
Based-on-patch-by: Mike Kravetz
Signed-off-by: Vivek Kasireddy
---
 .../selftests/drivers/dma-buf/udmabuf.c | 151 +++++++++++++++++-
 1 file changed, 147 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/drivers/dma-buf/udmabuf.c b/tools/testing/selftests/drivers/dma-buf/udmabuf.c
index c812080e304e..d76c813fe652 100644
--- a/tools/testing/selftests/drivers/dma-buf/udmabuf.c
+++ b/tools/testing/selftests/drivers/dma-buf/udmabuf.c
@@ -9,26 +9,132 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 #include
 #include
 
 #define TEST_PREFIX	"drivers/dma-buf/udmabuf"
 #define NUM_PAGES	4
+#define NUM_ENTRIES	4
+#define MEMFD_SIZE	1024	/* in pages */
 
-static int memfd_create(const char *name, unsigned int flags)
+static unsigned int page_size;
+
+static int create_memfd_with_seals(off64_t size, bool hpage)
+{
+	int memfd, ret;
+	unsigned int flags = MFD_ALLOW_SEALING;
+
+	if (hpage)
+		flags |= MFD_HUGETLB;
+
+	memfd = memfd_create("udmabuf-test", flags);
+	if (memfd < 0) {
+		printf("%s: [skip,no-memfd]\n", TEST_PREFIX);
+		exit(77);
+	}
+
+	ret = fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		printf("%s: [skip,fcntl-add-seals]\n", TEST_PREFIX);
+		exit(77);
+	}
+
+	ret = ftruncate(memfd, size);
+	if (ret == -1) {
+		printf("%s: [FAIL,memfd-truncate]\n", TEST_PREFIX);
+		exit(1);
+	}
+
+	return memfd;
+}
+
+static int create_udmabuf_list(int devfd, int memfd, off64_t memfd_size)
+{
+	struct udmabuf_create_list *list;
+	int ubuf_fd, i;
+
+	list = malloc(sizeof(struct udmabuf_create_list) +
+		      sizeof(struct udmabuf_create_item) * NUM_ENTRIES);
+	if (!list) {
+		printf("%s: [FAIL, udmabuf-malloc]\n", TEST_PREFIX);
+		exit(1);
+	}
+
+	for (i = 0; i < NUM_ENTRIES; i++) {
+		list->list[i].memfd = memfd;
+		list->list[i].offset = i * (memfd_size / NUM_ENTRIES);
+		list->list[i].size = getpagesize() * NUM_PAGES;
+	}
+
+	list->count = NUM_ENTRIES;
+	list->flags = UDMABUF_FLAGS_CLOEXEC;
+	ubuf_fd = ioctl(devfd, UDMABUF_CREATE_LIST, list);
+	free(list);
+	if (ubuf_fd < 0) {
+		printf("%s: [FAIL, udmabuf-create]\n", TEST_PREFIX);
+		exit(1);
+	}
+
+	return ubuf_fd;
+}
+
+static void write_to_memfd(void *addr, off64_t size, char chr)
+{
+	int i;
+
+	for (i = 0; i < size / page_size; i++) {
+		*((char *)addr + (i * page_size)) = chr;
+	}
+}
+
+static void *mmap_fd(int fd, off64_t size)
 {
-	return syscall(__NR_memfd_create, name, flags);
+	void *addr;
+
+	addr = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+	if (addr == MAP_FAILED) {
+		printf("%s: ubuf_fd mmap fail\n", TEST_PREFIX);
+		exit(1);
+	}
+
+	return addr;
+}
+
+static int compare_chunks(void *addr1, void *addr2, off64_t memfd_size)
+{
+	off64_t off;
+	int i = 0, j, k = 0, ret = 0;
+	char char1, char2;
+
+	while (i < NUM_ENTRIES) {
+		off = i * (memfd_size / NUM_ENTRIES);
+		for (j = 0; j < NUM_PAGES; j++, k++) {
+			char1 = *((char *)addr1 + off + (j * getpagesize()));
+			char2 = *((char *)addr2 + (k * getpagesize()));
+			if (char1 != char2) {
+				ret = -1;
+				goto err;
+			}
+		}
+		i++;
+	}
+err:
+	munmap(addr1, memfd_size);
+	munmap(addr2, NUM_ENTRIES * NUM_PAGES * getpagesize());
+	return ret;
 }
 
 int main(int argc, char *argv[])
 {
 	struct udmabuf_create create;
 	int devfd, memfd, buf, ret;
-	off_t size;
-	void *mem;
+	off64_t size;
+	void *addr1, *addr2;
 
 	devfd = open("/dev/udmabuf", O_RDWR);
 	if (devfd < 0) {
@@ -90,6 +196,9 @@ int main(int argc, char *argv[])
 	}
 
 	/* should work */
+	page_size = getpagesize();
+	addr1 = mmap_fd(memfd, size);
+	write_to_memfd(addr1, size, 'a');
 	create.memfd = memfd;
 	create.offset = 0;
 	create.size = size;
@@ -98,6 +207,40 @@ int main(int argc, char *argv[])
 		printf("%s: [FAIL,test-4]\n", TEST_PREFIX);
 		exit(1);
 	}
+	munmap(addr1, size);
+	close(buf);
+	close(memfd);
+
+	/* should work (migration of 4k size pages)*/
+	size = MEMFD_SIZE * page_size;
+	memfd = create_memfd_with_seals(size, false);
+	addr1 = mmap_fd(memfd, size);
+	write_to_memfd(addr1, size, 'a');
+	buf = create_udmabuf_list(devfd, memfd, size);
+	addr2 = mmap_fd(buf, NUM_PAGES * NUM_ENTRIES * getpagesize());
+	write_to_memfd(addr1, size, 'b');
+	ret = compare_chunks(addr1, addr2, size);
+	if (ret < 0) {
+		printf("%s: [FAIL,test-5]\n", TEST_PREFIX);
+		exit(1);
+	}
+	close(buf);
+	close(memfd);
+
+	/* should work (migration of 2MB size huge pages)*/
+	page_size = getpagesize() * 512;	/* 2 MB */
+	size = MEMFD_SIZE * page_size;
+	memfd = create_memfd_with_seals(size, true);
+	addr1 = mmap_fd(memfd, size);
+	write_to_memfd(addr1, size, 'a');
+	buf = create_udmabuf_list(devfd, memfd, size);
+	addr2 = mmap_fd(buf, NUM_PAGES * NUM_ENTRIES * getpagesize());
+	write_to_memfd(addr1, size, 'b');
+	ret = compare_chunks(addr1, addr2, size);
+	if (ret < 0) {
+		printf("%s: [FAIL,test-6]\n", TEST_PREFIX);
+		exit(1);
+	}
 
 	fprintf(stderr, "%s: ok\n", TEST_PREFIX);
 	close(buf);