From patchwork Thu Jun 27 17:08:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13714875
From: Andrii Nakryiko
To: linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org, gregkh@linuxfoundation.org, linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org, adobriyan@gmail.com, Andrii Nakryiko
Subject: [PATCH v6 1/6] fs/procfs: extract logic for getting VMA name constituents
Date: Thu, 27 Jun 2024 10:08:53 -0700
Message-ID: <20240627170900.1672542-2-andrii@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240627170900.1672542-1-andrii@kernel.org>
References: <20240627170900.1672542-1-andrii@kernel.org>

Extract generic logic to fetch relevant pieces of data to describe VMA
name. This could be just some string (either special constant or
user-provided), or a string with some formatted wrapping text (e.g.,
"[anon_shmem:]"), or, commonly, file path. seq_file-based logic has
different methods to handle all three cases, but they are currently
mixed in with extracting underlying sources of data.

This patch splits this into data fetching and data formatting, so that
data fetching can be reused later on.

There should be no functional changes.

Signed-off-by: Andrii Nakryiko
---
 fs/proc/task_mmu.c | 125 +++++++++++++++++++++++++--------------------
 1 file changed, 71 insertions(+), 54 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 93fb2c61b154..507b7dc7c4c8 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -239,6 +239,67 @@ static int do_maps_open(struct inode *inode, struct file *file,
 				sizeof(struct proc_maps_private));
 }
 
+static void get_vma_name(struct vm_area_struct *vma,
+			 const struct path **path,
+			 const char **name,
+			 const char **name_fmt)
+{
+	struct anon_vma_name *anon_name = vma->vm_mm ? anon_vma_name(vma) : NULL;
+
+	*name = NULL;
+	*path = NULL;
+	*name_fmt = NULL;
+
+	/*
+	 * Print the dentry name for named mappings, and a
+	 * special [heap] marker for the heap:
+	 */
+	if (vma->vm_file) {
+		/*
+		 * If user named this anon shared memory via
+		 * prctl(PR_SET_VMA ..., use the provided name.
+		 */
+		if (anon_name) {
+			*name_fmt = "[anon_shmem:%s]";
+			*name = anon_name->name;
+		} else {
+			*path = file_user_path(vma->vm_file);
+		}
+		return;
+	}
+
+	if (vma->vm_ops && vma->vm_ops->name) {
+		*name = vma->vm_ops->name(vma);
+		if (*name)
+			return;
+	}
+
+	*name = arch_vma_name(vma);
+	if (*name)
+		return;
+
+	if (!vma->vm_mm) {
+		*name = "[vdso]";
+		return;
+	}
+
+	if (vma_is_initial_heap(vma)) {
+		*name = "[heap]";
+		return;
+	}
+
+	if (vma_is_initial_stack(vma)) {
+		*name = "[stack]";
+		return;
+	}
+
+	if (anon_name) {
+		*name_fmt = "[anon:%s]";
+		*name = anon_name->name;
+		return;
+	}
+}
+
 static void show_vma_header_prefix(struct seq_file *m,
 				   unsigned long start, unsigned long end,
 				   vm_flags_t flags, unsigned long long pgoff,
@@ -262,17 +323,15 @@ static void show_vma_header_prefix(struct seq_file *m,
 static void
 show_map_vma(struct seq_file *m, struct vm_area_struct *vma)
 {
-	struct anon_vma_name *anon_name = NULL;
-	struct mm_struct *mm = vma->vm_mm;
-	struct file *file = vma->vm_file;
+	const struct path *path;
+	const char *name_fmt, *name;
 	vm_flags_t flags = vma->vm_flags;
 	unsigned long ino = 0;
 	unsigned long long pgoff = 0;
 	unsigned long start, end;
 	dev_t dev = 0;
-	const char *name = NULL;
 
-	if (file) {
+	if (vma->vm_file) {
 		const struct inode *inode = file_user_inode(vma->vm_file);
 
 		dev = inode->i_sb->s_dev;
@@ -283,57 +342,15 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma)
 	start = vma->vm_start;
 	end = vma->vm_end;
 	show_vma_header_prefix(m, start, end, flags, pgoff, dev, ino);
-	if (mm)
-		anon_name = anon_vma_name(vma);
 
-	/*
-	 * Print the dentry name for named mappings, and a
-	 * special [heap] marker for the heap:
-	 */
-	if (file) {
+	get_vma_name(vma, &path, &name, &name_fmt);
+	if (path) {
 		seq_pad(m, ' ');
-		/*
-		 * If user named this anon shared memory via
-		 * prctl(PR_SET_VMA ..., use the provided name.
-		 */
-		if (anon_name)
-			seq_printf(m, "[anon_shmem:%s]", anon_name->name);
-		else
-			seq_path(m, file_user_path(file), "\n");
-		goto done;
-	}
-
-	if (vma->vm_ops && vma->vm_ops->name) {
-		name = vma->vm_ops->name(vma);
-		if (name)
-			goto done;
-	}
-
-	name = arch_vma_name(vma);
-	if (!name) {
-		if (!mm) {
-			name = "[vdso]";
-			goto done;
-		}
-
-		if (vma_is_initial_heap(vma)) {
-			name = "[heap]";
-			goto done;
-		}
-
-		if (vma_is_initial_stack(vma)) {
-			name = "[stack]";
-			goto done;
-		}
-
-		if (anon_name) {
-			seq_pad(m, ' ');
-			seq_printf(m, "[anon:%s]", anon_name->name);
-		}
-	}
-
-done:
-	if (name) {
+		seq_path(m, path, "\n");
+	} else if (name_fmt) {
+		seq_pad(m, ' ');
+		seq_printf(m, name_fmt, name);
+	} else if (name) {
 		seq_pad(m, ' ');
 		seq_puts(m, name);
 	}

From patchwork Thu Jun 27 17:08:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13714876
From: Andrii Nakryiko
To: linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org, gregkh@linuxfoundation.org, linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org, adobriyan@gmail.com, Andrii Nakryiko
Subject: [PATCH v6 2/6] fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps
Date: Thu, 27 Jun 2024 10:08:54 -0700
Message-ID: <20240627170900.1672542-3-andrii@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240627170900.1672542-1-andrii@kernel.org>
References: <20240627170900.1672542-1-andrii@kernel.org>

/proc/<pid>/maps file is extremely useful in practice for various tasks
involving figuring out process memory layout, what files are backing any
given memory range, etc. One important class of applications that
absolutely rely on this are profilers/stack symbolizers (perf tool being
one of them).

Patterns of use differ, but they generally would fall into two
categories.

In on-demand pattern, a profiler/symbolizer would normally capture stack
trace containing absolute memory addresses of some functions, and would
then use /proc/<pid>/maps file to find corresponding backing ELF files
(normally, only executable VMAs are of interest), file offsets within
them, and then continue from there to get yet more information (ELF
symbols, DWARF information) to get human-readable symbolic information.
This pattern is used by Meta's fleet-wide profiler, as one example.

In preprocessing pattern, application doesn't know the set of addresses
of interest, so it has to fetch all relevant VMAs (again, probably only
executable ones), store or cache them, then proceed with profiling and
stack trace capture. Once done, it would do symbolization based on
stored VMA information. This can happen at much later point in time.
This pattern is used by perf tool, as an example.

In either case, there are both performance and correctness requirements
involved. This address to VMA information translation has to be done as
efficiently as possible, but also not miss any VMA (especially in the
case of loading/unloading shared libraries). In practice, correctness
can't be guaranteed (due to process dying before VMA data can be
captured, or shared library being unloaded, etc), but any effort to
maximize the chance of finding the VMA is appreciated.

Unfortunately, for all the /proc/<pid>/maps file universality and
usefulness, it doesn't fit the above use cases 100%.

First, its main purpose is to emit all VMAs sequentially, but in
practice captured addresses would fall only into a smaller subset of all
process' VMAs, mainly containing executable text. Yet, library would
need to parse most or all of the contents to find needed VMAs, as there
is no way to skip VMAs that are of no use.
A well-written library can do this linear pass relatively efficiently,
but it's still an overhead that could be avoided if there were a way to
do more targeted querying of the relevant VMA information.

Second, it's a text based interface, which makes its programmatic use
from applications and libraries more cumbersome and inefficient due to
the need to handle text parsing to get necessary pieces of information.
The overhead is actually paid both by kernel, formatting originally
binary VMA data into text, and then by user space application, parsing
it back into binary data for further use.

For the on-demand pattern of usage, described above, another problem
when writing generic stack trace symbolization library is an unfortunate
performance-vs-correctness tradeoff that needs to be made. Library has
to make a decision to either cache parsed contents of /proc/<pid>/maps
(after initial processing) to service future requests (if application
requests to symbolize another set of addresses (for the same process),
captured at some later time, which is typical for periodic/continuous
profiling cases) to avoid higher costs of re-parsing this file. Or it
has to choose to re-open and re-parse the file on every request, so
that the data is always fresh.

In the former case, more memory is used for the cache and there is
a risk of getting stale data if application loads or unloads shared
libraries, or otherwise changed its set of VMAs somehow, e.g., through
additional mmap() calls. In the latter case, it's the performance hit
that comes from re-opening the file and re-parsing its contents all over
again.

This patch aims to solve this problem by providing a new API built on
top of /proc/<pid>/maps. It's meant to address both non-selectiveness
and text nature of /proc/<pid>/maps, by giving user more control of what
sort of VMA(s) needs to be queried, and being binary-based interface
eliminates the overhead of text formatting (on kernel side) and parsing
(on user space side).
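
For contrast, the text-based lookup that this message describes as
costly can be sketched roughly as below. This is only an illustration
under stated assumptions: struct maps_entry, parse_maps_line() and
find_vma_in_text() are hypothetical names, not part of this series, and
the field layout follows the usual "start-end perms offset dev inode
name" line format of the maps file.

```c
#include <stdio.h>
#include <string.h>

/* One parsed maps line. Struct and field names are illustrative only. */
struct maps_entry {
	unsigned long start, end;	/* VMA covers [start, end) */
	char perms[5];			/* e.g. "r-xp" */
	unsigned long long offset;	/* file offset of 'start' */
	unsigned int dev_major, dev_minor;
	unsigned long inode;
	char name[4096];		/* path or special name, "" if none */
};

/* Parse one text line; returns 0 on success, -1 if the line is malformed. */
static int parse_maps_line(const char *line, struct maps_entry *e)
{
	int pos = 0;
	size_t len;

	if (sscanf(line, "%lx-%lx %4s %llx %x:%x %lu%n",
		   &e->start, &e->end, e->perms, &e->offset,
		   &e->dev_major, &e->dev_minor, &e->inode, &pos) != 7)
		return -1;

	/* The optional name follows the inode, separated by blanks. */
	line += pos;
	line += strspn(line, " \t");
	len = strcspn(line, "\n");
	if (len >= sizeof(e->name))
		len = sizeof(e->name) - 1;
	memcpy(e->name, line, len);
	e->name[len] = '\0';
	return 0;
}

/*
 * Find the VMA covering addr by scanning the text top to bottom.
 * Every line before the match is parsed even though it is of no interest.
 */
static int find_vma_in_text(const char *text, unsigned long addr,
			    struct maps_entry *out)
{
	while (text && *text) {
		if (parse_maps_line(text, out) == 0 &&
		    addr >= out->start && addr < out->end)
			return 0;
		text = strchr(text, '\n');
		if (text)
			text++;
	}
	return -1;
}
```

A caller would read the whole maps file into a buffer and pass it to
find_vma_in_text() for each captured address, which is exactly the
repeated formatting and parsing work a binary query interface avoids.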

The API is also designed to be extensible and forward/backward
compatible by including required struct size field, which user has to
provide. We use established copy_struct_from_user() approach to handle
extensibility.

User has a choice to pick either getting VMA that covers provided
address or -ENOENT if none is found (exact, least surprising, case). Or,
with an extra query flag (PROCMAP_QUERY_COVERING_OR_NEXT_VMA), they can
get either VMA that covers the address (if there is one), or the closest
next VMA (i.e., VMA with the smallest vm_start > addr). The latter
allows more efficient use, but, given it could be a surprising behavior,
requires an explicit opt-in.

There is another query flag that is useful for some use cases.
PROCMAP_QUERY_FILE_BACKED_VMA instructs this API to only return
file-backed VMAs. Combining this with PROCMAP_QUERY_COVERING_OR_NEXT_VMA
makes it possible to efficiently iterate only file-backed VMAs of the
process, which is what profilers/symbolizers are normally interested in.

All the above querying flags can be combined with (also optional) set of
desired VMA permissions flags. This allows to, for example, iterate only
an executable subset of VMAs, which is what preprocessing pattern, used
by perf tool, would benefit from, as the assumption is that captured
stack traces would have addresses of executable code. This saves time by
efficiently skipping non-executable VMAs altogether.

All these querying flags (modifiers) are orthogonal and can be combined
in a semantically meaningful and natural way.

Basing this ioctl()-based API on top of /proc/<pid>/maps's FD makes
sense given it's querying the same set of VMA data. It's also beneficial
because permission checks for /proc/<pid>/maps are performed at open
time once, and the actual data read of text contents of /proc/<pid>/maps
is done without further permission checks. We piggyback on this pattern
with ioctl()-based API as well, as that's a desired property.
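
The way the query flags described above combine can be modeled in user
space as follows. This is a sketch, not the kernel code: struct
model_vma and model_query() are hypothetical, the flag values are copied
from the enum added by this patch, and a start-sorted, non-overlapping
array stands in for the process's VMA tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from enum procmap_query_flags in this patch. */
enum {
	PROCMAP_QUERY_VMA_READABLE		= 0x01,
	PROCMAP_QUERY_VMA_WRITABLE		= 0x02,
	PROCMAP_QUERY_VMA_EXECUTABLE		= 0x04,
	PROCMAP_QUERY_VMA_SHARED		= 0x08,
	PROCMAP_QUERY_COVERING_OR_NEXT_VMA	= 0x10,
	PROCMAP_QUERY_FILE_BACKED_VMA		= 0x20,
};
#define PROCMAP_QUERY_VMA_FLAGS 0x0f

/* Simplified stand-in for a VMA; illustrative only. */
struct model_vma {
	uint64_t start, end;	/* covers [start, end) */
	uint32_t perms;		/* PROCMAP_QUERY_VMA_* bits */
	int file_backed;
};

/*
 * Returns the index of the matching VMA in a start-sorted array,
 * or -1 where the ioctl would fail with -ENOENT.
 */
static int model_query(const struct model_vma *vmas, size_t n,
		       uint64_t addr, uint32_t flags)
{
	uint32_t want = flags & PROCMAP_QUERY_VMA_FLAGS;
	int or_next = !!(flags & PROCMAP_QUERY_COVERING_OR_NEXT_VMA);
	size_t i;

	for (i = 0; i < n; i++) {
		const struct model_vma *v = &vmas[i];
		int match;

		/* Entirely below addr: an address lookup would skip it. */
		if (v->end <= addr)
			continue;

		/* File-backing and permission filters must all hold. */
		match = (v->perms & want) == want;
		if ((flags & PROCMAP_QUERY_FILE_BACKED_VMA) && !v->file_backed)
			match = 0;

		/* Accept a covering VMA, or any later one if opted in. */
		if (match && (or_next || v->start <= addr))
			return (int)i;

		/* Without the opt-in only the first candidate counts. */
		if (!or_next)
			return -1;
	}
	return -1;
}
```

Starting at address zero and repeatedly querying from the end of the
previously returned VMA with PROCMAP_QUERY_COVERING_OR_NEXT_VMA set
walks all matching VMAs in address order.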

Checking permissions once, at open time, matters both for performance
and for security and flexibility. Allowing application to open an FD for
/proc/self/maps without any extra capabilities, and then passing it to
some sort of profiling agent through Unix-domain socket, would allow
such profiling agent to not require some of the capabilities that are
otherwise expected when opening /proc/<pid>/maps file for *another*
process. This is a desirable property for some more restricted setups.

This new ioctl-based implementation doesn't interfere with
seq_file-based implementation of /proc/<pid>/maps textual interface, and
so could be used together or independently without paying any price for
that.

Note also, that fetching VMA name (e.g., backing file path, or special
hard-coded or user-provided names) is optional just like build ID. If
user sets vma_name_size to zero, kernel code won't attempt to retrieve
it, saving resources.

Earlier versions of this patch set were adding per-VMA locking, which is
why we have a code structure that is ready for abstracting mmap_lock vs
vm_lock differences (query_vma_setup(), query_vma_teardown(), and
query_vma_find_by_addr()), but given anon_vma_name() is not yet
compatible with per-VMA locking, initial implementation sticks to using
only mmap_lock for now. It will be easy to add back per-VMA locking once
all the pieces are ready later on. That is why we keep the existing code
structure with setup/teardown/query helper functions.
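
A minimal user-space sketch of the API described in this message is
shown below. It assumes the installed <linux/fs.h> comes from a kernel
that already carries PROCMAP_QUERY and struct procmap_query; on older
headers the body compiles out and the function reports failure.
count_exec_file_vmas() is a hypothetical helper name. It only uses
fields defined by this patch and zeroes everything else.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/*
 * Print and count the executable, file-backed VMAs of the calling
 * process. Returns the count, or -1 if the ioctl is unavailable or fails.
 */
static int count_exec_file_vmas(void)
{
#ifdef PROCMAP_QUERY
	struct procmap_query q;
	char name[4096];
	uint64_t addr = 0;
	int fd, err, count = 0;

	fd = open("/proc/self/maps", O_RDONLY);
	if (fd < 0)
		return -1;

	for (;;) {
		memset(&q, 0, sizeof(q));
		q.size = sizeof(q);
		q.query_flags = PROCMAP_QUERY_COVERING_OR_NEXT_VMA |
				PROCMAP_QUERY_FILE_BACKED_VMA |
				PROCMAP_QUERY_VMA_EXECUTABLE;
		q.query_addr = addr;
		q.vma_name_addr = (uint64_t)(uintptr_t)name;
		q.vma_name_size = sizeof(name);

		if (ioctl(fd, PROCMAP_QUERY, &q) < 0)
			break;

		printf("%llx-%llx %s\n",
		       (unsigned long long)q.vma_start,
		       (unsigned long long)q.vma_end,
		       q.vma_name_size ? name : "");
		count++;
		addr = q.vma_end;	/* continue after this VMA */
	}

	err = errno;
	close(fd);
	/* ENOENT ends the iteration; anything else is a real failure. */
	return err == ENOENT ? count : -1;
#else
	return -1;	/* headers predate the API */
#endif
}
```

Each successful call returns the next matching VMA at or above
query_addr, so the loop visits only executable file-backed mappings and
stops when the kernel reports -ENOENT.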

Signed-off-by: Andrii Nakryiko
---
 fs/proc/task_mmu.c      | 235 ++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h | 130 +++++++++++++++++++++-
 2 files changed, 364 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 507b7dc7c4c8..674405b99d0d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -375,11 +375,246 @@ static int pid_maps_open(struct inode *inode, struct file *file)
 	return do_maps_open(inode, file, &proc_pid_maps_op);
 }
 
+#define PROCMAP_QUERY_VMA_FLAGS (				\
+		PROCMAP_QUERY_VMA_READABLE |			\
+		PROCMAP_QUERY_VMA_WRITABLE |			\
+		PROCMAP_QUERY_VMA_EXECUTABLE |			\
+		PROCMAP_QUERY_VMA_SHARED			\
+)
+
+#define PROCMAP_QUERY_VALID_FLAGS_MASK (			\
+		PROCMAP_QUERY_COVERING_OR_NEXT_VMA |		\
+		PROCMAP_QUERY_FILE_BACKED_VMA |			\
+		PROCMAP_QUERY_VMA_FLAGS				\
+)
+
+static int query_vma_setup(struct mm_struct *mm)
+{
+	return mmap_read_lock_killable(mm);
+}
+
+static void query_vma_teardown(struct mm_struct *mm, struct vm_area_struct *vma)
+{
+	mmap_read_unlock(mm);
+}
+
+static struct vm_area_struct *query_vma_find_by_addr(struct mm_struct *mm, unsigned long addr)
+{
+	return find_vma(mm, addr);
+}
+
+static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
+						 unsigned long addr, u32 flags)
+{
+	struct vm_area_struct *vma;
+
+next_vma:
+	vma = query_vma_find_by_addr(mm, addr);
+	if (!vma)
+		goto no_vma;
+
+	/* user requested only file-backed VMA, keep iterating */
+	if ((flags & PROCMAP_QUERY_FILE_BACKED_VMA) && !vma->vm_file)
+		goto skip_vma;
+
+	/* VMA permissions should satisfy query flags */
+	if (flags & PROCMAP_QUERY_VMA_FLAGS) {
+		u32 perm = 0;
+
+		if (flags & PROCMAP_QUERY_VMA_READABLE)
+			perm |= VM_READ;
+		if (flags & PROCMAP_QUERY_VMA_WRITABLE)
+			perm |= VM_WRITE;
+		if (flags & PROCMAP_QUERY_VMA_EXECUTABLE)
+			perm |= VM_EXEC;
+		if (flags & PROCMAP_QUERY_VMA_SHARED)
+			perm |= VM_MAYSHARE;
+
+		if ((vma->vm_flags & perm) != perm)
+			goto skip_vma;
+	}
+
+	/* found covering VMA or user is OK with the matching next VMA */
+	if ((flags & PROCMAP_QUERY_COVERING_OR_NEXT_VMA) || vma->vm_start <= addr)
+		return vma;
+
+skip_vma:
+	/*
+	 * If the user needs closest matching VMA, keep iterating.
+	 */
+	addr = vma->vm_end;
+	if (flags & PROCMAP_QUERY_COVERING_OR_NEXT_VMA)
+		goto next_vma;
+no_vma:
+	return ERR_PTR(-ENOENT);
+}
+
+static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
+{
+	struct procmap_query karg;
+	struct vm_area_struct *vma;
+	struct mm_struct *mm;
+	const char *name = NULL;
+	char *name_buf = NULL;
+	__u64 usize;
+	int err;
+
+	if (copy_from_user(&usize, (void __user *)uarg, sizeof(usize)))
+		return -EFAULT;
+	/* argument struct can never be that large, reject abuse */
+	if (usize > PAGE_SIZE)
+		return -E2BIG;
+	/* argument struct should have at least query_flags and query_addr fields */
+	if (usize < offsetofend(struct procmap_query, query_addr))
+		return -EINVAL;
+	err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
+	if (err)
+		return err;
+
+	/* reject unknown flags */
+	if (karg.query_flags & ~PROCMAP_QUERY_VALID_FLAGS_MASK)
+		return -EINVAL;
+	/* either both buffer address and size are set, or both should be zero */
+	if (!!karg.vma_name_size != !!karg.vma_name_addr)
+		return -EINVAL;
+
+	mm = priv->mm;
+	if (!mm || !mmget_not_zero(mm))
+		return -ESRCH;
+
+	err = query_vma_setup(mm);
+	if (err) {
+		mmput(mm);
+		return err;
+	}
+
+	vma = query_matching_vma(mm, karg.query_addr, karg.query_flags);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		vma = NULL;
+		goto out;
+	}
+
+	karg.vma_start = vma->vm_start;
+	karg.vma_end = vma->vm_end;
+
+	karg.vma_flags = 0;
+	if (vma->vm_flags & VM_READ)
+		karg.vma_flags |= PROCMAP_QUERY_VMA_READABLE;
+	if (vma->vm_flags & VM_WRITE)
+		karg.vma_flags |= PROCMAP_QUERY_VMA_WRITABLE;
+	if (vma->vm_flags & VM_EXEC)
+		karg.vma_flags |= PROCMAP_QUERY_VMA_EXECUTABLE;
+	if (vma->vm_flags & VM_MAYSHARE)
+		karg.vma_flags |= PROCMAP_QUERY_VMA_SHARED;
+
+	karg.vma_page_size = vma_kernel_pagesize(vma);
+
+	if (vma->vm_file) {
+		const struct inode *inode = file_user_inode(vma->vm_file);
+
+		karg.vma_offset = ((__u64)vma->vm_pgoff) << PAGE_SHIFT;
+		karg.dev_major = MAJOR(inode->i_sb->s_dev);
+		karg.dev_minor = MINOR(inode->i_sb->s_dev);
+		karg.inode = inode->i_ino;
+	} else {
+		karg.vma_offset = 0;
+		karg.dev_major = 0;
+		karg.dev_minor = 0;
+		karg.inode = 0;
+	}
+
+	if (karg.build_id_size) {
+		__u32 build_id_sz;
+
+		err = build_id_parse(vma, build_id_buf, &build_id_sz);
+		if (err) {
+			karg.build_id_size = 0;
+		} else {
+			if (karg.build_id_size < build_id_sz) {
+				err = -ENAMETOOLONG;
+				goto out;
+			}
+			karg.build_id_size = build_id_sz;
+		}
+	}
+
+	if (karg.vma_name_size) {
+		size_t name_buf_sz = min_t(size_t, PATH_MAX, karg.vma_name_size);
+		const struct path *path;
+		const char *name_fmt;
+		size_t name_sz = 0;
+
+		get_vma_name(vma, &path, &name, &name_fmt);
+
+		if (path || name_fmt || name) {
+			name_buf = kmalloc(name_buf_sz, GFP_KERNEL);
+			if (!name_buf) {
+				err = -ENOMEM;
+				goto out;
+			}
+		}
+		if (path) {
+			name = d_path(path, name_buf, name_buf_sz);
+			if (IS_ERR(name)) {
+				err = PTR_ERR(name);
+				goto out;
+			}
+			name_sz = name_buf + name_buf_sz - name;
+		} else if (name || name_fmt) {
+			name_sz = 1 + snprintf(name_buf, name_buf_sz, name_fmt ?: "%s", name);
+			name = name_buf;
+		}
+		if (name_sz > name_buf_sz) {
+			err = -ENAMETOOLONG;
+			goto out;
+		}
+		karg.vma_name_size = name_sz;
+	}
+
+	/* unlock vma or mmap_lock, and put mm_struct before copying data to user */
+	query_vma_teardown(mm, vma);
+	mmput(mm);
+
+	if (karg.vma_name_size && copy_to_user((void __user *)karg.vma_name_addr,
+					       name, karg.vma_name_size)) {
+		kfree(name_buf);
+		return -EFAULT;
+	}
+	kfree(name_buf);
+
+	if (copy_to_user(uarg, &karg, min_t(size_t, sizeof(karg), usize)))
+		return -EFAULT;
+
+	return 0;
+
+out:
+	query_vma_teardown(mm, vma);
+	mmput(mm);
+	kfree(name_buf);
+	return err;
+}
+
+static long procfs_procmap_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct seq_file *seq = file->private_data;
+	struct proc_maps_private *priv = seq->private;
+
+	switch (cmd) {
+	case PROCMAP_QUERY:
+		return do_procmap_query(priv, (void __user *)arg);
+	default:
+		return -ENOIOCTLCMD;
+	}
+}
+
 const struct file_operations proc_pid_maps_operations = {
 	.open		= pid_maps_open,
 	.read		= seq_read,
 	.llseek		= seq_lseek,
 	.release	= proc_map_release,
+	.unlocked_ioctl = procfs_procmap_ioctl,
+	.compat_ioctl	= procfs_procmap_ioctl,
 };
 
 /*
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 45e4e64fd664..5d440f9b5d92 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -333,8 +333,10 @@ typedef int __bitwise __kernel_rwf_t;
 #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
 			 RWF_APPEND | RWF_NOAPPEND)
 
+#define PROCFS_IOCTL_MAGIC 'f'
+
 /* Pagemap ioctl */
-#define PAGEMAP_SCAN	_IOWR('f', 16, struct pm_scan_arg)
+#define PAGEMAP_SCAN	_IOWR(PROCFS_IOCTL_MAGIC, 16, struct pm_scan_arg)
 
 /* Bitmasks provided in pm_scan_args masks and reported in page_region.categories. */
 #define PAGE_IS_WPALLOWED	(1 << 0)
@@ -393,4 +395,130 @@ struct pm_scan_arg {
 	__u64 return_mask;
 };
 
+/* /proc/<pid>/maps ioctl */
+#define PROCMAP_QUERY	_IOWR(PROCFS_IOCTL_MAGIC, 17, struct procmap_query)
+
+enum procmap_query_flags {
+	/*
+	 * VMA permission flags.
+	 *
+	 * Can be used as part of procmap_query.query_flags field to look up
+	 * only VMAs satisfying specified subset of permissions. E.g., specifying
+	 * PROCMAP_QUERY_VMA_READABLE only will return both readable and read/write VMAs,
+	 * while having PROCMAP_QUERY_VMA_READABLE | PROCMAP_QUERY_VMA_WRITABLE will only
+	 * return read/write VMAs, though both executable/non-executable and
+	 * private/shared will be ignored.
+	 *
+	 * PROCMAP_QUERY_VMA_* flags are also returned in procmap_query.vma_flags
+	 * field to specify actual VMA permissions.
+ */ + PROCMAP_QUERY_VMA_READABLE = 0x01, + PROCMAP_QUERY_VMA_WRITABLE = 0x02, + PROCMAP_QUERY_VMA_EXECUTABLE = 0x04, + PROCMAP_QUERY_VMA_SHARED = 0x08, + /* + * Query modifier flags. + * + * By default VMA that covers provided address is returned, or -ENOENT + * is returned. With PROCMAP_QUERY_COVERING_OR_NEXT_VMA flag set, closest + * VMA with vma_start > addr will be returned if no covering VMA is + * found. + * + * PROCMAP_QUERY_FILE_BACKED_VMA instructs query to consider only VMAs that + * have file backing. Can be combined with PROCMAP_QUERY_COVERING_OR_NEXT_VMA + * to iterate all VMAs with file backing. + */ + PROCMAP_QUERY_COVERING_OR_NEXT_VMA = 0x10, + PROCMAP_QUERY_FILE_BACKED_VMA = 0x20, +}; + +/* + * Input/output argument structured passed into ioctl() call. It can be used + * to query a set of VMAs (Virtual Memory Areas) of a process. + * + * Each field can be one of three kinds, marked in a short comment to the + * right of the field: + * - "in", input argument, user has to provide this value, kernel doesn't modify it; + * - "out", output argument, kernel sets this field with VMA data; + * - "in/out", input and output argument; user provides initial value (used + * to specify maximum allowable buffer size), and kernel sets it to actual + * amount of data written (or zero, if there is no data). + * + * If matching VMA is found (according to criterias specified by + * query_addr/query_flags, all the out fields are filled out, and ioctl() + * returns 0. If there is no matching VMA, -ENOENT will be returned. + * In case of any other error, negative error code other than -ENOENT is + * returned. + * + * Most of the data is similar to the one returned as text in /proc//maps + * file, but procmap_query provides more querying flexibility. There are no + * consistency guarantees between subsequent ioctl() calls, but data returned + * for matched VMA is self-consistent. 
+ */ +struct procmap_query { + /* Query struct size, for backwards/forward compatibility */ + __u64 size; + /* + * Query flags, a combination of enum procmap_query_flags values. + * Defines query filtering and behavior, see enum procmap_query_flags. + * + * Input argument, provided by user. Kernel doesn't modify it. + */ + __u64 query_flags; /* in */ + /* + * Query address. By default, VMA that covers this address will + * be looked up. PROCMAP_QUERY_* flags above modify this default + * behavior further. + * + * Input argument, provided by user. Kernel doesn't modify it. + */ + __u64 query_addr; /* in */ + /* VMA starting (inclusive) and ending (exclusive) address, if VMA is found. */ + __u64 vma_start; /* out */ + __u64 vma_end; /* out */ + /* VMA permissions flags. A combination of PROCMAP_QUERY_VMA_* flags. */ + __u64 vma_flags; /* out */ + /* VMA backing page size granularity. */ + __u64 vma_page_size; /* out */ + /* + * VMA file offset. If VMA has file backing, this specifies offset + * within the file that VMA's start address corresponds to. + * Is set to zero if VMA has no backing file. + */ + __u64 vma_offset; /* out */ + /* Backing file's inode number, or zero, if VMA has no backing file. */ + __u64 inode; /* out */ + /* Backing file's device major/minor number, or zero, if VMA has no backing file. */ + __u32 dev_major; /* out */ + __u32 dev_minor; /* out */ + /* + * If set to non-zero value, signals the request to return VMA name + * (i.e., VMA's backing file's absolute path, with " (deleted)" suffix + * appended, if file was unlinked from FS) for matched VMA. VMA name + * can also be some special name (e.g., "[heap]", "[stack]") or could + * be even user-supplied with prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME). + * + * Kernel will set this field to zero, if VMA has no associated name. + * Otherwise kernel will return actual amount of bytes filled in + * user-supplied buffer (see vma_name_addr field below), including the + * terminating zero. 
+ * + * If VMA name is longer than user-supplied maximum buffer size, + * -E2BIG error is returned. + * + * If this field is set to non-zero value, vma_name_addr should point + * to valid user space memory buffer of at least vma_name_size bytes. + * If set to zero, vma_name_addr should be set to zero as well + */ + __u32 vma_name_size; /* in/out */ + /* + * User-supplied address of a buffer of at least vma_name_size bytes + * for kernel to fill with matched VMA's name (see vma_name_size field + * description above for details). + * + * Should be set to zero if VMA name should not be returned. + */ + __u64 vma_name_addr; /* in */ +}; + #endif /* _UAPI_LINUX_FS_H */ From patchwork Thu Jun 27 17:08:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13714877 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D782C3064D for ; Thu, 27 Jun 2024 17:09:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DC0FF6B00A2; Thu, 27 Jun 2024 13:09:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CCD446B00A3; Thu, 27 Jun 2024 13:09:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A846E6B00A4; Thu, 27 Jun 2024 13:09:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 84A636B00A2 for ; Thu, 27 Jun 2024 13:09:16 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 3EF48A3E54 for ; Thu, 27 Jun 2024 17:09:16 +0000 (UTC) X-FDA: 82277304312.14.55DB18B Received: from dfw.source.kernel.org
(dfw.source.kernel.org [139.178.84.217]) by imf15.hostedemail.com (Postfix) with ESMTP id 89552A0013 for ; Thu, 27 Jun 2024 17:09:14 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=pifzznZr; spf=pass (imf15.hostedemail.com: domain of andrii@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=andrii@kernel.org; dmarc=pass (policy=none) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1719508131; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1mSz6NYBEtVlykLHYOpq7Y8Z0zh+MMwyc2nZ6QivHEo=; b=TCGAaJtCBi5LUEiQZnt31m8YWsAy6HCE/VVmSrKTaMeYgGl2QdSl3as/v/BOdkoFJT0ZcJ 8v0AaO/L6z3wTIhZdyaWVi+bGDgmbUT39gfPcSoi4sX9XUP1ICDWDbTORf0oLPVxc3a+Xx T+Bpo64as4SOOic3SyEJa67Mtdv2VFo= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=pifzznZr; spf=pass (imf15.hostedemail.com: domain of andrii@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=andrii@kernel.org; dmarc=pass (policy=none) header.from=kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1719508131; a=rsa-sha256; cv=none; b=uM3ZeUJF5cz+GbWLPrRNPrE3rlL69E6ZJ9H/YqrlbHKmzQW8iOnek2NnB91xtgvoVxKv2v 4L9DV/9ErU0kB2bpoKETSOoOa5GshaP5ZVmb2oZot3jJCVbS2Xq4wAXZ9lbo1yncpNAckE 1FCELnGWTuJrFRJ6uwhwfWExRsntg9w= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id C667F61F6B; Thu, 27 Jun 2024 17:09:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6313FC2BD10; Thu, 27 Jun 2024 17:09:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1719508153; 
bh=TfcaDQQTdRLothd0FRVJK6mCUHkge2r3AFoLq6I1TMA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pifzznZrR7d2mHil0ZsDNUjdyAtDUfNT8iHlRgb5WpOzvxiiIN5gEWe/sLAsNeS9Q 3JlLJEEUQFeOiKu2UeHqkSwO228T5OvWAIdya2IbVVt6DqAAbr4cCrsfmC+5vUrMnt /+pWc2mJx4YLQ1Y4jjiG7HuspbUxhOLX4yejWUtYCxWErZIjSTtRB5w/3y0bEcVoSh ELHXf2qD+MSJSXRuu0dYiKJGjG9HgMXC991GBaClgKL8a4xT1yto/86DFAe0hSwtqt 3ldeJiQtjfRR+FOI4/3DebO392IATQ+yXMKEtaw7eJ+6I6f+QPT+HCEYuli2kLEwcS fe5N7keYk1IUQ== From: Andrii Nakryiko To: linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org, gregkh@linuxfoundation.org, linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org, adobriyan@gmail.com, Andrii Nakryiko Subject: [PATCH v6 3/6] fs/procfs: add build ID fetching to PROCMAP_QUERY API Date: Thu, 27 Jun 2024 10:08:55 -0700 Message-ID: <20240627170900.1672542-4-andrii@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240627170900.1672542-1-andrii@kernel.org> References: <20240627170900.1672542-1-andrii@kernel.org> MIME-Version: 1.0 X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 89552A0013 X-Stat-Signature: 1d7bgtr5fsgwza39goznajb566mui1wp X-Rspam-User: X-HE-Tag: 1719508154-539433 X-HE-Meta: 
U2FsdGVkX18YFVvJ/5RcQmtL4UicYBH6wStRq3h25wR1+EhZg2b5EuznA6ZseqpV8Irz5O5bBo5S8wbxsK0V9RyPNkgHq7FK8r04tts7yArtS68muUNmNQqexNArwYHUCEBIRsbws0EqjO/7fQz3bM/rMOVPoNDOTDiCVds0wIJftLJxzroQ94R1l7d47HADGYzx0UJxWK3wspbFcjsQoh6W9F4GwwWV3QfhVoB5QPZFRNAFdR8SJF8wfrze9Ukq1S3VAq3ul1tsSto/sNXir8vrqFI5zeV1f5SBDsNL9biM3T+i4fSOIobSE1z0vbGgEF15/NOMVGrY7J5FjAiifEuS4vtYlwQvqjz1HHB61hsMueMl24ix66mZRcc4iydD1zWGOxt9kyAyzMufQwdgVhbT0KCceHoP2N/7QJDpM2sCwb5+TFNLsE4G6t0cDjcsaQStfxpe/z4sNqKB+AMK3axmbWenHRwwZQXaydbcv1ySlOR549xg3MI7RV+YCKFtFrR2qytPLYkWNpGFOCdTmswXCy6J4PRff45COAQnwzSWMnNDFLjzZltYuKkiPENmkWWQqu0GsIb18oBTh5d8U+0Ts9qfJVGV/D7C4sqPUEOwBStOA0NIxGQJsEvATfczjA2X8S76qoi/Q0aa+arlD+SWRYiip+BmKrZcQQXzX0Fjv/u2NKpez5UUPHqou08zx+OsweQ0q5ZKKjBg4EA0RAESEO3M2FiTlgVRyePts+2Sq/uXCXQq4t3pG9A0v3QMwuQ+o0jPm0nvhgYXudV3iqG6pnjvNunxfigM78zuH7uq8o0KbJ8N7Jdq14M+KMKzIJ2gy/2PNlWHQFgGnGRYOT458dipYWHMvZ3JKN1S6koGBVP/D9ylGrJtjk54W60byFRt7XPsAWOTtaI2cxzASwM0zYJ2eudSFCeEzvtqoHSPfe7zgk9LIlbGh+15RceSGZks34LO+p6rjMoIQuo 2yNbvSDz /TpvQBFAiB8vBPs07PoAzB3ETftu1vzqzwTbJBx2ABPi2PuJms/vIoj5SqIPvNQI3LnqYEqpU3PHiiNsIkfr5OcaRD9fu9VKBbW1Xqw8K1FguSkgTrwsAKw+OeouVkN9lbz8gRS3uqlsqmfusjxb+vs1TfGUp0ue4iok+olBWH8RIT8eW6UlXTxkdXT5CBtTYnEKWJhOZWvXgO3psQtdT+QZItx1WJ1LMgzpbwo+JZs/hOwQr9yzJ0Hf3fL9BXjjj8biKSQSq33qsTDghn51Vg+n1W9MUO8MJNShHChMpHKTTCN2iobYJ4FLyDxMtR9MoVIaWnOs5PeQmRLQ42VNOaZd1StaiS3r1m7C4NoduQa2rKZc= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: The need to get ELF build ID reliably is an important aspect when dealing with profiling and stack trace symbolization, and /proc//maps textual representation doesn't help with this. 
To get backing file's ELF build ID, application has to first resolve VMA, then use its start/end address range to follow a special /proc//map_files/- symlink to open the ELF file (this is necessary because backing file might have been removed from the disk or was already replaced with another binary in the same file path).

Such an approach, beyond just adding complexity of having to do a bunch of extra work, has extra security implications. Because application opens underlying ELF file and needs read access to its entire contents (as far as kernel is concerned), kernel puts additional capable() checks on following /proc//map_files/- symlink. And that makes sense in general.

But in the case of build ID, profiler/symbolizer doesn't need the contents of ELF file, per se. It's only build ID that is of interest, and ELF build ID itself doesn't provide any sensitive information.

So this patch adds a way to request backing file's ELF build ID along with the rest of VMA information in the same API. User has control over whether this piece of information is requested or not by setting build_id_size field either to zero or to the non-zero maximum size of the buffer provided through build_id_addr field (which encodes user pointer as __u64 field). This is a completely optional piece of information, and so has no performance implications for use cases that don't care about build ID, while improving performance and simplifying the setup for those applications that do need it.

Kernel already implements build ID fetching, which is used from BPF subsystem. We are reusing this code here, but plan follow-up changes to make it work better under more relaxed assumption (compared to what existing code assumes) of being called from user process context, in which page faults are allowed. BPF-specific implementation currently bails out if necessary part of ELF file is not paged in, all due to extra BPF-specific restrictions (like the need to fetch build ID in restrictive contexts such as NMI handler).
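
For illustration only, a user-space caller could request the build ID roughly as sketched below. This is a minimal sketch under stated assumptions, not part of the patch: the helper names vma_build_id() and build_id_hex() are hypothetical, and the fallback struct definition merely mirrors the UAPI additions in this series so that the sketch also builds against pre-6.11 headers.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/*
 * Fallback for older UAPI headers (assumption: layout copied from the
 * struct procmap_query definition added by this series).
 */
#ifndef PROCMAP_QUERY
struct procmap_query {
	__u64 size;
	__u64 query_flags;
	__u64 query_addr;
	__u64 vma_start;
	__u64 vma_end;
	__u64 vma_flags;
	__u64 vma_page_size;
	__u64 vma_offset;
	__u64 inode;
	__u32 dev_major;
	__u32 dev_minor;
	__u32 vma_name_size;
	__u32 build_id_size;
	__u64 vma_name_addr;
	__u64 build_id_addr;
};
#define PROCMAP_QUERY _IOWR('f', 17, struct procmap_query)
#endif

/*
 * Hypothetical helper: look up the VMA covering addr through maps_path
 * (e.g. "/proc/self/maps") and ask the kernel to copy its ELF build ID
 * into buf. buf_sz must be non-zero, since size and address have to be
 * either both set or both zero.
 *
 * Returns the build ID length in bytes (0 if the VMA has none), or a
 * negative errno value on failure.
 */
int vma_build_id(const char *maps_path, uint64_t addr,
		 unsigned char *buf, uint32_t buf_sz)
{
	struct procmap_query q;
	int fd, err = 0;

	fd = open(maps_path, O_RDONLY);
	if (fd < 0)
		return -errno;

	memset(&q, 0, sizeof(q));
	q.size = sizeof(q);		/* struct size, for compatibility */
	q.query_addr = addr;		/* default: VMA covering this address */
	q.build_id_size = buf_sz;	/* in: capacity, out: actual length */
	q.build_id_addr = (uint64_t)(uintptr_t)buf;

	if (ioctl(fd, PROCMAP_QUERY, &q) < 0)
		err = -errno;
	close(fd);

	return err ? err : (int)q.build_id_size;
}

/*
 * Build ID is binary, so format it as hex for printing.
 * out must have room for 2 * len + 1 bytes.
 */
void build_id_hex(const unsigned char *id, int len, char *out)
{
	for (int i = 0; i < len; i++)
		sprintf(out + 2 * i, "%02x", id[i]);
	out[2 * len] = '\0';
}
```

A caller would pass "/proc/self/maps" (or the maps file of another process it is allowed to open) together with an address inside the mapping of interest, and treat a return value of zero as "no build ID available". On kernels without PROCMAP_QUERY the ioctl is expected to fail, typically with ENOTTY. Note that the UAPI comment documents -E2BIG for a too-small buffer while do_procmap_query() in this patch returns -ENAMETOOLONG, so a cautious caller may want to treat both as "buffer too small".
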
Signed-off-by: Andrii Nakryiko --- fs/proc/task_mmu.c | 25 ++++++++++++++++++++++++- include/uapi/linux/fs.h | 28 ++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+), 1 deletion(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 674405b99d0d..32bef3eeab7f 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -445,6 +446,7 @@ static struct vm_area_struct *query_matching_vma(struct mm_struct *mm, addr = vma->vm_end; if (flags & PROCMAP_QUERY_COVERING_OR_NEXT_VMA) goto next_vma; + no_vma: return ERR_PTR(-ENOENT); } @@ -455,7 +457,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg) struct vm_area_struct *vma; struct mm_struct *mm; const char *name = NULL; - char *name_buf = NULL; + char build_id_buf[BUILD_ID_SIZE_MAX], *name_buf = NULL; __u64 usize; int err; @@ -477,6 +479,8 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg) /* either both buffer address and size are set, or both should be zero */ if (!!karg.vma_name_size != !!karg.vma_name_addr) return -EINVAL; + if (!!karg.build_id_size != !!karg.build_id_addr) + return -EINVAL; mm = priv->mm; if (!mm || !mmget_not_zero(mm)) @@ -539,6 +543,21 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg) } } + if (karg.build_id_size) { + __u32 build_id_sz; + + err = build_id_parse(vma, build_id_buf, &build_id_sz); + if (err) { + karg.build_id_size = 0; + } else { + if (karg.build_id_size < build_id_sz) { + err = -ENAMETOOLONG; + goto out; + } + karg.build_id_size = build_id_sz; + } + } + if (karg.vma_name_size) { size_t name_buf_sz = min_t(size_t, PATH_MAX, karg.vma_name_size); const struct path *path; @@ -583,6 +602,10 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg) } kfree(name_buf); + if (karg.build_id_size && copy_to_user((void __user *)karg.build_id_addr, + build_id_buf, karg.build_id_size)) + 
return -EFAULT; + if (copy_to_user(uarg, &karg, min_t(size_t, sizeof(karg), usize))) return -EFAULT; diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h index 5d440f9b5d92..2a4a5f50c98e 100644 --- a/include/uapi/linux/fs.h +++ b/include/uapi/linux/fs.h @@ -511,6 +511,26 @@ struct procmap_query { * If set to zero, vma_name_addr should be set to zero as well */ __u32 vma_name_size; /* in/out */ + /* + * If set to non-zero value, signals the request to extract and return + * VMA's backing file's build ID, if the backing file is an ELF file + * and it contains embedded build ID. + * + * Kernel will set this field to zero, if VMA has no backing file, + * backing file is not an ELF file, or ELF file has no build ID + * embedded. + * + * Build ID is a binary value (not a string). Kernel will set + * build_id_size field to exact number of bytes used for build ID. + * If build ID is requested and present, but needs more bytes than + * user-supplied maximum buffer size (see build_id_addr field below), + * -E2BIG error will be returned. + * + * If this field is set to non-zero value, build_id_addr should point + * to valid user space memory buffer of at least build_id_size bytes. + * If set to zero, build_id_addr should be set to zero as well + */ + __u32 build_id_size; /* in/out */ /* * User-supplied address of a buffer of at least vma_name_size bytes * for kernel to fill with matched VMA's name (see vma_name_size field @@ -519,6 +539,14 @@ struct procmap_query { * Should be set to zero if VMA name should not be returned. */ __u64 vma_name_addr; /* in */ + /* + * User-supplied address of a buffer of at least build_id_size bytes + * for kernel to fill with matched VMA's ELF build ID, if available + * (see build_id_size field description above for details). + * + * Should be set to zero if build ID should not be returned. 
+ */ + __u64 build_id_addr; /* in */ }; #endif /* _UAPI_LINUX_FS_H */ From patchwork Thu Jun 27 17:08:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13714878 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D698C2BD09 for ; Thu, 27 Jun 2024 17:09:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D6B146B00A4; Thu, 27 Jun 2024 13:09:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D1B576B00A5; Thu, 27 Jun 2024 13:09:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B94D66B00A6; Thu, 27 Jun 2024 13:09:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 9BEB06B00A4 for ; Thu, 27 Jun 2024 13:09:22 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 57CA4160DC8 for ; Thu, 27 Jun 2024 17:09:22 +0000 (UTC) X-FDA: 82277304564.11.683B4E9 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf29.hostedemail.com (Postfix) with ESMTP id 59FEE120028 for ; Thu, 27 Jun 2024 17:09:19 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=iWxwaAm6; spf=pass (imf29.hostedemail.com: domain of andrii@kernel.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=andrii@kernel.org; dmarc=pass (policy=none) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1719508147; h=from:from:sender:reply-to:subject:subject:date:date: 
message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Nqqeguxhnv70YJ/Hlnzz44ng4fnExjdR7VvZFZgjwVY=; b=zlLL/qjk9LD8gV2yj0GVb04JGiYHoeG9CGLh5+qdm/LQ2gVw4pXCSQpY2KbGDOtpGeK2oR DUAKfrBERncVBK4WfFx4bbYyrUeuFRUohOJ8LSGyXM2dkCHS+87EqS9SGFVvUZYobddhmK 0N+7MWMXyt2ZnER4++pCXFXdHRLRQUI= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=iWxwaAm6; spf=pass (imf29.hostedemail.com: domain of andrii@kernel.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=andrii@kernel.org; dmarc=pass (policy=none) header.from=kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1719508147; a=rsa-sha256; cv=none; b=JBHjmFlEgnP/FaYvzPZnQvxiqahYzuYiUsPu6SGAYROfD7fF3+KdeURqzpbR5Hh2QDGWM+ 7Cm6PlwUSD7VJXnJ16y8KNW+hCxWVDrAKYu085/XtgjSssHSGH/mOjXctK7hgXhgHMk17r vaqZSGMYcD996I3m+B+g6056fvoTKxc= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by sin.source.kernel.org (Postfix) with ESMTP id 8D1D0CE3377; Thu, 27 Jun 2024 17:09:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A915BC2BBFC; Thu, 27 Jun 2024 17:09:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1719508156; bh=OCMwDvtwnM9ll0Ws/wHw8cfOMgOXkr3TtaYN7UcD3wM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iWxwaAm6HiI2s6p8C7OPJUQawedJzfpX3Qtv6R/A3MQznMfbZJjniPkVsJ6xd0kf+ FanSaA7+wP2JrEJsSSupGUVVfHV8nS8GQOg/o37+oky22AN3kGvCRezK8rbrTKyyxc TDAtr9dlBHBQzf2+BDakPYQ4IZ+SiUzUpDX8adHYcU0SjHUMfkqKcSN/lNd8LAEBxa JOT1dqf1/FQNuopBQcbInp1Ctny9qlB9k5ehdwSstT1gn218L89kg7JeHWfyJbEFkb 6lPe01BN5tapSMV5+FQQvExwXYOvAx+QC1vzjwpVqP4vl1LHhlDvcKcj6LkFO+9gWf CAcElJq4S2Mxw== From: Andrii Nakryiko To: linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, 
bpf@vger.kernel.org, gregkh@linuxfoundation.org, linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org, adobriyan@gmail.com, Andrii Nakryiko Subject: [PATCH v6 4/6] docs/procfs: call out ioctl()-based PROCMAP_QUERY command existence Date: Thu, 27 Jun 2024 10:08:56 -0700 Message-ID: <20240627170900.1672542-5-andrii@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240627170900.1672542-1-andrii@kernel.org> References: <20240627170900.1672542-1-andrii@kernel.org> MIME-Version: 1.0 X-Stat-Signature: 3wpqb5hegrxbjof6pkghpeju3ik7yop5 X-Rspam-User: X-Rspamd-Queue-Id: 59FEE120028 X-Rspamd-Server: rspam02 X-HE-Tag: 1719508159-946776 X-HE-Meta: U2FsdGVkX19SacEJNaSNvtRUtXaqTpXdv3Um3C/ZJwUSmPCb5ZoR2q98Q1bbYCt16e9XW9k4W6Uyh8AiyTKeJkbXivRb1m4c16ehQdDE01XQRraSCATNCpilKkstfGNOJcgVRlDdbFIWbZGvxwpJ/sAYONX44jjt46T24+mqMZOUoDngfQUhlrtC8HAQ196IJCBEWCKP6Iq4+duwON2SxsWktib8Vo0djggXdWvG7NytYgEnyBCDoT/qK/ls+R/j3pIXCxy+mgRTPsvOmMZow39sBArGEILaS5a2uv9rJ97SstCIMiz2WpSL+jqxp6NfSUL5OvfIYWMx68WlQCG709Nj5rlJ9LUFGV2pP8SPI1i8tOoxy9aZbJBDKnbYwrIQK0VD8NVAqGc3SzSxQ9ytYeGea/Bs0yE/q2A3XR8ZAPzMswFTzMuwZ9VU/pMJhtKegZ+lK4kPT6nB/6nLya7aem8OQQd/FXz7RzkeD6RFpcwmhQ0dRUeokUB1sKehIquYtzh9ZlSe/cWA5EqPl6linbAJ5lvzaaIwV8GTGFMBGkOUChqsbCCQxZFLtupIeYEWOt/4R8oQa2WvW45Or7+hUY5liTaBvFmU7sp7t7UvK83wEBlXLCgzbctin1WyLpAYdtnmsosxb5i24uywmkNovyFMOuFpgBMf9EHtZeyPUvW38Ce+mneiZfwO+GzKwbvy6yayLN9JVqoYlQaapTEuAsjEJMuNcKZUy+wndgQ5hGyG360tQGGjAz8QqijxsGWF7lKD5HNkKYId2/a67Q1sbhVfqlvqiYCwMmsUchg9EqVmSCRo3JOfdSqgK9+7yVQtGS+og083xju9Xrx2gJ4bAwqVu1rhHYibt8I6X5m/zAC1YoUh49vfXrDd2ZJgGV5Yh4LAv7Z9MzeMkwpFIFYIX56z8wBhdZhvbq9EV5lbkV1KfDjL5CJlF4NEqjm8VnNRgnsOP6ma44dOLy5YXBb 23iGWcB1 0aIUV2uhorqZ/U+0Aur0sn3NHhI0agKMNFg22kCNWM+SpdE+J/S34Z1KM1nq8i/3+LUPo5Ec0Vsqsq28r2p+oXz2rpsD8rgwektZ/YC7Kd3xknuRqaqt68eyRNo85YUq01B64qQfQ582EDPPL00HEOui8K9ogqFZeA/8UF+jTcrjkdvoVd6QHl2MyN9qZLZcMUT1eWV85waGk+Ok+izIGw6Lc9dyDyNdooVDZ6iW+mo2s+Xr2MKLyfzvcsvRy24zFi77L80hshaj3I5d1qOKOZWkqTUSc61lbWsVa X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Call out PROCMAP_QUERY ioctl() existence in the section describing /proc/PID/maps file in documentation. We refer user to UAPI header for low-level details of this programmatic interface. Signed-off-by: Andrii Nakryiko --- Documentation/filesystems/proc.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 82d142de3461..e834779d9611 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -443,6 +443,15 @@ is not associated with a file: or if empty, the mapping is anonymous. +Starting with 6.11 kernel, /proc/PID/maps provides an alternative +ioctl()-based API that gives ability to flexibly and efficiently query and +filter individual VMAs. This interface is binary and is meant for more +efficient and easy programmatic use. `struct procmap_query`, defined in +linux/fs.h UAPI header, serves as an input/output argument to the +`PROCMAP_QUERY` ioctl() command. See comments in linux/fs.h UAPI header for +details on query semantics, supported flags, data returned, and general API +usage information. + The /proc/PID/smaps is an extension based on maps, showing the memory consumption for each of the process's mappings.
For each mapping (aka Virtual Memory Area, or VMA) there is a series of lines such as the following:: From patchwork Thu Jun 27 17:08:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13714879 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50F61C3064D for ; Thu, 27 Jun 2024 17:09:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 72AD56B00A6; Thu, 27 Jun 2024 13:09:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6DA9B6B00A8; Thu, 27 Jun 2024 13:09:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 52C766B00A7; Thu, 27 Jun 2024 13:09:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 32DEE6B00A5 for ; Thu, 27 Jun 2024 13:09:23 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id DD7AF40CDA for ; Thu, 27 Jun 2024 17:09:22 +0000 (UTC) X-FDA: 82277304564.14.1790C6B Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf08.hostedemail.com (Postfix) with ESMTP id 232D8160021 for ; Thu, 27 Jun 2024 17:09:20 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=iWYnW33B; spf=pass (imf08.hostedemail.com: domain of andrii@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=andrii@kernel.org; dmarc=pass (policy=none) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1719508153; 
h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=EGeqRfOJeMMfTLJ8SPBe2SAMr1zi3zx4a2TyMM3aEJ8=; b=uPwbjFchFo7D0GEvTGcQd7mzpkHhO2Oc4QesYp54mylv1z4TA0nhPfyru1n7vUE7FZcXUq /uTWl8hYmoDLTL1uy6mGHXxC8tl+NL6En0AcpMkbZ3oDWfW00Ebbd6/LvUp7Z8eC4Y/Vpl qvQ6GMXKIJe32qYGNYOQeJf691qjPnE= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=iWYnW33B; spf=pass (imf08.hostedemail.com: domain of andrii@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=andrii@kernel.org; dmarc=pass (policy=none) header.from=kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1719508153; a=rsa-sha256; cv=none; b=7YvqlF4u/+uh3n47zlJxJC0xELv1M5aTvizjckokWOhU5wRbMjjKvv/QTHF75ebPfDaeVz e+1QXf5NkM932b8cIHxff79HtfxWWkO2IaU+RZ1xISTQElGigjVhe0SCClNj4UQtDQ1cn5 l6FxjTPxZG7yBQ4G6QqgXjJbZPlV+hk= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id 535DD61F70; Thu, 27 Jun 2024 17:09:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DE0A4C4AF09; Thu, 27 Jun 2024 17:09:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1719508160; bh=SnshODZSkwsKfCxIK+mNdzK2K89M7NgjFr+cfop4D80=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iWYnW33BV/Ie/FR7Q2tx+s2abv9P1bvMVoKSuWcNb+ixChl7ZRMN2zVomBYEWUdd3 8Qohl1VMStYWHAt5Pi+2RgMJM1tYbbhFVDyP2PweX0Kl4p2eR0iBaGtS4PQ9/9Pdpp 03CRPTNY/Gjo941thtIDg4YBu8ftzgbYHfyTMTwZCOYeDgv4DtnnlUDLOolxhT5Xrn 4dbZQCZJhPEOKxZxugb1jEhN5vZZ2Ka+RlFTmM6QYCVAWL01tySTNLuFSw4eIBHzmN /OWdYEWPw4U0IQ3rU8rD5rUmq4z/dDpDOn4NLacj8NpF0YYYV6K5VU7LvUzyTFzGhJ zd08e/kVhG8YA== From: Andrii Nakryiko To: linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, 
akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org, gregkh@linuxfoundation.org, linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org, adobriyan@gmail.com, Andrii Nakryiko Subject: [PATCH v6 5/6] tools: sync uapi/linux/fs.h header into tools subdir Date: Thu, 27 Jun 2024 10:08:57 -0700 Message-ID: <20240627170900.1672542-6-andrii@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240627170900.1672542-1-andrii@kernel.org> References: <20240627170900.1672542-1-andrii@kernel.org> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 232D8160021 X-Stat-Signature: y1p49w3y6hrhzijaksi9yug6jpoqtgru X-HE-Tag: 1719508160-214261 X-HE-Meta: U2FsdGVkX19f50SegGayqvuM7340T2IX2SCtCYGOxkHuYNLZZdLlPZx+C0qTWv7VpqDGHyevpOxI+OuiFufSK83pk2yNWyCCSCV/TquUbjgV017y/rkiSZNlQ6Tn5benlAV5EAd0kwPlqTa7NvCJ0cPyR6TLNADfK0uOLgjGCBttSMV0fasI/y4MR0c3ppm9y7k7o2LlVjC3Ss4DgHYlh0fUjeSKuNAA2vq4/u4ye7LneINSkE97GbWI13mZGtN0fsQAM8jSKXrqKnh4Ni8jkzeOnjCIYiM7pskGgSx65c3ckXscMqzOhEZcZtXq4GLCc8deeItbW7a+PeS3obBcncMxaFMt+eTjeDl3ytc1vFz2YCoZwwTATpBABlF6sQ+zYUPAhiFk03qdQLevupvE8bbNh2GRbKZjaVbln/OAJaPMZWm2avaBFle5Rjm2ZliyOrwUOoHmJEtCKe9hha7froAszH23FkU8aW+Vf854Ic5wBWxibr6btXRPbkZZOl8xRMgdcJllnBrMY3liNBJptj9qIAHdHuymUh2OSntRy6SQdaqpoMW+d4pJGqyHHajQ1qNuMPeZ4t5q68TkOoMeq2ZsuNE6yI5U7EgVpTX7taQxp0GtO9dDePJSEQikwF2JzLIJrdc1n3DMfBg2nSYQRprQyZpSDQMzKnlky0ZSpOFEl/1n0LKnfYHqZzT05jq+QAYrL7c0a8bn+bzSYTOIkfML9OVK+Wq5nqZQZy+I+1WmlgGHH6Vc4eojm2fPoHYURYUjJSI/rCyymXDlCnxRebwPQ1JJwMWTQi4r/n3JtVm48b/rIvp+YOxUgxSOvidGIrFyetMDTtJoCvjYU2pXvwBeGcD0/4nHpfk3qKy+xCJpdywSa+bl+hKeyW3M54+7fJSwUDbyxYUIZNL05GputtLDzYE949L9bMHh5PvbFq378W2YtRhKbKjPDAtM+ZpmSs7rOwvliH9EPxNEv9P yb4y2bU6 kvSRYHiaosZrMgC+swgu6KA3MwzhJ4FEfMtnaaoXI6QXFNItwvY8uU7v90kdL7dd8N1sz7x/PNbMO5b2b99z20Q3Dj2zPrV/SeaYqBhQHfUD0bRCtTdds/R+2EvuvgYhompRpTOivKxpguzQS69qqUK8dcB5NCg5h0qxSU2WrMQC87bZepLju4PC1OKUg3a3PcSm0IbuOWW6KQFih5solxobvq7F7L/Nidd8uE8XwKYYxzeg= X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: We need this UAPI header in tools/include subdirectory for using it from BPF selftests. Signed-off-by: Andrii Nakryiko --- tools/include/uapi/linux/fs.h | 184 +++++++++++++++++++++++++++++++--- 1 file changed, 172 insertions(+), 12 deletions(-) diff --git a/tools/include/uapi/linux/fs.h b/tools/include/uapi/linux/fs.h index cc3fea99fd43..2a4a5f50c98e 100644 --- a/tools/include/uapi/linux/fs.h +++ b/tools/include/uapi/linux/fs.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -#ifndef _LINUX_FS_H -#define _LINUX_FS_H +#ifndef _UAPI_LINUX_FS_H +#define _UAPI_LINUX_FS_H /* * This file has definitions for some important file table structures @@ -13,10 +13,14 @@ #include #include #include +#ifndef __KERNEL__ #include +#endif /* Use of MS_* flags within the kernel is restricted to core mount(2) code. */ +#if !defined(__KERNEL__) #include +#endif /* * It's silly to have NR_OPEN bigger than NR_FILE, but you can change @@ -24,8 +28,8 @@ * nr_file rlimit, so it's safe to set up a ridiculously high absolute * upper limit on files-per-process. * - * Some programs (notably those using select()) may have to be - * recompiled to take full advantage of the new limits.. + * Some programs (notably those using select()) may have to be + * recompiled to take full advantage of the new limits.. 
*/ /* Fixed constants first: */ @@ -308,29 +312,31 @@ struct fsxattr { typedef int __bitwise __kernel_rwf_t; /* high priority request, poll if possible */ -#define RWF_HIPRI ((__kernel_rwf_t)0x00000001) +#define RWF_HIPRI ((__force __kernel_rwf_t)0x00000001) /* per-IO O_DSYNC */ -#define RWF_DSYNC ((__kernel_rwf_t)0x00000002) +#define RWF_DSYNC ((__force __kernel_rwf_t)0x00000002) /* per-IO O_SYNC */ -#define RWF_SYNC ((__kernel_rwf_t)0x00000004) +#define RWF_SYNC ((__force __kernel_rwf_t)0x00000004) /* per-IO, return -EAGAIN if operation would block */ -#define RWF_NOWAIT ((__kernel_rwf_t)0x00000008) +#define RWF_NOWAIT ((__force __kernel_rwf_t)0x00000008) /* per-IO O_APPEND */ -#define RWF_APPEND ((__kernel_rwf_t)0x00000010) +#define RWF_APPEND ((__force __kernel_rwf_t)0x00000010) /* per-IO negation of O_APPEND */ -#define RWF_NOAPPEND ((__kernel_rwf_t)0x00000020) +#define RWF_NOAPPEND ((__force __kernel_rwf_t)0x00000020) /* mask of flags supported by the kernel */ #define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\ RWF_APPEND | RWF_NOAPPEND) +#define PROCFS_IOCTL_MAGIC 'f' + /* Pagemap ioctl */ -#define PAGEMAP_SCAN _IOWR('f', 16, struct pm_scan_arg) +#define PAGEMAP_SCAN _IOWR(PROCFS_IOCTL_MAGIC, 16, struct pm_scan_arg) /* Bitmasks provided in pm_scan_args masks and reported in page_region.categories. */ #define PAGE_IS_WPALLOWED (1 << 0) @@ -389,4 +395,158 @@ struct pm_scan_arg { __u64 return_mask; }; -#endif /* _LINUX_FS_H */ +/* /proc//maps ioctl */ +#define PROCMAP_QUERY _IOWR(PROCFS_IOCTL_MAGIC, 17, struct procmap_query) + +enum procmap_query_flags { + /* + * VMA permission flags. + * + * Can be used as part of procmap_query.query_flags field to look up + * only VMAs satisfying specified subset of permissions. 
E.g., specifying + * PROCMAP_QUERY_VMA_READABLE only will return both readable and read/write VMAs, + * while having PROCMAP_QUERY_VMA_READABLE | PROCMAP_QUERY_VMA_WRITABLE will only + * return read/write VMAs, though both executable/non-executable and + * private/shared will be ignored. + * + * PROCMAP_QUERY_VMA_* flags are also returned in procmap_query.vma_flags + * field to specify actual VMA permissions. + */ + PROCMAP_QUERY_VMA_READABLE = 0x01, + PROCMAP_QUERY_VMA_WRITABLE = 0x02, + PROCMAP_QUERY_VMA_EXECUTABLE = 0x04, + PROCMAP_QUERY_VMA_SHARED = 0x08, + /* + * Query modifier flags. + * + * By default VMA that covers provided address is returned, or -ENOENT + * is returned. With PROCMAP_QUERY_COVERING_OR_NEXT_VMA flag set, closest + * VMA with vma_start > addr will be returned if no covering VMA is + * found. + * + * PROCMAP_QUERY_FILE_BACKED_VMA instructs query to consider only VMAs that + * have file backing. Can be combined with PROCMAP_QUERY_COVERING_OR_NEXT_VMA + * to iterate all VMAs with file backing. + */ + PROCMAP_QUERY_COVERING_OR_NEXT_VMA = 0x10, + PROCMAP_QUERY_FILE_BACKED_VMA = 0x20, +}; + +/* + * Input/output argument structure passed into ioctl() call. It can be used + * to query a set of VMAs (Virtual Memory Areas) of a process. + * + * Each field can be one of three kinds, marked in a short comment to the + * right of the field: + * - "in", input argument, user has to provide this value, kernel doesn't modify it; + * - "out", output argument, kernel sets this field with VMA data; + * - "in/out", input and output argument; user provides initial value (used + * to specify maximum allowable buffer size), and kernel sets it to actual + * amount of data written (or zero, if there is no data). + * + * If matching VMA is found (according to criteria specified by + * query_addr/query_flags), all the out fields are filled out, and ioctl() + * returns 0. If there is no matching VMA, -ENOENT will be returned.
+ * In case of any other error, negative error code other than -ENOENT is
+ * returned.
+ *
+ * Most of the data is similar to the one returned as text in /proc/<pid>/maps
+ * file, but procmap_query provides more querying flexibility. There are no
+ * consistency guarantees between subsequent ioctl() calls, but data returned
+ * for matched VMA is self-consistent.
+ */
+struct procmap_query {
+	/* Query struct size, for backwards/forward compatibility */
+	__u64 size;
+	/*
+	 * Query flags, a combination of enum procmap_query_flags values.
+	 * Defines query filtering and behavior, see enum procmap_query_flags.
+	 *
+	 * Input argument, provided by user. Kernel doesn't modify it.
+	 */
+	__u64 query_flags;		/* in */
+	/*
+	 * Query address. By default, VMA that covers this address will
+	 * be looked up. PROCMAP_QUERY_* flags above modify this default
+	 * behavior further.
+	 *
+	 * Input argument, provided by user. Kernel doesn't modify it.
+	 */
+	__u64 query_addr;		/* in */
+	/* VMA starting (inclusive) and ending (exclusive) address, if VMA is found. */
+	__u64 vma_start;		/* out */
+	__u64 vma_end;			/* out */
+	/* VMA permissions flags. A combination of PROCMAP_QUERY_VMA_* flags. */
+	__u64 vma_flags;		/* out */
+	/* VMA backing page size granularity. */
+	__u64 vma_page_size;		/* out */
+	/*
+	 * VMA file offset. If VMA has file backing, this specifies offset
+	 * within the file that VMA's start address corresponds to.
+	 * Is set to zero if VMA has no backing file.
+	 */
+	__u64 vma_offset;		/* out */
+	/* Backing file's inode number, or zero, if VMA has no backing file. */
+	__u64 inode;			/* out */
+	/* Backing file's device major/minor number, or zero, if VMA has no backing file. */
+	__u32 dev_major;		/* out */
+	__u32 dev_minor;		/* out */
+	/*
+	 * If set to non-zero value, signals the request to return VMA name
+	 * (i.e., VMA's backing file's absolute path, with " (deleted)" suffix
+	 * appended, if file was unlinked from FS) for matched VMA.
+	 * VMA name
+	 * can also be some special name (e.g., "[heap]", "[stack]") or could
+	 * be even user-supplied with prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME).
+	 *
+	 * Kernel will set this field to zero, if VMA has no associated name.
+	 * Otherwise kernel will return actual amount of bytes filled in
+	 * user-supplied buffer (see vma_name_addr field below), including the
+	 * terminating zero.
+	 *
+	 * If VMA name is longer than user-supplied maximum buffer size,
+	 * -E2BIG error is returned.
+	 *
+	 * If this field is set to non-zero value, vma_name_addr should point
+	 * to valid user space memory buffer of at least vma_name_size bytes.
+	 * If set to zero, vma_name_addr should be set to zero as well.
+	 */
+	__u32 vma_name_size;		/* in/out */
+	/*
+	 * If set to non-zero value, signals the request to extract and return
+	 * VMA's backing file's build ID, if the backing file is an ELF file
+	 * and it contains embedded build ID.
+	 *
+	 * Kernel will set this field to zero, if VMA has no backing file,
+	 * backing file is not an ELF file, or ELF file has no build ID
+	 * embedded.
+	 *
+	 * Build ID is a binary value (not a string). Kernel will set
+	 * build_id_size field to exact number of bytes used for build ID.
+	 * If build ID is requested and present, but needs more bytes than
+	 * user-supplied maximum buffer size (see build_id_addr field below),
+	 * -E2BIG error will be returned.
+	 *
+	 * If this field is set to non-zero value, build_id_addr should point
+	 * to valid user space memory buffer of at least build_id_size bytes.
+	 * If set to zero, build_id_addr should be set to zero as well.
+	 */
+	__u32 build_id_size;		/* in/out */
+	/*
+	 * User-supplied address of a buffer of at least vma_name_size bytes
+	 * for kernel to fill with matched VMA's name (see vma_name_size field
+	 * description above for details).
+	 *
+	 * Should be set to zero if VMA name should not be returned.
+	 */
+	__u64 vma_name_addr;		/* in */
+	/*
+	 * User-supplied address of a buffer of at least build_id_size bytes
+	 * for kernel to fill with matched VMA's ELF build ID, if available
+	 * (see build_id_size field description above for details).
+	 *
+	 * Should be set to zero if build ID should not be returned.
+	 */
+	__u64 build_id_addr;		/* in */
+};
+
+#endif /* _UAPI_LINUX_FS_H */

From patchwork Thu Jun 27 17:08:58 2024
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13714880
From: Andrii Nakryiko
To: linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org, gregkh@linuxfoundation.org, linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org, adobriyan@gmail.com, Andrii Nakryiko
Subject: [PATCH v6 6/6] selftests/proc: add PROCMAP_QUERY ioctl tests
Date: Thu, 27 Jun 2024 10:08:58 -0700
Message-ID: <20240627170900.1672542-7-andrii@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240627170900.1672542-1-andrii@kernel.org>
References: <20240627170900.1672542-1-andrii@kernel.org>
MIME-Version: 1.0

Extend existing proc-pid-vm.c tests with PROCMAP_QUERY ioctl() API.
Test a few successful and negative cases, validating that query filtering
and exact vs. next VMA logic work as expected.

Signed-off-by: Andrii Nakryiko
---
 tools/testing/selftests/proc/Makefile      |  1 +
 tools/testing/selftests/proc/proc-pid-vm.c | 86 ++++++++++++++++++++++
 2 files changed, 87 insertions(+)

diff --git a/tools/testing/selftests/proc/Makefile b/tools/testing/selftests/proc/Makefile
index cd95369254c0..291e7087f1b3 100644
--- a/tools/testing/selftests/proc/Makefile
+++ b/tools/testing/selftests/proc/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 CFLAGS += -Wall -O2 -Wno-unused-function
 CFLAGS += -D_GNU_SOURCE
+CFLAGS += $(TOOLS_INCLUDES)
 LDFLAGS += -pthread

 TEST_GEN_PROGS :=
diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/selftests/proc/proc-pid-vm.c
index cacbd2a4aec9..d04685771952 100644
--- a/tools/testing/selftests/proc/proc-pid-vm.c
+++ b/tools/testing/selftests/proc/proc-pid-vm.c
@@ -45,6 +45,7 @@
 #include
 #include
 #include
+#include

 #include "../kselftest.h"

@@ -492,6 +493,91 @@ int main(void)
 		assert(buf[13] == '\n');
 	}

+	/* Test PROCMAP_QUERY ioctl() for /proc/$PID/maps */
+	{
+		char path_buf[256], exp_path_buf[256];
+		struct procmap_query q;
+		int fd, err;
+
+		snprintf(path_buf, sizeof(path_buf), "/proc/%u/maps", pid);
+		fd = open(path_buf, O_RDONLY);
+		if (fd == -1)
+			return 1;
+
+		/* CASE 1: exact MATCH at VADDR */
+		memset(&q, 0, sizeof(q));
+		q.size =
+			sizeof(q);
+		q.query_addr = VADDR;
+		q.query_flags = 0;
+		q.vma_name_addr = (__u64)(unsigned long)path_buf;
+		q.vma_name_size = sizeof(path_buf);
+
+		err = ioctl(fd, PROCMAP_QUERY, &q);
+		assert(err == 0);
+
+		assert(q.query_addr == VADDR);
+		assert(q.query_flags == 0);
+
+		assert(q.vma_flags == (PROCMAP_QUERY_VMA_READABLE | PROCMAP_QUERY_VMA_EXECUTABLE));
+		assert(q.vma_start == VADDR);
+		assert(q.vma_end == VADDR + PAGE_SIZE);
+		assert(q.vma_page_size == PAGE_SIZE);
+
+		assert(q.vma_offset == 0);
+		assert(q.inode == st.st_ino);
+		assert(q.dev_major == MAJOR(st.st_dev));
+		assert(q.dev_minor == MINOR(st.st_dev));
+
+		snprintf(exp_path_buf, sizeof(exp_path_buf),
+			"/tmp/#%llu (deleted)", (unsigned long long)st.st_ino);
+		assert(q.vma_name_size == strlen(exp_path_buf) + 1);
+		assert(strcmp(path_buf, exp_path_buf) == 0);
+
+		/* CASE 2: NO MATCH at VADDR-1 */
+		memset(&q, 0, sizeof(q));
+		q.size = sizeof(q);
+		q.query_addr = VADDR - 1;
+		q.query_flags = 0; /* exact match */
+
+		err = ioctl(fd, PROCMAP_QUERY, &q);
+		err = err < 0 ? -errno : 0;
+		assert(err == -ENOENT);
+
+		/* CASE 3: MATCH COVERING_OR_NEXT_VMA at VADDR - 1 */
+		memset(&q, 0, sizeof(q));
+		q.size = sizeof(q);
+		q.query_addr = VADDR - 1;
+		q.query_flags = PROCMAP_QUERY_COVERING_OR_NEXT_VMA;
+
+		err = ioctl(fd, PROCMAP_QUERY, &q);
+		assert(err == 0);
+
+		assert(q.query_addr == VADDR - 1);
+		assert(q.query_flags == PROCMAP_QUERY_COVERING_OR_NEXT_VMA);
+		assert(q.vma_start == VADDR);
+		assert(q.vma_end == VADDR + PAGE_SIZE);
+
+		/* CASE 4: NO MATCH at VADDR + PAGE_SIZE */
+		memset(&q, 0, sizeof(q));
+		q.size = sizeof(q);
+		q.query_addr = VADDR + PAGE_SIZE; /* point right after the VMA */
+		q.query_flags = PROCMAP_QUERY_COVERING_OR_NEXT_VMA;
+
+		err = ioctl(fd, PROCMAP_QUERY, &q);
+		err = err < 0 ?
+			-errno : 0;
+		assert(err == -ENOENT);
+
+		/* CASE 5: NO MATCH WRITABLE at VADDR */
+		memset(&q, 0, sizeof(q));
+		q.size = sizeof(q);
+		q.query_addr = VADDR;
+		q.query_flags = PROCMAP_QUERY_VMA_WRITABLE;
+
+		err = ioctl(fd, PROCMAP_QUERY, &q);
+		err = err < 0 ? -errno : 0;
+		assert(err == -ENOENT);
+	}
+
 	return 0;
 }
 #else