From patchwork Tue Oct 29 23:12:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13855639 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 64A8B207A12; Tue, 29 Oct 2024 23:13:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243583; cv=none; b=ut3duUt6ZdJ81RfVu6II8oHbfiw1dJWAZ7lLhMf6TX92F2MWOp+uH9zSnxe9vh/WPARHzGt/5UC+I4IY3gnKfjeUJ+mE1GqFv1TUndWx9kiy8c5mQBAVbtqx3gcKwjIiodDmKxfzvtKgNWoGepOSZuFgVv5ffo91fsNlZVSk6Ts= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243583; c=relaxed/simple; bh=fFT1L9du/H1IVWZ0eDVHcXdOUhHiStUdXZ7+xMjkVO8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nmAGU1IfmrDH7OcFwujrLRoCPZxcwXXYHDJTL+qumEnRm9/RGr1p6e4OftHbJ189GevOglroKVvIXgzcn7UQZeBUVuMGhrCE1erWbhLIPjhGAqJyzr51A341nZ/XZVaihU2ys8Ch9877JGzET9teAQs02H/2efevltC1jH/GIjY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=LSKInQXB; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="LSKInQXB" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BF0D4C4CEE3; Tue, 29 Oct 2024 23:12:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730243583; bh=fFT1L9du/H1IVWZ0eDVHcXdOUhHiStUdXZ7+xMjkVO8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LSKInQXB7YJpOLtG1Sv30vmLc+NtVYkw8tEGdybzSf12xtX/ex5GI1tDuez+Kxxlm 
3VUYh6R8bd7zR7smF77QzF4WMEYw4o45R7zORDAasjD20ZpI/0yPgtYRv2jGTIGiJD 62BoPiWkl4reYngRYcqsrTj7xodQSA2VIfXwjn225BuxipFFu4ZZW8GEOUUnzc3cqj cKpK7rMi6VpKib8E+eHsOz8pdOH5oOJguz90e8i8eEJuvGUKVTu/deKpXhQNCcQoeU oTDtfsJETJNPSIH2UjGNBhpSxITUMO8qz3wiYZMLvBYd/s64jhP6FD2p08t3WgqwSe uPFbfWhEbrhcg== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, Song Liu Subject: [RFC bpf-next fanotify 1/5] fanotify: Introduce fanotify fastpath handler Date: Tue, 29 Oct 2024 16:12:40 -0700 Message-ID: <20241029231244.2834368-2-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241029231244.2834368-1-song@kernel.org> References: <20241029231244.2834368-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 A fanotify fastpath handler handles fanotify events within the kernel, and thus saves a trip to user space. Fastpath handlers can be useful in many use cases. For example, if a user is only interested in events for some files inside a directory, a fastpath handler can be used to filter out irrelevant events. A fanotify fastpath handler is attached to a fsnotify_group. At most one fastpath handler can be attached to a fsnotify_group. Attaching and detaching fastpath handlers is controlled by two new ioctls on the fanotify fds: FAN_IOC_ADD_FP and FAN_IOC_DEL_FP. A fanotify fastpath handler is packaged in a kernel module. In the future, it may also be possible to package a fastpath handler in a BPF program. Since loading modules requires CAP_SYS_ADMIN, _loading_ fanotify fastpath handlers in kernel modules is limited to CAP_SYS_ADMIN. 
However, non-CAP_SYS_ADMIN users can _attach_ fastpath handlers loaded by a system administrator to their fanotify fds. To make fastpath handlers more useful for non-CAP_SYS_ADMIN users, a fastpath handler can take arguments at attach time. TODO: Add some mechanism to help users discover available fastpath handlers. For example, we can add a sysctl which is similar to net.ipv4.tcp_available_congestion_control, or we can add some sysfs entries. Signed-off-by: Song Liu --- fs/notify/fanotify/Makefile | 2 +- fs/notify/fanotify/fanotify.c | 25 ++++ fs/notify/fanotify/fanotify_fastpath.c | 171 +++++++++++++++++++++++++ fs/notify/fanotify/fanotify_user.c | 7 + include/linux/fanotify.h | 45 +++++++ include/linux/fsnotify_backend.h | 3 + include/uapi/linux/fanotify.h | 26 ++++ 7 files changed, 278 insertions(+), 1 deletion(-) create mode 100644 fs/notify/fanotify/fanotify_fastpath.c diff --git a/fs/notify/fanotify/Makefile b/fs/notify/fanotify/Makefile index 25ef222915e5..fddab88dde37 100644 --- a/fs/notify/fanotify/Makefile +++ b/fs/notify/fanotify/Makefile @@ -1,2 +1,2 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-$(CONFIG_FANOTIFY) += fanotify.o fanotify_user.o +obj-$(CONFIG_FANOTIFY) += fanotify.o fanotify_user.o fanotify_fastpath.o diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c index 224bccaab4cc..a40ec06d0218 100644 --- a/fs/notify/fanotify/fanotify.c +++ b/fs/notify/fanotify/fanotify.c @@ -18,6 +18,8 @@ #include "fanotify.h" +extern struct srcu_struct fsnotify_mark_srcu; + static bool fanotify_path_equal(const struct path *p1, const struct path *p2) { return p1->mnt == p2->mnt && p1->dentry == p2->dentry; @@ -888,6 +890,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask, struct fsnotify_event *fsn_event; __kernel_fsid_t fsid = {}; u32 match_mask = 0; + struct fanotify_fastpath_hook *fp_hook; BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS); BUILD_BUG_ON(FAN_MODIFY != FS_MODIFY); @@ -933,6 +936,25 @@ static int 
fanotify_handle_event(struct fsnotify_group *group, u32 mask, if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS)) fsid = fanotify_get_fsid(iter_info); + fp_hook = srcu_dereference(group->fanotify_data.fp_hook, &fsnotify_mark_srcu); + if (fp_hook) { + struct fanotify_fastpath_event fp_event = { + .mask = mask, + .data = data, + .data_type = data_type, + .dir = dir, + .file_name = file_name, + .fsid = &fsid, + .match_mask = match_mask, + }; + + ret = fp_hook->ops->fp_handler(group, fp_hook, &fp_event); + if (ret == FAN_FP_RET_SKIP_EVENT) { + ret = 0; + goto finish; + } + } + event = fanotify_alloc_event(group, mask, data, data_type, dir, file_name, &fsid, match_mask); ret = -ENOMEM; @@ -976,6 +998,9 @@ static void fanotify_free_group_priv(struct fsnotify_group *group) if (mempool_initialized(&group->fanotify_data.error_events_pool)) mempool_exit(&group->fanotify_data.error_events_pool); + + if (group->fanotify_data.fp_hook) + fanotify_fastpath_hook_free(group->fanotify_data.fp_hook); } static void fanotify_free_path_event(struct fanotify_event *event) diff --git a/fs/notify/fanotify/fanotify_fastpath.c b/fs/notify/fanotify/fanotify_fastpath.c new file mode 100644 index 000000000000..0453a1ac25b1 --- /dev/null +++ b/fs/notify/fanotify/fanotify_fastpath.c @@ -0,0 +1,171 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include + +#include "fanotify.h" + +extern struct srcu_struct fsnotify_mark_srcu; + +static DEFINE_SPINLOCK(fp_list_lock); +static LIST_HEAD(fp_list); + +static struct fanotify_fastpath_ops *fanotify_fastpath_find(const char *name) +{ + struct fanotify_fastpath_ops *ops; + + list_for_each_entry(ops, &fp_list, list) { + if (!strcmp(ops->name, name)) + return ops; + } + return NULL; +} + + +/* + * fanotify_fastpath_register - Register a new fastpath handler. + * + * Add a fastpath handler to the fp_list. These fastpath handlers are + * available for all users in the system. + * + * @ops: pointer to fanotify_fastpath_ops to add. 
+ * + * Returns: + * 0 - on success; + * -EEXIST - fastpath handler of the same name already exists. + */ +int fanotify_fastpath_register(struct fanotify_fastpath_ops *ops) +{ + spin_lock(&fp_list_lock); + if (fanotify_fastpath_find(ops->name)) { + /* cannot register two handlers with the same name */ + spin_unlock(&fp_list_lock); + return -EEXIST; + } + list_add_tail(&ops->list, &fp_list); + spin_unlock(&fp_list_lock); + return 0; +} +EXPORT_SYMBOL_GPL(fanotify_fastpath_register); + +/* + * fanotify_fastpath_unregister - Unregister a fastpath handler. + * + * Remove a fastpath handler from fp_list. + * + * @ops: pointer to fanotify_fastpath_ops to remove. + */ +void fanotify_fastpath_unregister(struct fanotify_fastpath_ops *ops) +{ + spin_lock(&fp_list_lock); + list_del_init(&ops->list); + spin_unlock(&fp_list_lock); +} +EXPORT_SYMBOL_GPL(fanotify_fastpath_unregister); + +/* + * fanotify_fastpath_add - Add a fastpath handler to fsnotify_group. + * + * Add a fastpath handler from fp_list to a fsnotify_group. + * + * @group: fsnotify_group to attach the fastpath handler to + * @argp: fanotify_fastpath_args that specifies the fastpath handler + * and the init arguments of the fastpath handler. + * + * Returns: + * 0 - on success; + * -EBUSY - a fastpath handler is already attached to the group. 
+ */ +int fanotify_fastpath_add(struct fsnotify_group *group, + struct fanotify_fastpath_args __user *argp) +{ + struct fanotify_fastpath_hook *fp_hook; + struct fanotify_fastpath_ops *fp_ops; + struct fanotify_fastpath_args args; + int ret = 0; + + ret = copy_from_user(&args, argp, sizeof(args)); + if (ret) + return -EFAULT; + + if (args.version != 1 || args.flags || args.init_args_len > FAN_FP_ARGS_MAX) + return -EINVAL; + + args.name[FAN_FP_NAME_MAX - 1] = '\0'; + + fsnotify_group_lock(group); + + if (rcu_access_pointer(group->fanotify_data.fp_hook)) { + fsnotify_group_unlock(group); + return -EBUSY; + } + + fp_hook = kzalloc(sizeof(*fp_hook), GFP_KERNEL); + if (!fp_hook) { + ret = -ENOMEM; + goto out; + } + + spin_lock(&fp_list_lock); + fp_ops = fanotify_fastpath_find(args.name); + if (!fp_ops || !try_module_get(fp_ops->owner)) { + spin_unlock(&fp_list_lock); + ret = -ENOENT; + goto err_free_hook; + } + spin_unlock(&fp_list_lock); + + if (fp_ops->fp_init) { + char *init_args = NULL; + + if (args.init_args_len) { + init_args = strndup_user(u64_to_user_ptr(args.init_args), + args.init_args_len); + if (IS_ERR(init_args)) { + ret = PTR_ERR(init_args); + if (ret == -EINVAL) + ret = -E2BIG; + goto err_module_put; + } + } + ret = fp_ops->fp_init(fp_hook, init_args); + kfree(init_args); + if (ret) + goto err_module_put; + } + fp_hook->ops = fp_ops; + rcu_assign_pointer(group->fanotify_data.fp_hook, fp_hook); + +out: + fsnotify_group_unlock(group); + return ret; + +err_module_put: + module_put(fp_ops->owner); +err_free_hook: + kfree(fp_hook); + goto out; +} + +void fanotify_fastpath_hook_free(struct fanotify_fastpath_hook *fp_hook) +{ + if (fp_hook->ops->fp_free) + fp_hook->ops->fp_free(fp_hook); + + module_put(fp_hook->ops->owner); +} + +void fanotify_fastpath_del(struct fsnotify_group *group) +{ + struct fanotify_fastpath_hook *fp_hook; + + fsnotify_group_lock(group); + fp_hook = group->fanotify_data.fp_hook; + if (!fp_hook) + goto out; + + 
rcu_assign_pointer(group->fanotify_data.fp_hook, NULL); + fanotify_fastpath_hook_free(fp_hook); + +out: + fsnotify_group_unlock(group); +} diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c index 8e2d43fc6f7c..e96cb83f8409 100644 --- a/fs/notify/fanotify/fanotify_user.c +++ b/fs/notify/fanotify/fanotify_user.c @@ -987,6 +987,13 @@ static long fanotify_ioctl(struct file *file, unsigned int cmd, unsigned long ar spin_unlock(&group->notification_lock); ret = put_user(send_len, (int __user *) p); break; + case FAN_IOC_ADD_FP: + ret = fanotify_fastpath_add(group, p); + break; + case FAN_IOC_DEL_FP: + fanotify_fastpath_del(group); + ret = 0; + break; } return ret; diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h index 89ff45bd6f01..cea95307a580 100644 --- a/include/linux/fanotify.h +++ b/include/linux/fanotify.h @@ -136,4 +136,49 @@ #undef FAN_ALL_PERM_EVENTS #undef FAN_ALL_OUTGOING_EVENTS +struct fsnotify_group; +struct qstr; +struct inode; +struct fanotify_fastpath_hook; + +struct fanotify_fastpath_event { + u32 mask; + const void *data; + int data_type; + struct inode *dir; + const struct qstr *file_name; + __kernel_fsid_t *fsid; + u32 match_mask; +}; + +struct fanotify_fastpath_ops { + int (*fp_handler)(struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event); + int (*fp_init)(struct fanotify_fastpath_hook *hook, const char *args); + void (*fp_free)(struct fanotify_fastpath_hook *hook); + + char name[FAN_FP_NAME_MAX]; + struct module *owner; + struct list_head list; + int flags; +}; + +enum fanotify_fastpath_return { + FAN_FP_RET_SEND_TO_USERSPACE = 0, + FAN_FP_RET_SKIP_EVENT = 1, +}; + +struct fanotify_fastpath_hook { + struct fanotify_fastpath_ops *ops; + void *data; +}; + +int fanotify_fastpath_register(struct fanotify_fastpath_ops *ops); +void fanotify_fastpath_unregister(struct fanotify_fastpath_ops *ops); +int fanotify_fastpath_add(struct fsnotify_group 
*group, + struct fanotify_fastpath_args __user *args); +void fanotify_fastpath_del(struct fsnotify_group *group); +void fanotify_fastpath_hook_free(struct fanotify_fastpath_hook *fp_hook); + #endif /* _LINUX_FANOTIFY_H */ diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h index 3ecf7768e577..ef251b4e4e6f 100644 --- a/include/linux/fsnotify_backend.h +++ b/include/linux/fsnotify_backend.h @@ -117,6 +117,7 @@ struct fsnotify_fname; struct fsnotify_iter_info; struct mem_cgroup; +struct fanotify_fastpath_hook; /* * Each group much define these ops. The fsnotify infrastructure will call @@ -255,6 +256,8 @@ struct fsnotify_group { int f_flags; /* event_f_flags from fanotify_init() */ struct ucounts *ucounts; mempool_t error_events_pool; + + struct fanotify_fastpath_hook __rcu *fp_hook; } fanotify_data; #endif /* CONFIG_FANOTIFY */ }; diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h index 34f221d3a1b9..9c30baeebae0 100644 --- a/include/uapi/linux/fanotify.h +++ b/include/uapi/linux/fanotify.h @@ -3,6 +3,7 @@ #define _UAPI_LINUX_FANOTIFY_H #include +#include /* the following events that user-space can register for */ #define FAN_ACCESS 0x00000001 /* File was accessed */ @@ -243,4 +244,29 @@ struct fanotify_response_info_audit_rule { (long)(meta)->event_len >= (long)FAN_EVENT_METADATA_LEN && \ (long)(meta)->event_len <= (long)(len)) +#define FAN_FP_NAME_MAX 64 +#define FAN_FP_ARGS_MAX 1024 + +/* Arguments used to add a fastpath handler to a group. */ +struct fanotify_fastpath_args { + /* name of the fastpath handler to attach */ + char name[FAN_FP_NAME_MAX]; + + __u32 version; + __u32 flags; + + /* + * user space pointer to the init args of fastpath handler, + * up to init_args_len (<= FAN_FP_ARGS_MAX). 
+ */ + __u64 init_args; + /* length of init_args */ + __u32 init_args_len; +} __attribute__((__packed__)); + +#define FAN_IOC_MAGIC 'F' + +#define FAN_IOC_ADD_FP _IOW(FAN_IOC_MAGIC, 0, struct fanotify_fastpath_args) +#define FAN_IOC_DEL_FP _IOW(FAN_IOC_MAGIC, 1, char[FAN_FP_NAME_MAX]) + #endif /* _UAPI_LINUX_FANOTIFY_H */ From patchwork Tue Oct 29 23:12:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13855640 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 944FF20BB3B; Tue, 29 Oct 2024 23:13:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243591; cv=none; b=G+2xt8UTgFRmS6Cw4TxWhPJgPOI3BmIYcAy3mNeMqYNuRA7ozOZwkg/BdDldyyBN7S0ddeELSHIjtR7Tksdu0BUdJ2pSNsAE3HUCeNaHTeNxL6Man4Nuq0ToyTdfIVLTgId8Wp7ZABk5xFzCbopiEtsrgfVZT0CrcFFn/1CnyTk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243591; c=relaxed/simple; bh=iWQUP4hNCzTd1WdML5oB3J0FqpvqKWa2lTUc2Xwx1yk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=f9cEVbP312/6hbeJ4Ngb46nIe4AzY9Zk3paYgWd1wN7g28XMgHZ9E0pkhy97Xqach5xZRWKVC+WJdDxq8J4SDIH/HtfmSiM4tJYsM4mngQoRBLnh2nGziRXVZblj5tB88wNHQJvqJ03f7Drt+3MuYn3EjAaM23QFH99kpv/Ri9Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ai0/Ku7Q; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ai0/Ku7Q" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7E8DDC4CEE3; Tue, 29 Oct 2024 23:13:07 +0000 
(UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730243591; bh=iWQUP4hNCzTd1WdML5oB3J0FqpvqKWa2lTUc2Xwx1yk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ai0/Ku7QJ5xxwq27hB4KB9xQFczmc8b+ejL+RV6wm3+8U8NGLUYkJApO8mfTSkycz CTLCTEZjbHNGvWNIGWfqAlhC5HIabpjyrUp19BZKBurhxYohs7/ZO41QBD6R/HDok9 Y/w3RyqGNvMdMXFMET6Q0iDaqunIv4CjH5Pxl9aJSaJHKhz7jLie90rfak98rREtzw 5mhZw/w+sr9vKBhS8Fpl0LjcdWb75FJl4LbJZb7JvbcK4xx6j0Fbyk9BEHw8xggEpC P6iJmPxyh4dJksArDCXgZRnTbggBde5lI9161fqyEGrNB6D5XdvQXbbThqA2PhBD5e sssGodVV7sa3w== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, Song Liu Subject: [RFC bpf-next fanotify 2/5] samples/fanotify: Add a sample fanotify fastpath handler Date: Tue, 29 Oct 2024 16:12:41 -0700 Message-ID: <20241029231244.2834368-3-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241029231244.2834368-1-song@kernel.org> References: <20241029231244.2834368-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This fastpath handler filters out events for files with certain prefixes. To use it: [root] insmod fastpath-mod.ko # This requires root. 
[user] ./fastpath-user /tmp a,b,c & # Root is not needed [user] touch /tmp/aa # a is in the prefix list (a,b,c), no events [user] touch /tmp/xx # x is not in the prefix list, generates events Accessing file xx # this is the output from fastpath-user Signed-off-by: Song Liu --- MAINTAINERS | 1 + samples/Kconfig | 20 ++++- samples/Makefile | 2 +- samples/fanotify/.gitignore | 1 + samples/fanotify/Makefile | 5 +- samples/fanotify/fastpath-mod.c | 138 +++++++++++++++++++++++++++++++ samples/fanotify/fastpath-user.c | 90 ++++++++++++++++++++ 7 files changed, 254 insertions(+), 3 deletions(-) create mode 100644 samples/fanotify/fastpath-mod.c create mode 100644 samples/fanotify/fastpath-user.c diff --git a/MAINTAINERS b/MAINTAINERS index 7ad507f49324..8939a48b2d99 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8658,6 +8658,7 @@ S: Maintained F: fs/notify/fanotify/ F: include/linux/fanotify.h F: include/uapi/linux/fanotify.h +F: samples/fanotify/ FARADAY FOTG210 USB2 DUAL-ROLE CONTROLLER M: Linus Walleij diff --git a/samples/Kconfig b/samples/Kconfig index b288d9991d27..b0d3dff48bb0 100644 --- a/samples/Kconfig +++ b/samples/Kconfig @@ -149,15 +149,33 @@ config SAMPLE_CONNECTOR with it. See also Documentation/driver-api/connector.rst +config SAMPLE_FANOTIFY + bool "Build fanotify monitoring sample" + depends on FANOTIFY && CC_CAN_LINK && HEADERS_INSTALL + help + When enabled, this builds samples for fanotify. + There are multiple samples for fanotify. Please see the + following configs for more details of these + samples. + config SAMPLE_FANOTIFY_ERROR bool "Build fanotify error monitoring sample" - depends on FANOTIFY && CC_CAN_LINK && HEADERS_INSTALL + depends on SAMPLE_FANOTIFY help When enabled, this builds an example code that uses the FAN_FS_ERROR fanotify mechanism to monitor filesystem errors. See also Documentation/admin-guide/filesystem-monitoring.rst. 
+config SAMPLE_FANOTIFY_FASTPATH + tristate "Build fanotify fastpath sample" + depends on SAMPLE_FANOTIFY && m + help + When enabled, this builds a kernel module that contains a + fanotify fastpath handler. + The fastpath handler filters out certain filename + prefixes for the fanotify user. + config SAMPLE_HIDRAW bool "hidraw sample" depends on CC_CAN_LINK && HEADERS_INSTALL diff --git a/samples/Makefile b/samples/Makefile index b85fa64390c5..108360972626 100644 --- a/samples/Makefile +++ b/samples/Makefile @@ -6,7 +6,7 @@ subdir-$(CONFIG_SAMPLE_ANDROID_BINDERFS) += binderfs subdir-$(CONFIG_SAMPLE_CGROUP) += cgroup obj-$(CONFIG_SAMPLE_CONFIGFS) += configfs/ obj-$(CONFIG_SAMPLE_CONNECTOR) += connector/ -obj-$(CONFIG_SAMPLE_FANOTIFY_ERROR) += fanotify/ +obj-$(CONFIG_SAMPLE_FANOTIFY) += fanotify/ subdir-$(CONFIG_SAMPLE_HIDRAW) += hidraw obj-$(CONFIG_SAMPLE_HW_BREAKPOINT) += hw_breakpoint/ obj-$(CONFIG_SAMPLE_KDB) += kdb/ diff --git a/samples/fanotify/.gitignore b/samples/fanotify/.gitignore index d74593e8b2de..306e1ddec4e0 100644 --- a/samples/fanotify/.gitignore +++ b/samples/fanotify/.gitignore @@ -1 +1,2 @@ fs-monitor +fastpath-user diff --git a/samples/fanotify/Makefile b/samples/fanotify/Makefile index e20db1bdde3b..f5bbd7380104 100644 --- a/samples/fanotify/Makefile +++ b/samples/fanotify/Makefile @@ -1,5 +1,8 @@ # SPDX-License-Identifier: GPL-2.0-only -userprogs-always-y += fs-monitor +userprogs-always-$(CONFIG_SAMPLE_FANOTIFY_ERROR) += fs-monitor userccflags += -I usr/include -Wall +obj-$(CONFIG_SAMPLE_FANOTIFY_FASTPATH) += fastpath-mod.o + +userprogs-always-$(CONFIG_SAMPLE_FANOTIFY_FASTPATH) += fastpath-user diff --git a/samples/fanotify/fastpath-mod.c b/samples/fanotify/fastpath-mod.c new file mode 100644 index 000000000000..06c4b42ff114 --- /dev/null +++ b/samples/fanotify/fastpath-mod.c @@ -0,0 +1,138 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include + +struct prefix_item { + const char *prefix; + struct 
list_head list; +}; + +struct sample_fp_data { + /* + * str_table contains all the prefixes to ignore. For example, + * "prefix1\0prefix2\0prefix3" + */ + char *str_table; + + /* item->prefix points to different prefixes in the str_table. */ + struct list_head item_list; +}; + +static int sample_fp_handler(struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event) +{ + const struct qstr *file_name = fp_event->file_name; + struct sample_fp_data *fp_data; + struct prefix_item *item; + + if (!file_name) + return FAN_FP_RET_SEND_TO_USERSPACE; + fp_data = fp_hook->data; + + list_for_each_entry(item, &fp_data->item_list, list) { + if (strstr(file_name->name, item->prefix) == (char *)file_name->name) + return FAN_FP_RET_SKIP_EVENT; + } + + return FAN_FP_RET_SEND_TO_USERSPACE; +} + +static int add_item(struct sample_fp_data *fp_data, const char *prev) +{ + struct prefix_item *item; + + item = kzalloc(sizeof(*item), GFP_KERNEL); + if (!item) + return -ENOMEM; + item->prefix = prev; + list_add_tail(&item->list, &fp_data->item_list); + return 0; +} + +static void free_sample_fp_data(struct sample_fp_data *fp_data) +{ + struct prefix_item *item, *tmp; + + list_for_each_entry_safe(item, tmp, &fp_data->item_list, list) { + list_del_init(&item->list); + kfree(item); + } + kfree(fp_data->str_table); + kfree(fp_data); +} + +static int sample_fp_init(struct fanotify_fastpath_hook *fp_hook, const char *args) +{ + struct sample_fp_data *fp_data = kzalloc(sizeof(struct sample_fp_data), GFP_KERNEL); + char *p, *prev; + int ret; + + if (!fp_data) + return -ENOMEM; + + INIT_LIST_HEAD(&fp_data->item_list); /* err_out walks this list */ + /* Make a copy of the list of prefixes to ignore */ + fp_data->str_table = kstrndup(args, FAN_FP_ARGS_MAX, GFP_KERNEL); + if (!fp_data->str_table) { + ret = -ENOMEM; + goto err_out; + } + + prev = fp_data->str_table; + p = fp_data->str_table; + + /* Split the string in place: replace each ',' with '\0' */ + while ((p = strchr(p, ',')) != NULL) { + 
*p = '\0'; + ret = add_item(fp_data, prev); + if (ret) + goto err_out; + p = p + 1; + prev = p; + } + + ret = add_item(fp_data, prev); + if (ret) + goto err_out; + + fp_hook->data = fp_data; + + return 0; + +err_out: + free_sample_fp_data(fp_data); + return ret; +} + +static void sample_fp_free(struct fanotify_fastpath_hook *fp_hook) +{ + free_sample_fp_data(fp_hook->data); +} + +static struct fanotify_fastpath_ops fan_fp_ignore_a_ops = { + .fp_handler = sample_fp_handler, + .fp_init = sample_fp_init, + .fp_free = sample_fp_free, + .name = "ignore-prefix", + .owner = THIS_MODULE, +}; + +static int __init fanotify_fastpath_sample_init(void) +{ + return fanotify_fastpath_register(&fan_fp_ignore_a_ops); +} +static void __exit fanotify_fastpath_sample_exit(void) +{ + fanotify_fastpath_unregister(&fan_fp_ignore_a_ops); +} + +module_init(fanotify_fastpath_sample_init); +module_exit(fanotify_fastpath_sample_exit); + +MODULE_AUTHOR("Song Liu"); +MODULE_DESCRIPTION("Example fanotify fastpath handler"); +MODULE_LICENSE("GPL"); diff --git a/samples/fanotify/fastpath-user.c b/samples/fanotify/fastpath-user.c new file mode 100644 index 000000000000..f301c4e0d21a --- /dev/null +++ b/samples/fanotify/fastpath-user.c @@ -0,0 +1,90 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include + +static int total_event_cnt; + +static void handle_notifications(char *buffer, int len) +{ + struct fanotify_event_metadata *event = + (struct fanotify_event_metadata *) buffer; + struct fanotify_event_info_header *info; + struct fanotify_event_info_fid *fid; + struct file_handle *handle; + char *name; + int off; + + for (; FAN_EVENT_OK(event, len); event = FAN_EVENT_NEXT(event, len)) { + for (off = sizeof(*event) ; off < event->event_len; + off += info->len) { + info = (struct fanotify_event_info_header *) + ((char *) event + off); + switch (info->info_type) { + case FAN_EVENT_INFO_TYPE_DFID_NAME: + fid = (struct 
fanotify_event_info_fid *) info; + handle = (struct file_handle *)&fid->handle; + name = (char *)handle + sizeof(*handle) + handle->handle_bytes; + + printf("Accessing file %s\n", name); + total_event_cnt++; + break; + default: + break; + } + } + } +} + +int main(int argc, char **argv) +{ + struct fanotify_fastpath_args args = { + .name = "ignore-prefix", + .version = 1, + .flags = 0, + }; + char buffer[BUFSIZ]; + int fd; + + if (argc < 3) { + printf("Usage\n" + "\t %s \n", + argv[0]); + return 1; + } + + args.init_args = (__u64)argv[2]; + args.init_args_len = strlen(argv[2]) + 1; + + fd = fanotify_init(FAN_CLASS_NOTIF | FAN_REPORT_NAME | FAN_REPORT_DIR_FID, O_RDONLY); + if (fd < 0) + errx(1, "fanotify_init"); + + if (fanotify_mark(fd, FAN_MARK_ADD, + FAN_OPEN | FAN_ONDIR | FAN_EVENT_ON_CHILD, + AT_FDCWD, argv[1])) { + errx(1, "fanotify_mark"); + } + + if (ioctl(fd, FAN_IOC_ADD_FP, &args)) + errx(1, "ioctl"); + + while (total_event_cnt < 10) { + int n = read(fd, buffer, BUFSIZ); + + if (n < 0) + errx(1, "read"); + + handle_notifications(buffer, n); + } + + ioctl(fd, FAN_IOC_DEL_FP); + close(fd); + + return 0; +} From patchwork Tue Oct 29 23:12:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13855641 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2287E20B1EF; Tue, 29 Oct 2024 23:13:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243599; cv=none; b=S+YDCEHzgJsgJAKNOJ8pZgL9DvrDudSzbHF59KR0VdtN5mEwZmv0fyx7AP+gwSDNaqNDGMnlH9E0KbpJN43jOlzmoVKKJroFoItMiULEXWIF60K8cZEabdkiQSzjpbIdddaHCmx40hrU94wHC4/oX7ciAWDG5MrxYVwbKpRr1Xk= ARC-Message-Signature: 
i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243599; c=relaxed/simple; bh=/24NpyvuU2fmjvy+HEqjRAqlIX/chnBPTcfm+1cdY1I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=aaHM1oHdvLAvNnqzd4vf2/OVA3obepNysQVmOSnChr1aOJIl+Z24ImvEZCSEdD23Cg//pkf5mOAywJWeJsgp4C1vegSTNr0DCpcFGi3cUxSRrkWOr/OUm59mg0OXavj4wx9s/I6Ki0J5UPZxfSzV1aIYe5an3rWhvC4MG1CT9i8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=PNzXRFxt; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="PNzXRFxt" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BDF51C4CEE7; Tue, 29 Oct 2024 23:13:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730243599; bh=/24NpyvuU2fmjvy+HEqjRAqlIX/chnBPTcfm+1cdY1I=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PNzXRFxt4yXttP6b4iz5oGZoIE6sBS06OryRHZmgrvrLUKg5ILthY9/XhvG5nCHDw lmBEiS2VBBTBl9uohfIuV0Gpf5o1CeWNw5NpLJHNUePJe0W0cIFFZCSYhBqiteeTcP P2HfKwvf6t9KX6VuNnGsvS0N7WzpIM2ZhqmEcqpkqmVwJBuVokqkuKqiQkoXzaYe4/ MZJba5goflySf3s+uyusIMFJcmH6+C2XmBiG8hsAIQs6QLYm6wIfPsqqLF8csC7Fi6 vdYM38+mItHnuPxVJRF1lxo3DErYUm05DFP8KS48cN73p6RWNx6SxIhHKfWz/Btc/N k2pZzGsB7zVpg== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, Song Liu Subject: [RFC bpf-next fanotify 3/5] bpf: Make bpf inode storage available to tracing programs Date: Tue, 29 Oct 2024 16:12:42 -0700 Message-ID: <20241029231244.2834368-4-song@kernel.org> X-Mailer: git-send-email 
2.43.5 In-Reply-To: <20241029231244.2834368-1-song@kernel.org> References: <20241029231244.2834368-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Use the same recursion avoidance mechanism as task local storage. TODO: Better testing, add selftests. Signed-off-by: Song Liu --- include/linux/bpf.h | 9 ++ include/linux/bpf_lsm.h | 29 ------ include/linux/fs.h | 4 + kernel/bpf/Makefile | 3 +- kernel/bpf/bpf_inode_storage.c | 174 +++++++++++++++++++++++++-------- kernel/bpf/bpf_lsm.c | 4 - kernel/trace/bpf_trace.c | 8 ++ security/bpf/hooks.c | 5 - 8 files changed, 156 insertions(+), 80 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 19d8ca8ac960..863cb972d1fa 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2630,6 +2630,7 @@ struct bpf_link *bpf_link_by_id(u32 id); const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog); void bpf_task_storage_free(struct task_struct *task); +void bpf_inode_storage_free(struct inode *inode); void bpf_cgrp_storage_free(struct cgroup *cgroup); bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog); const struct btf_func_model * @@ -2900,6 +2901,10 @@ static inline void bpf_task_storage_free(struct task_struct *task) { } +static inline void bpf_inode_storage_free(struct inode *inode) +{ +} + static inline bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog) { return false; @@ -3263,6 +3268,10 @@ extern const struct bpf_func_proto bpf_task_storage_get_recur_proto; extern const struct bpf_func_proto bpf_task_storage_get_proto; extern const struct bpf_func_proto bpf_task_storage_delete_recur_proto; extern const struct bpf_func_proto bpf_task_storage_delete_proto; +extern const struct bpf_func_proto bpf_inode_storage_get_proto; +extern const struct bpf_func_proto bpf_inode_storage_get_recur_proto; +extern const struct bpf_func_proto 
bpf_inode_storage_delete_proto; +extern const struct bpf_func_proto bpf_inode_storage_delete_recur_proto; extern const struct bpf_func_proto bpf_for_each_map_elem_proto; extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto; extern const struct bpf_func_proto bpf_sk_setsockopt_proto; diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h index aefcd6564251..a819c2f0a062 100644 --- a/include/linux/bpf_lsm.h +++ b/include/linux/bpf_lsm.h @@ -19,31 +19,12 @@ #include #undef LSM_HOOK -struct bpf_storage_blob { - struct bpf_local_storage __rcu *storage; -}; - -extern struct lsm_blob_sizes bpf_lsm_blob_sizes; - int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog, const struct bpf_prog *prog); bool bpf_lsm_is_sleepable_hook(u32 btf_id); bool bpf_lsm_is_trusted(const struct bpf_prog *prog); -static inline struct bpf_storage_blob *bpf_inode( - const struct inode *inode) -{ - if (unlikely(!inode->i_security)) - return NULL; - - return inode->i_security + bpf_lsm_blob_sizes.lbs_inode; -} - -extern const struct bpf_func_proto bpf_inode_storage_get_proto; -extern const struct bpf_func_proto bpf_inode_storage_delete_proto; -void bpf_inode_storage_free(struct inode *inode); - void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func); int bpf_lsm_get_retval_range(const struct bpf_prog *prog, @@ -66,16 +47,6 @@ static inline int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog, return -EOPNOTSUPP; } -static inline struct bpf_storage_blob *bpf_inode( - const struct inode *inode) -{ - return NULL; -} - -static inline void bpf_inode_storage_free(struct inode *inode) -{ -} - static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 3559446279c1..479097e4dd5b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -79,6 +79,7 @@ struct fs_context; struct fs_parameter_spec; struct fileattr; struct iomap_ops; +struct bpf_local_storage; extern 
void __init inode_init(void); extern void __init inode_init_early(void); @@ -648,6 +649,9 @@ struct inode { #ifdef CONFIG_SECURITY void *i_security; #endif +#ifdef CONFIG_BPF_SYSCALL + struct bpf_local_storage __rcu *i_bpf_storage; +#endif /* Stat data, not accessed from path walking */ unsigned long i_ino; diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 9b9c151b5c82..a5b7136b4884 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -10,8 +10,7 @@ obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o -obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o -obj-${CONFIG_BPF_LSM} += bpf_inode_storage.o +obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o bpf_inode_storage.o obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o obj-$(CONFIG_BPF_JIT) += trampoline.o obj-$(CONFIG_BPF_SYSCALL) += btf.o memalloc.o diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c index 29da6d3838f6..79bf7ebb329c 100644 --- a/kernel/bpf/bpf_inode_storage.c +++ b/kernel/bpf/bpf_inode_storage.c @@ -21,16 +21,36 @@ DEFINE_BPF_STORAGE_CACHE(inode_cache); -static struct bpf_local_storage __rcu ** -inode_storage_ptr(void *owner) +static DEFINE_PER_CPU(int, bpf_inode_storage_busy); + +static void bpf_inode_storage_lock(void) +{ + migrate_disable(); + this_cpu_inc(bpf_inode_storage_busy); +} + +static void bpf_inode_storage_unlock(void) +{ + this_cpu_dec(bpf_inode_storage_busy); + migrate_enable(); +} + +static bool bpf_inode_storage_trylock(void) +{ + migrate_disable(); + if (unlikely(this_cpu_inc_return(bpf_inode_storage_busy) != 1)) { + this_cpu_dec(bpf_inode_storage_busy); + migrate_enable(); + return false; + } + return 
true; +} + +static struct bpf_local_storage __rcu **inode_storage_ptr(void *owner) { struct inode *inode = owner; - struct bpf_storage_blob *bsb; - bsb = bpf_inode(inode); - if (!bsb) - return NULL; - return &bsb->storage; + return &inode->i_bpf_storage; } static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode, @@ -39,14 +59,9 @@ static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode, { struct bpf_local_storage *inode_storage; struct bpf_local_storage_map *smap; - struct bpf_storage_blob *bsb; - - bsb = bpf_inode(inode); - if (!bsb) - return NULL; inode_storage = - rcu_dereference_check(bsb->storage, bpf_rcu_lock_held()); + rcu_dereference_check(inode->i_bpf_storage, bpf_rcu_lock_held()); if (!inode_storage) return NULL; @@ -57,21 +72,18 @@ static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode, void bpf_inode_storage_free(struct inode *inode) { struct bpf_local_storage *local_storage; - struct bpf_storage_blob *bsb; - - bsb = bpf_inode(inode); - if (!bsb) - return; rcu_read_lock(); - local_storage = rcu_dereference(bsb->storage); + local_storage = rcu_dereference(inode->i_bpf_storage); if (!local_storage) { rcu_read_unlock(); return; } + bpf_inode_storage_lock(); bpf_local_storage_destroy(local_storage); + bpf_inode_storage_unlock(); rcu_read_unlock(); } @@ -83,7 +95,9 @@ static void *bpf_fd_inode_storage_lookup_elem(struct bpf_map *map, void *key) if (fd_empty(f)) return ERR_PTR(-EBADF); + bpf_inode_storage_lock(); sdata = inode_storage_lookup(file_inode(fd_file(f)), map, true); + bpf_inode_storage_unlock(); return sdata ? 
sdata->data : NULL; } @@ -98,13 +112,16 @@ static long bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key, if (!inode_storage_ptr(file_inode(fd_file(f)))) return -EBADF; + bpf_inode_storage_lock(); sdata = bpf_local_storage_update(file_inode(fd_file(f)), (struct bpf_local_storage_map *)map, value, map_flags, GFP_ATOMIC); + bpf_inode_storage_unlock(); return PTR_ERR_OR_ZERO(sdata); } -static int inode_storage_delete(struct inode *inode, struct bpf_map *map) +static int inode_storage_delete(struct inode *inode, struct bpf_map *map, + bool nobusy) { struct bpf_local_storage_data *sdata; @@ -112,6 +129,9 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map) if (!sdata) return -ENOENT; + if (!nobusy) + return -EBUSY; + bpf_selem_unlink(SELEM(sdata), false); return 0; @@ -119,60 +139,114 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map) static long bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key) { + int err; + CLASS(fd_raw, f)(*(int *)key); if (fd_empty(f)) return -EBADF; - return inode_storage_delete(file_inode(fd_file(f)), map); + bpf_inode_storage_lock(); + err = inode_storage_delete(file_inode(fd_file(f)), map, true); + bpf_inode_storage_unlock(); + return err; } -/* *gfp_flags* is a hidden argument provided by the verifier */ -BPF_CALL_5(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode, - void *, value, u64, flags, gfp_t, gfp_flags) +static void *__bpf_inode_storage_get(struct bpf_map *map, struct inode *inode, + void *value, u64 flags, gfp_t gfp_flags, bool nobusy) { struct bpf_local_storage_data *sdata; - WARN_ON_ONCE(!bpf_rcu_lock_held()); - if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) - return (unsigned long)NULL; - /* explicitly check that the inode_storage_ptr is not * NULL as inode_storage_lookup returns NULL in this case and * bpf_local_storage_update expects the owner to have a * valid storage pointer. 
*/ if (!inode || !inode_storage_ptr(inode)) - return (unsigned long)NULL; + return NULL; sdata = inode_storage_lookup(inode, map, true); if (sdata) - return (unsigned long)sdata->data; + return sdata->data; - /* This helper must only called from where the inode is guaranteed - * to have a refcount and cannot be freed. - */ - if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) { + /* only allocate new storage, when the inode is refcounted */ + if (atomic_read(&inode->i_count) && + flags & BPF_LOCAL_STORAGE_GET_F_CREATE) { sdata = bpf_local_storage_update( inode, (struct bpf_local_storage_map *)map, value, BPF_NOEXIST, gfp_flags); - return IS_ERR(sdata) ? (unsigned long)NULL : - (unsigned long)sdata->data; + return IS_ERR(sdata) ? NULL : sdata->data; } - return (unsigned long)NULL; + return NULL; } -BPF_CALL_2(bpf_inode_storage_delete, - struct bpf_map *, map, struct inode *, inode) +/* *gfp_flags* is a hidden argument provided by the verifier */ +BPF_CALL_5(bpf_inode_storage_get_recur, struct bpf_map *, map, struct inode *, inode, + void *, value, u64, flags, gfp_t, gfp_flags) { + bool nobusy; + void *data; + + WARN_ON_ONCE(!bpf_rcu_lock_held()); + if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) + return (unsigned long)NULL; + + nobusy = bpf_inode_storage_trylock(); + data = __bpf_inode_storage_get(map, inode, value, flags, gfp_flags, nobusy); + if (nobusy) + bpf_inode_storage_unlock(); + return (unsigned long)data; +} + +/* *gfp_flags* is a hidden argument provided by the verifier */ +BPF_CALL_5(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode, + void *, value, u64, flags, gfp_t, gfp_flags) +{ + void *data; + + WARN_ON_ONCE(!bpf_rcu_lock_held()); + if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) + return (unsigned long)NULL; + + bpf_inode_storage_lock(); + data = __bpf_inode_storage_get(map, inode, value, flags, gfp_flags, true); + bpf_inode_storage_unlock(); + return (unsigned long)data; +} + +BPF_CALL_2(bpf_inode_storage_delete_recur, struct bpf_map *, 
map, struct inode *, inode) +{ + bool nobusy; + int ret; + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (!inode) return -EINVAL; + nobusy = bpf_inode_storage_trylock(); /* This helper must only called from where the inode is guaranteed * to have a refcount and cannot be freed. */ - return inode_storage_delete(inode, map); + ret = inode_storage_delete(inode, map, nobusy); + bpf_inode_storage_unlock(); + return ret; +} + +BPF_CALL_2(bpf_inode_storage_delete, struct bpf_map *, map, struct inode *, inode) +{ + int ret; + + WARN_ON_ONCE(!bpf_rcu_lock_held()); + if (!inode) + return -EINVAL; + + bpf_inode_storage_lock(); + /* This helper must only called from where the inode is guaranteed + * to have a refcount and cannot be freed. + */ + ret = inode_storage_delete(inode, map, true); + bpf_inode_storage_unlock(); + return ret; } static int notsupp_get_next_key(struct bpf_map *map, void *key, @@ -208,6 +282,17 @@ const struct bpf_map_ops inode_storage_map_ops = { BTF_ID_LIST_SINGLE(bpf_inode_storage_btf_ids, struct, inode) +const struct bpf_func_proto bpf_inode_storage_get_recur_proto = { + .func = bpf_inode_storage_get_recur, + .gpl_only = false, + .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL, + .arg1_type = ARG_CONST_MAP_PTR, + .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL, + .arg2_btf_id = &bpf_inode_storage_btf_ids[0], + .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL, + .arg4_type = ARG_ANYTHING, +}; + const struct bpf_func_proto bpf_inode_storage_get_proto = { .func = bpf_inode_storage_get, .gpl_only = false, @@ -219,6 +304,15 @@ const struct bpf_func_proto bpf_inode_storage_get_proto = { .arg4_type = ARG_ANYTHING, }; +const struct bpf_func_proto bpf_inode_storage_delete_recur_proto = { + .func = bpf_inode_storage_delete_recur, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_CONST_MAP_PTR, + .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL, + .arg2_btf_id = &bpf_inode_storage_btf_ids[0], +}; + const struct bpf_func_proto bpf_inode_storage_delete_proto = { .func = 
bpf_inode_storage_delete, .gpl_only = false, diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index 6292ac5f9bd1..51e2de17325a 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -231,10 +231,6 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) } switch (func_id) { - case BPF_FUNC_inode_storage_get: - return &bpf_inode_storage_get_proto; - case BPF_FUNC_inode_storage_delete: - return &bpf_inode_storage_delete_proto; #ifdef CONFIG_NET case BPF_FUNC_sk_storage_get: return &bpf_sk_storage_get_proto; diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index a582cd25ca87..3ec39e6704e2 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1529,6 +1529,14 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) if (bpf_prog_check_recur(prog)) return &bpf_task_storage_delete_recur_proto; return &bpf_task_storage_delete_proto; + case BPF_FUNC_inode_storage_get: + if (bpf_prog_check_recur(prog)) + return &bpf_inode_storage_get_recur_proto; + return &bpf_inode_storage_get_proto; + case BPF_FUNC_inode_storage_delete: + if (bpf_prog_check_recur(prog)) + return &bpf_inode_storage_delete_recur_proto; + return &bpf_inode_storage_delete_proto; case BPF_FUNC_for_each_map_elem: return &bpf_for_each_map_elem_proto; case BPF_FUNC_snprintf: diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c index 3663aec7bcbd..625e0cc7027a 100644 --- a/security/bpf/hooks.c +++ b/security/bpf/hooks.c @@ -29,12 +29,7 @@ static int __init bpf_lsm_init(void) return 0; } -struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = { - .lbs_inode = sizeof(struct bpf_storage_blob), -}; - DEFINE_LSM(bpf) = { .name = "bpf", .init = bpf_lsm_init, - .blobs = &bpf_lsm_blob_sizes }; From patchwork Tue Oct 29 23:12:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13855642 Received: from smtp.kernel.org 
(aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC1A920B1EF; Tue, 29 Oct 2024 23:13:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243608; cv=none; b=fNPGryrOe5/lma2uGJOc/qYYBrjhHj0Z0h5BTkfK5m0ISe48V1/yp4iqwIJ4Sj2JKbVAiGbm8puIsoTsaBOv6/AdgLrJz4Mp0k8y0EWRLaru/IcrXNvLe6PMS4gkknLH6rwkd8PO27R3yqmJV0Myn6ObKOGjPjdMRU1KY8nwehU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243608; c=relaxed/simple; bh=PnPQ4TtK0Ou43FdXx0o3MrwJnIIGcK5+VZMftzGIYHc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bj9iV7vy3NscM7Xs7cCHZvl4OrvZXxkyD2UjpqWuWhb+/8Y3j8aC90WjwettxFdxFZ0W18A+ULxcKlOhpUn37Sd7d7HaVeFsHc4mchpLX+wA/ihYeIW+js9yUTvOx+g3OdPiQprD9GzY8aui0plv1FVqLJn0jlJaEPMwR2Dvin8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=F6twaFWo; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="F6twaFWo" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 67325C4CEE6; Tue, 29 Oct 2024 23:13:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730243607; bh=PnPQ4TtK0Ou43FdXx0o3MrwJnIIGcK5+VZMftzGIYHc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=F6twaFWoteunqCRxuHIGwiRC+31y/dcKvfOQEnLZXhCfFY0cvj9QqIYOY3ODx3G7U C+vmtwbFMYagKnyBZ7p0b6gTvR4/TqdGGTWpoUiDNhXBY6XcqLsHLDu9/HQJoB+wOE mchtwHW/HbZaLWAXO0eel8ZS1aJAb7us32qzfE2LQSDBWVB75NPydMsgDQ3s2K6Re3 XbyjaYMW0reVUtZ3WKFy/t1dt4CMT0KR80nVT/HAA6tfXtnXxImqut6RpkW9LEDxTf w8qIHQrrQb/raShSZetQmJqXMbTtjDot2NQDL9Ug5I7+K4kzeulCiD7QebwObIIekU 
5RM3uPODA9zUQ== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, Song Liu Subject: [RFC bpf-next fanotify 4/5] fanotify: Enable bpf based fanotify fastpath handler Date: Tue, 29 Oct 2024 16:12:43 -0700 Message-ID: <20241029231244.2834368-5-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241029231244.2834368-1-song@kernel.org> References: <20241029231244.2834368-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Allow users to write fanotify fastpath handlers as bpf programs. Major changes: 1. Make kfuncs in fs/bpf_fs_kfuncs.c available to STRUCT_OPS programs; 2. Add kfunc bpf_iput; 3. Add kfunc bpf_fanotify_data_inode; 4. Add struct_ops bpf_fanotify_fastpath_ops. TODO: 1. Maybe split this into multiple patches. 2. With the current logic, a bpf based fastpath handler is added to the global list and is thus available to all users. This is similar to bpf based tcp congestion control algorithms. It is possible to add an API so that the bpf based handler is not added to the global list, which is similar to hid-bpf. I plan to add that API later.
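The two new kfuncs are meant to be used as a pair: bpf_fanotify_data_inode is registered with KF_ACQUIRE | KF_RET_NULL and bpf_iput with KF_RELEASE, so a handler that gets a non-NULL inode has to drop it again before returning, and the verifier is expected to reject programs that leak the reference. The sketch below is a plain userspace model of that pairing, not BPF code; struct model_inode, model_igrab(), model_iput() and model_handler() are hypothetical names used only for illustration.

```c
/* Userspace model of the acquire/release pairing. Illustration only:
 * struct model_inode, model_igrab() and model_iput() are hypothetical
 * stand-ins, not the kernel functions.
 */
#include <stddef.h>

struct model_inode {
	int refcount;
};

/* Like bpf_fanotify_data_inode(): returns NULL or a referenced object. */
static struct model_inode *model_igrab(struct model_inode *inode)
{
	if (!inode)
		return NULL;
	inode->refcount++;
	return inode;
}

/* Like bpf_iput(): drops the reference taken by model_igrab(). */
static void model_iput(struct model_inode *inode)
{
	inode->refcount--;
}

/* A handler body that balances its references.
 * Returns the refcount delta it leaves behind (0 when balanced).
 */
static int model_handler(struct model_inode *event_inode)
{
	int before = event_inode ? event_inode->refcount : 0;
	struct model_inode *inode = model_igrab(event_inode);

	if (!inode)
		return 0;	/* nothing acquired, nothing to release */

	/* ... inspect the inode here ... */

	model_iput(inode);
	return inode->refcount - before;
}
```

In the model, the NULL check comes before any use of the pointer, which mirrors what KF_RET_NULL requires of a real handler.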
Signed-off-by: Song Liu --- fs/Makefile | 2 +- fs/bpf_fs_kfuncs.c | 23 +++- fs/notify/fanotify/fanotify_fastpath.c | 153 ++++++++++++++++++++++++- kernel/bpf/verifier.c | 5 + 4 files changed, 177 insertions(+), 6 deletions(-) diff --git a/fs/Makefile b/fs/Makefile index 61679fd587b7..1043d999262d 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -129,4 +129,4 @@ obj-$(CONFIG_EFIVAR_FS) += efivarfs/ obj-$(CONFIG_EROFS_FS) += erofs/ obj-$(CONFIG_VBOXSF_FS) += vboxsf/ obj-$(CONFIG_ZONEFS_FS) += zonefs/ -obj-$(CONFIG_BPF_LSM) += bpf_fs_kfuncs.o +obj-$(CONFIG_BPF_SYSCALL) += bpf_fs_kfuncs.o diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c index 3fe9f59ef867..8110276faff9 100644 --- a/fs/bpf_fs_kfuncs.c +++ b/fs/bpf_fs_kfuncs.c @@ -152,6 +152,18 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str, return bpf_get_dentry_xattr(dentry, name__str, value_p); } +/** + * bpf_iput - Drop a reference on the inode + * + * @inode: inode to drop reference. + * + * Drop a refcount on inode. 
+ */ +__bpf_kfunc void bpf_iput(struct inode *inode) +{ + iput(inode); +} + __bpf_kfunc_end_defs(); BTF_KFUNCS_START(bpf_fs_kfunc_set_ids) @@ -161,12 +173,14 @@ BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE) BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_iput, KF_RELEASE) BTF_KFUNCS_END(bpf_fs_kfunc_set_ids) static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id) { if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) || - prog->type == BPF_PROG_TYPE_LSM) + prog->type == BPF_PROG_TYPE_LSM || + prog->type == BPF_PROG_TYPE_STRUCT_OPS) return 0; return -EACCES; } @@ -179,7 +193,12 @@ static const struct btf_kfunc_id_set bpf_fs_kfunc_set = { static int __init bpf_fs_kfuncs_init(void) { - return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set); + int ret; + + ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set); + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_fs_kfunc_set); + + return ret; } late_initcall(bpf_fs_kfuncs_init); diff --git a/fs/notify/fanotify/fanotify_fastpath.c b/fs/notify/fanotify/fanotify_fastpath.c index 0453a1ac25b1..4781270e7b6a 100644 --- a/fs/notify/fanotify/fanotify_fastpath.c +++ b/fs/notify/fanotify/fanotify_fastpath.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include #include +#include #include "fanotify.h" @@ -107,7 +108,7 @@ int fanotify_fastpath_add(struct fsnotify_group *group, spin_lock(&fp_list_lock); fp_ops = fanotify_fastpath_find(args.name); - if (!fp_ops || !try_module_get(fp_ops->owner)) { + if (!fp_ops || !bpf_try_module_get(fp_ops, fp_ops->owner)) { spin_unlock(&fp_list_lock); ret = -ENOENT; goto err_free_hook; @@ -140,7 +141,7 @@ int fanotify_fastpath_add(struct fsnotify_group *group, return ret; err_module_put: - module_put(fp_ops->owner); + bpf_module_put(fp_ops, 
fp_ops->owner); err_free_hook: kfree(fp_hook); goto out; @@ -151,7 +152,7 @@ void fanotify_fastpath_hook_free(struct fanotify_fastpath_hook *fp_hook) if (fp_hook->ops->fp_free) fp_hook->ops->fp_free(fp_hook); - module_put(fp_hook->ops->owner); + bpf_module_put(fp_hook->ops, fp_hook->ops->owner); } void fanotify_fastpath_del(struct fsnotify_group *group) @@ -169,3 +170,149 @@ void fanotify_fastpath_del(struct fsnotify_group *group) out: fsnotify_group_unlock(group); } + +__bpf_kfunc_start_defs(); + +/** + * bpf_fanotify_data_inode - get inode from fanotify_fastpath_event + * + * @event: fanotify_fastpath_event to get inode from + * + * Get referenced inode from fanotify_fastpath_event. + * + * Return: A refcounted inode or NULL. + * + */ +__bpf_kfunc struct inode *bpf_fanotify_data_inode(struct fanotify_fastpath_event *event) +{ + struct inode *inode = fsnotify_data_inode(event->data, event->data_type); + + return inode ? igrab(inode) : NULL; +} + +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(bpf_fanotify_kfunc_set_ids) +BTF_ID_FLAGS(func, bpf_fanotify_data_inode, + KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL) +BTF_KFUNCS_END(bpf_fanotify_kfunc_set_ids) + +static const struct btf_kfunc_id_set bpf_fanotify_kfunc_set = { + .owner = THIS_MODULE, + .set = &bpf_fanotify_kfunc_set_ids, +}; + +static const struct bpf_func_proto * +bpf_fanotify_fastpath_get_func_proto(enum bpf_func_id func_id, + const struct bpf_prog *prog) +{ + return tracing_prog_func_proto(func_id, prog); +} + +static bool bpf_fanotify_fastpath_is_valid_access(int off, int size, + enum bpf_access_type type, + const struct bpf_prog *prog, + struct bpf_insn_access_aux *info) +{ + if (!bpf_tracing_btf_ctx_access(off, size, type, prog, info)) + return false; + + return true; +} + +static int bpf_fanotify_fastpath_btf_struct_access(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size) +{ + return 0; +} + +static const struct bpf_verifier_ops bpf_fanotify_fastpath_verifier_ops = { 
+ .get_func_proto = bpf_fanotify_fastpath_get_func_proto, + .is_valid_access = bpf_fanotify_fastpath_is_valid_access, + .btf_struct_access = bpf_fanotify_fastpath_btf_struct_access, +}; + +static int bpf_fanotify_fastpath_reg(void *kdata, struct bpf_link *link) +{ + return fanotify_fastpath_register(kdata); +} + +static void bpf_fanotify_fastpath_unreg(void *kdata, struct bpf_link *link) +{ + fanotify_fastpath_unregister(kdata); +} + +static int bpf_fanotify_fastpath_init(struct btf *btf) +{ + return 0; +} + +static int bpf_fanotify_fastpath_init_member(const struct btf_type *t, + const struct btf_member *member, + void *kdata, const void *udata) +{ + const struct fanotify_fastpath_ops *uops; + struct fanotify_fastpath_ops *ops; + u32 moff; + int ret; + + uops = (const struct fanotify_fastpath_ops *)udata; + ops = (struct fanotify_fastpath_ops *)kdata; + + moff = __btf_member_bit_offset(t, member) / 8; + switch (moff) { + case offsetof(struct fanotify_fastpath_ops, name): + ret = bpf_obj_name_cpy(ops->name, uops->name, + sizeof(ops->name)); + if (ret <= 0) + return -EINVAL; + return 1; + } + + return 0; +} + +static int __bpf_fan_fp_handler(struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event) +{ + return 0; +} + +static int __bpf_fan_fp_init(struct fanotify_fastpath_hook *hook, const char *args) +{ + return 0; +} + +static void __bpf_fan_fp_free(struct fanotify_fastpath_hook *hook) +{ +} + +/* For bpf_struct_ops->cfi_stubs */ +static struct fanotify_fastpath_ops __bpf_fanotify_fastpath_ops = { + .fp_handler = __bpf_fan_fp_handler, + .fp_init = __bpf_fan_fp_init, + .fp_free = __bpf_fan_fp_free, +}; + +static struct bpf_struct_ops bpf_fanotify_fastpath_ops = { + .verifier_ops = &bpf_fanotify_fastpath_verifier_ops, + .reg = bpf_fanotify_fastpath_reg, + .unreg = bpf_fanotify_fastpath_unreg, + .init = bpf_fanotify_fastpath_init, + .init_member = bpf_fanotify_fastpath_init_member, + .name = 
"fanotify_fastpath_ops", + .cfi_stubs = &__bpf_fanotify_fastpath_ops, + .owner = THIS_MODULE, +}; + +static int __init bpf_fanotify_fastpath_struct_ops_init(void) +{ + int ret; + + ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_fanotify_kfunc_set); + ret = ret ?: register_bpf_struct_ops(&bpf_fanotify_fastpath_ops, fanotify_fastpath_ops); + return ret; +} +late_initcall(bpf_fanotify_fastpath_struct_ops_init); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9a7ed527e47e..cbca27d24ae5 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -6528,6 +6528,10 @@ BTF_TYPE_SAFE_TRUSTED(struct dentry) { struct inode *d_inode; }; +BTF_TYPE_SAFE_TRUSTED(struct fanotify_fastpath_event) { + struct inode *dir; +}; + BTF_TYPE_SAFE_TRUSTED_OR_NULL(struct socket) { struct sock *sk; }; @@ -6563,6 +6567,7 @@ static bool type_is_trusted(struct bpf_verifier_env *env, BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct linux_binprm)); BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct file)); BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct dentry)); + BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct fanotify_fastpath_event)); return btf_nested_type_is_trusted(&env->log, reg, field_name, btf_id, "__safe_trusted"); } From patchwork Tue Oct 29 23:12:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13855643 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3380320BB41; Tue, 29 Oct 2024 23:13:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243616; cv=none; 
b=Xq9LBhTxLoyiWRUXPLE9HGlBILi2dGWtYiR9FuDFGnMP2Cqma8iLZsXg0ryJif6DhKjAMTKj7KY/twBe7QfUvxi1B1u4u0SD0Z3gb3hfzvDAlS6LBromLYVM1BlBzsGCkfoYcVE6LMA0U+U6U+f/wieEO+4TpwCX8/MST7Tr/TE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730243616; c=relaxed/simple; bh=Dm2s2fcTkM92SSMD+7Lh75bKhbJpwc+561tVJlj3PmQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=utJsND+Ebi6+WC04rVmCl4rh04qqe+kC0hycoNJVuWu+EDOBIfp5n7f2uUfkZKwFzbQJGB55vY5FVuiUzn1sQ7bgPK0Kp8eGJmvPIk+fb4715xdg1uj3NMOzzcCN12eGhcbkH869rKWHpVhbxSTGukbBww8oB+YNFgX4SxfPRMo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Y5nMYoz+; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Y5nMYoz+" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5CA87C4CEE3; Tue, 29 Oct 2024 23:13:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730243615; bh=Dm2s2fcTkM92SSMD+7Lh75bKhbJpwc+561tVJlj3PmQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Y5nMYoz+gETFkkdLvPuISJZxuGfbwh/bmg2t6I1tUY/mSFb5h2uJHJ6Z2hJ1RTL9E pwXlSy6BiydxduzLn0wBCLeRNZuhaSnhuFhnWqwFScluV3f6OCf3oBMh0LsZWQrcUD +Ot3XM/HCeiKJW1E7i+qJkWTCV9BnJ4pmNgRIkkbhZx1KZGUysy6KVlxrG6NO1ZP3c TaXqv1j4UHMTiHpehTTiHUqr+4kiZl93Zp0myyZNDWJJGBeTkDvRykF0rZvMaAkI7S c4mN6oeOIX19qpbgKYv0k6SHI8LP1L8zuo//kPbnmgl11ZrhcsD9iHOjBpi+BpmqaW QCT9M3QUSgl0w== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, Song Liu Subject: [RFC 
bpf-next fanotify 5/5] selftests/bpf: Add test for BPF based fanotify fastpath handler Date: Tue, 29 Oct 2024 16:12:44 -0700 Message-ID: <20241029231244.2834368-6-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241029231244.2834368-1-song@kernel.org> References: <20241029231244.2834368-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This test shows simplified logic that monitors a subtree. It is simplified because it does not handle all scenarios, such as: 1) moving a sub-subtree into or out of the monitored subtree; 2) a mount point inside the monitored subtree. Therefore, this test does not show a way to reliably monitor a subtree. Instead, it tests the functionality of the bpf based fastpath. Reliably monitoring a subtree requires more complex logic. Overview of the logic: 1. A fanotify mark is created for the whole file system (/tmp); 2. A bpf map (inode_storage_map) is used to tag directories to monitor (starting from /tmp/fanotify_test); 3. On fsnotify_mkdir, the tag is propagated to newly created subdirectories (/tmp/fanotify_test/subdir); 4. The bpf fastpath checks whether the fanotify event happens in a directory with the tag. If yes, the event is sent to user space; otherwise, the event is dropped.
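The four steps above can be modeled in plain userspace C. This is only a sketch of the filtering idea, not the BPF program from this series: struct dir_node, tag_dir(), model_mkdir(), should_deliver() and model_delivered_mask() are hypothetical names, and a boolean field stands in for the inode storage entry.

```c
/* Userspace model of the tagging logic. Illustration only: none of
 * these names are kernel or libbpf APIs.
 */
#include <stdbool.h>
#include <stddef.h>

struct dir_node {
	const struct dir_node *parent;	/* NULL for the file system root */
	bool tagged;			/* stands in for an inode storage entry */
};

/* Step 2: user space tags the root of the subtree to monitor. */
static void tag_dir(struct dir_node *dir)
{
	dir->tagged = true;
}

/* Step 3: a new directory inherits the tag of its parent, if any. */
static void model_mkdir(struct dir_node *child, const struct dir_node *parent)
{
	child->parent = parent;
	child->tagged = parent && parent->tagged;
}

/* Step 4: deliver the event only if it happened in a tagged directory. */
static bool should_deliver(const struct dir_node *event_dir)
{
	return event_dir && event_dir->tagged;
}

/* Mirrors the layout used by the test: /tmp, /tmp/fanotify_test and
 * /tmp/fanotify_test/subdir. Returns a bitmask of delivered events:
 * bit 0 = aa, bit 1 = subdir/bb, bit 2 = /tmp/cc.
 */
static unsigned int model_delivered_mask(void)
{
	struct dir_node tmp = { NULL, false };
	struct dir_node test_dir, subdir;
	unsigned int mask = 0;

	model_mkdir(&test_dir, &tmp);		/* untagged so far */
	tag_dir(&test_dir);			/* step 2 */
	model_mkdir(&subdir, &test_dir);	/* inherits the tag */

	if (should_deliver(&test_dir))
		mask |= 1u << 0;
	if (should_deliver(&subdir))
		mask |= 1u << 1;
	if (should_deliver(&tmp))
		mask |= 1u << 2;
	return mask;
}
```

Calling model_delivered_mask() from a main() returns 3 (bits 0 and 1 set, bit 2 clear), which matches the test's expectation that the events for aa and bb reach user space while the event for cc is filtered out. Like the test itself, the model ignores renames and mount points.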
Signed-off-by: Song Liu --- tools/testing/selftests/bpf/bpf_kfuncs.h | 4 + tools/testing/selftests/bpf/config | 1 + .../testing/selftests/bpf/prog_tests/fan_fp.c | 245 ++++++++++++++++++ tools/testing/selftests/bpf/progs/fan_fp.c | 77 ++++++ 4 files changed, 327 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/fan_fp.c create mode 100644 tools/testing/selftests/bpf/progs/fan_fp.c diff --git a/tools/testing/selftests/bpf/bpf_kfuncs.h b/tools/testing/selftests/bpf/bpf_kfuncs.h index 2eb3483f2fb0..44dcf4991244 100644 --- a/tools/testing/selftests/bpf/bpf_kfuncs.h +++ b/tools/testing/selftests/bpf/bpf_kfuncs.h @@ -87,4 +87,8 @@ struct dentry; */ extern int bpf_get_dentry_xattr(struct dentry *dentry, const char *name, struct bpf_dynptr *value_ptr) __ksym __weak; + +struct fanotify_fastpath_event; +extern struct inode *bpf_fanotify_data_inode(struct fanotify_fastpath_event *event) __ksym __weak; +extern void bpf_iput(struct inode *inode) __ksym __weak; #endif diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index 4ca84c8d9116..392cbcad8a8b 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -24,6 +24,7 @@ CONFIG_DEBUG_INFO_BTF=y CONFIG_DEBUG_INFO_DWARF4=y CONFIG_DUMMY=y CONFIG_DYNAMIC_FTRACE=y +CONFIG_FANOTIFY=y CONFIG_FPROBE=y CONFIG_FTRACE_SYSCALLS=y CONFIG_FUNCTION_ERROR_INJECTION=y diff --git a/tools/testing/selftests/bpf/prog_tests/fan_fp.c b/tools/testing/selftests/bpf/prog_tests/fan_fp.c new file mode 100644 index 000000000000..ea57ed366647 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/fan_fp.c @@ -0,0 +1,245 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. 
*/ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "fan_fp.skel.h" + +#define TEST_FS "/tmp/" +#define TEST_DIR "/tmp/fanotify_test/" + +static int create_test_subtree(void) +{ + int err; + + err = mkdir(TEST_DIR, 0777); + if (err && errno != EEXIST) + return err; + + return open(TEST_DIR, O_RDONLY); +} + +static int create_fanotify_fd(void) +{ + int fanotify_fd, err; + + fanotify_fd = fanotify_init(FAN_CLASS_NOTIF | FAN_REPORT_NAME | FAN_REPORT_DIR_FID, + O_RDONLY); + + if (!ASSERT_OK_FD(fanotify_fd, "fanotify_init")) + return -1; + + err = fanotify_mark(fanotify_fd, FAN_MARK_ADD | FAN_MARK_FILESYSTEM, + FAN_CREATE | FAN_OPEN | FAN_ONDIR | FAN_EVENT_ON_CHILD, + AT_FDCWD, TEST_FS); + if (!ASSERT_OK(err, "fanotify_mark")) { + close(fanotify_fd); + return -1; + } + + return fanotify_fd; +} + +static int attach_global_fastpath(int fanotify_fd) +{ + struct fanotify_fastpath_args args = { + .name = "_tmp_test_sub_tree", + .version = 1, + .flags = 0, + }; + + if (ioctl(fanotify_fd, FAN_IOC_ADD_FP, &args)) + return -1; + + return 0; +} + +#define EVENT_BUFFER_SIZE 4096 +struct file_access_result { + char name_prefix[16]; + bool accessed; +} access_results[3] = { + {"aa", false}, + {"bb", false}, + {"cc", false}, +}; + +static void update_access_results(char *name) +{ + int i; + + for (i = 0; i < 3; i++) { + if (strstr(name, access_results[i].name_prefix)) + access_results[i].accessed = true; + } +} + +static void parse_event(char *buffer, int len) +{ + struct fanotify_event_metadata *event = + (struct fanotify_event_metadata *) buffer; + struct fanotify_event_info_header *info; + struct fanotify_event_info_fid *fid; + struct file_handle *handle; + char *name; + int off; + + for (; FAN_EVENT_OK(event, len); event = FAN_EVENT_NEXT(event, len)) { + for (off = sizeof(*event) ; off < event->event_len; + off += info->len) { + info = (struct fanotify_event_info_header *) + ((char *) event + off); + 
switch (info->info_type) { + case FAN_EVENT_INFO_TYPE_DFID_NAME: + fid = (struct fanotify_event_info_fid *) info; + handle = (struct file_handle *)&fid->handle; + name = (char *)handle + sizeof(*handle) + handle->handle_bytes; + update_access_results(name); + break; + default: + break; + } + } + } +} + +static void touch_file(const char *path) +{ + int fd; + + fd = open(path, O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666); + if (!ASSERT_OK_FD(fd, "open")) + goto cleanup; + close(fd); +cleanup: + unlink(path); +} + +static void generate_and_test_event(int fanotify_fd) +{ + char buffer[EVENT_BUFFER_SIZE]; + int len, err; + + /* access /tmp/fanotify_test/aa, this will generate event */ + touch_file(TEST_DIR "aa"); + + /* create /tmp/fanotify_test/subdir, this will get tag from the + * parent directory (added in the bpf program on fsnotify_mkdir) + */ + err = mkdir(TEST_DIR "subdir", 0777); + ASSERT_OK(err, "mkdir"); + + /* access /tmp/fanotify_test/subdir/bb, this will generate event */ + touch_file(TEST_DIR "subdir/bb"); + + /* access /tmp/cc, this will NOT generate event, as the BPF + * fastpath filtered this event out. (Because /tmp doesn't have + * the tag.) + */ + touch_file(TEST_FS "cc"); + + /* read and parse the events */ + len = read(fanotify_fd, buffer, EVENT_BUFFER_SIZE); + if (!ASSERT_GE(len, 0, "read event")) + goto cleanup; + parse_event(buffer, len); + + /* verify we generated events for aa and bb, but filtered out the + * event for cc. + */ + ASSERT_TRUE(access_results[0].accessed, "access aa"); + ASSERT_TRUE(access_results[1].accessed, "access bb"); + ASSERT_FALSE(access_results[2].accessed, "access cc"); + +cleanup: + rmdir(TEST_DIR "subdir"); + rmdir(TEST_DIR); +} + +/* This test shows a simplified logic that monitors a subtree. 
This is + * simplified as it doesn't handle all the scenarios, such as: + * + * 1) moving a subsubtree into/out of the monitored subtree; + * 2) mount points inside the monitored subtree + * + * Therefore, this is not to show a way to reliably monitor a subtree. + * Instead, this is to test the functionalities of bpf based fastpath. + * + * Overview of the logic: + * 1. fanotify is created for the whole file system (/tmp); + * 2. A bpf map (inode_storage_map) is used to tag directories to + * monitor (starting from /tmp/fanotify_test); + * 3. On fsnotify_mkdir, the tag is propagated to newly created sub + * directories (/tmp/fanotify_test/subdir); + * 4. The bpf fastpath checks whether the event happens in a directory + * with the tag. If yes, the event is sent to user space; otherwise, + * the event is dropped. + */ +static void test_monitor_subtree(void) +{ + struct bpf_link *link; + struct fan_fp *skel; + int test_root_fd; + __u32 one = 1; + int err, fanotify_fd; + + test_root_fd = create_test_subtree(); + + if (!ASSERT_OK_FD(test_root_fd, "create_test_subtree")) + return; + + skel = fan_fp__open_and_load(); + + if (!ASSERT_OK_PTR(skel, "fan_fp__open_and_load")) + goto close_test_root_fd; + + /* Add tag to /tmp/fanotify_test/ */ + err = bpf_map_update_elem(bpf_map__fd(skel->maps.inode_storage_map), + &test_root_fd, &one, BPF_ANY); + if (!ASSERT_OK(err, "bpf_map_update_elem")) + goto destroy_skel; + link = bpf_map__attach_struct_ops(skel->maps.bpf_fanotify_fastpath_ops); + if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) + goto destroy_skel; + + + fanotify_fd = create_fanotify_fd(); + if (!ASSERT_OK_FD(fanotify_fd, "create_fanotify_fd")) + goto destroy_link; + + err = attach_global_fastpath(fanotify_fd); + if (!ASSERT_OK(err, "attach_global_fastpath")) + goto close_fanotify_fd; + + generate_and_test_event(fanotify_fd); + + ASSERT_EQ(skel->bss->added_inode_storage, 1, "added_inode_storage"); + +close_fanotify_fd: + close(fanotify_fd); +
+destroy_link: + bpf_link__destroy(link); +destroy_skel: + fan_fp__destroy(skel); + +close_test_root_fd: + close(test_root_fd); + rmdir(TEST_DIR); +} + +void test_bpf_fanotify_fastpath(void) +{ + if (test__start_subtest("subtree")) + test_monitor_subtree(); +} diff --git a/tools/testing/selftests/bpf/progs/fan_fp.c b/tools/testing/selftests/bpf/progs/fan_fp.c new file mode 100644 index 000000000000..ee86dc189e38 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/fan_fp.c @@ -0,0 +1,77 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Facebook */ + +#include "vmlinux.h" +#include +#include + +#define FS_CREATE 0x00000100 /* Subfile was created */ +#define FS_ISDIR 0x40000000 /* event occurred against dir */ + +struct { + __uint(type, BPF_MAP_TYPE_INODE_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, __u32); +} inode_storage_map SEC(".maps"); + +int added_inode_storage; + +SEC("struct_ops") +int BPF_PROG(bpf_fp_handler, + struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event) +{ + struct inode *dir; + __u32 *value; + + dir = fp_event->dir; + + value = bpf_inode_storage_get(&inode_storage_map, dir, 0, 0); + + /* if dir doesn't have the tag, skip the event */ + if (!value) + return FAN_FP_RET_SKIP_EVENT; + + /* propagate tag to subdir on fsnotify_mkdir */ + if (fp_event->mask == (FS_CREATE | FS_ISDIR) && + fp_event->data_type == FSNOTIFY_EVENT_DENTRY) { + struct inode *new_inode; + + new_inode = bpf_fanotify_data_inode(fp_event); + if (!new_inode) + goto out; + + value = bpf_inode_storage_get(&inode_storage_map, new_inode, 0, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (value) { + *value = 1; + added_inode_storage++; + } + bpf_iput(new_inode); + } +out: + return FAN_FP_RET_SEND_TO_USERSPACE; +} + +SEC("struct_ops") +int BPF_PROG(bpf_fp_init, struct fanotify_fastpath_hook *hook, const char *args) +{ + return 0; +} + +SEC("struct_ops") +void 
BPF_PROG(bpf_fp_free, struct fanotify_fastpath_hook *hook) +{ +} + +SEC(".struct_ops.link") +struct fanotify_fastpath_ops bpf_fanotify_fastpath_ops = { + .fp_handler = (void *)bpf_fp_handler, + .fp_init = (void *)bpf_fp_init, + .fp_free = (void *)bpf_fp_free, + .name = "_tmp_test_sub_tree", +}; + +char _license[] SEC("license") = "GPL";
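
For reviewers who want to reason about the filter without loading the BPF object, the decision logic of bpf_fp_handler can be modeled in plain user-space C. This is only a sketch under stated assumptions: every name below (model_state, model_handle, model_replay, the MODEL_* constants) is hypothetical and exists neither in the kernel nor in this series, directories are represented by small integer indices rather than inodes, and each file operation is reduced to a single event, which the real selftest does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the filter; not kernel or libbpf API. */
enum model_verdict { MODEL_SKIP_EVENT, MODEL_SEND_TO_USERSPACE };

#define MODEL_MAX_DIRS 8

struct model_state {
	bool tagged[MODEL_MAX_DIRS];	/* stands in for inode_storage_map */
	int added;			/* stands in for added_inode_storage */
};

static bool model_valid(int dir)
{
	return dir >= 0 && dir < MODEL_MAX_DIRS;
}

/*
 * dir:     index of the directory in which the event happened
 * new_dir: index of a directory created by this event, or -1 if the
 *          event is not a mkdir
 */
static enum model_verdict model_handle(struct model_state *st, int dir,
				       int new_dir)
{
	/* events in untagged directories are dropped */
	if (!model_valid(dir) || !st->tagged[dir])
		return MODEL_SKIP_EVENT;

	/* on mkdir, propagate the tag to the new subdirectory */
	if (model_valid(new_dir)) {
		st->tagged[new_dir] = true;
		st->added++;
	}
	return MODEL_SEND_TO_USERSPACE;
}

/*
 * Replays the four operations from generate_and_test_event() and
 * returns how many events the model would deliver to user space.
 */
static int model_replay(void)
{
	enum { TMP, TEST_ROOT, SUBDIR, NO_DIR = -1 };
	struct model_state st = { .tagged = { [TEST_ROOT] = true } };
	int delivered = 0;

	/* touch /tmp/fanotify_test/aa */
	delivered += model_handle(&st, TEST_ROOT, NO_DIR) == MODEL_SEND_TO_USERSPACE;
	/* mkdir /tmp/fanotify_test/subdir */
	delivered += model_handle(&st, TEST_ROOT, SUBDIR) == MODEL_SEND_TO_USERSPACE;
	/* touch /tmp/fanotify_test/subdir/bb */
	delivered += model_handle(&st, SUBDIR, NO_DIR) == MODEL_SEND_TO_USERSPACE;
	/* touch /tmp/cc */
	delivered += model_handle(&st, TMP, NO_DIR) == MODEL_SEND_TO_USERSPACE;

	assert(st.added == 1);
	return delivered;
}
```

In this model the aa, mkdir, and bb events are delivered and the cc event is dropped, which mirrors the three assertions on access_results and the added_inode_storage check in the selftest.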