From patchwork Thu Nov 14 08:43:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13874730 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 11354E573; Thu, 14 Nov 2024 08:44:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573850; cv=none; b=WVynjhWhbO4QNtBiUQxJONDklv2rhp+dntYpqlj3AHE/lb13fSRd4kiDs6x1LdD5zjXSNmC897ap441Lvergfik1f7XYtHY3KGYfWtMvei3eZuHFcVudR76MtsCK8F4361+TwwENn47dUqMO3ziYIFoi0J6RZkL/uhVZlBHrX8g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573850; c=relaxed/simple; bh=Xiiv/B9YnbIscwfZvAT3Alc/LK36lg2qnRQFOpJ0xZo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=T2DWtY4e6Qes/81UpdTCFAGiwlwuTbgSkTBKlFIW6Im6u9XG0sYd7AiWfvXnpWlvRzYR18EI4/LRkcWt1IhMyhKRMqeMd+kFdXSt2Z0k3olMT+i5glQXvl7UvEbfCIp9T1M6kAuMiyOvGZklNT+vRhBV+gDlJTS8J6uckGFMQko= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kPKuOCU1; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kPKuOCU1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 54A5DC4CECD; Thu, 14 Nov 2024 08:44:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1731573849; bh=Xiiv/B9YnbIscwfZvAT3Alc/LK36lg2qnRQFOpJ0xZo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kPKuOCU1mZAVyaK7a/67/nYrtHrWuJniOj1/4YlkZfISInFB+AmYC2QitB5LqzpXj 
VBU3u3W87kRH/zKwS1k6L4QuzCyhVSQsrT9Hcg8NpdzeuRVwU2xPLwcOdC3q0Bz97l q4P6vFuOAOwTmbrCb3DcTiSE5XXuBWD2kQLUYtjqBmMLNvUiF71Wxz01hAIclsFvv3 P7S9QOWIFgTKcdQxlsxaLhjd45HBSS5C9+Ptnn1bECFA8VUyTcLaDfn+uUSL9A2lVl y+Iitlid34HQazJAmlOV930GHZHYMbS9T/2upEIkXxvJl+6mYFiJdFwMVXMMbDoYQs P7ItEiSQQcgVw==
From: Song Liu
To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org
Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, mic@digikod.net, gnoack@google.com, Song Liu
Subject: [RFC/PATCH v2 bpf-next fanotify 1/7] fanotify: Introduce fanotify fastpath handler
Date: Thu, 14 Nov 2024 00:43:39 -0800
Message-ID: <20241114084345.1564165-2-song@kernel.org>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20241114084345.1564165-1-song@kernel.org>
References: <20241114084345.1564165-1-song@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

A fanotify fastpath handler enables handling fanotify events within the
kernel, and thus saves a trip to user space. A fastpath handler can be
useful in many use cases. For example, if a user is only interested in
events for some files inside a directory, a fastpath handler can be used
to filter out irrelevant events.

A fanotify fastpath handler is attached to a fsnotify_group. At most one
fastpath handler can be attached to a fsnotify_group. Attaching and
detaching fastpath handlers is controlled by two new ioctls on the
fanotify fd: FAN_IOC_ADD_FP and FAN_IOC_DEL_FP.

A fanotify fastpath handler is packaged in a kernel module. In the future,
it may also be possible to package a fastpath handler in a BPF program.
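The attach and detach ioctls described above could be driven from user
space roughly as in the sketch below. This is only an illustration: the
struct layout and ioctl numbers are copied by hand from the uapi hunk of
this patch (they are not in any released header), and the names
fp_args_sketch, fp_args_fill, fp_attach and fp_detach are hypothetical
helpers invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirrors of FAN_FP_NAME_MAX / FAN_FP_ARGS_MAX from this patch. */
#define FP_NAME_MAX 64
#define FP_ARGS_MAX 64

/* Hand-copied mirror of struct fanotify_fastpath_args (assumption). */
struct fp_args_sketch {
	char name[FP_NAME_MAX];
	uint32_t version;
	uint32_t flags;
	uint64_t init_args;      /* user space pointer to the init args */
	uint32_t init_args_size; /* must be <= FP_ARGS_MAX */
} __attribute__((__packed__));

/* Mirrors of FAN_IOC_ADD_FP / FAN_IOC_DEL_FP as proposed in this patch. */
#define FP_IOC_ADD _IOW('F', 0, struct fp_args_sketch)
#define FP_IOC_DEL _IOW('F', 1, char[FP_NAME_MAX])

/*
 * Fill @a for handler @name with @init_size bytes of init args at @init.
 * Returns 0 on success, -1 if the name or the args would not fit.
 */
static int fp_args_fill(struct fp_args_sketch *a, const char *name,
			const void *init, uint32_t init_size)
{
	if (strlen(name) >= FP_NAME_MAX || init_size > FP_ARGS_MAX)
		return -1;
	memset(a, 0, sizeof(*a));
	strcpy(a->name, name);
	a->version = 1; /* the only version accepted by this patch */
	a->init_args = (uint64_t)(uintptr_t)init;
	a->init_args_size = init_size;
	return 0;
}

/* Attach the handler described by @a to the group behind @fanotify_fd. */
static int fp_attach(int fanotify_fd, const struct fp_args_sketch *a)
{
	return ioctl(fanotify_fd, FP_IOC_ADD, a);
}

/* Detach whatever handler is attached to the group behind @fanotify_fd. */
static int fp_detach(int fanotify_fd)
{
	return ioctl(fanotify_fd, FP_IOC_DEL);
}
```

Because the struct is packed, the sketch's argument block is 84 bytes
(64 + 4 + 4 + 8 + 4). A caller would fill it once with fp_args_fill() and
then pass it to fp_attach() on an fd returned by fanotify_init(); on a
kernel without this series the ioctl is simply expected to fail.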
Since loading modules requires CAP_SYS_ADMIN, _loading_ a fanotify
fastpath handler in a kernel module is limited to CAP_SYS_ADMIN. However,
non-CAP_SYS_ADMIN users can _attach_ a fastpath handler loaded by the sys
admin to their fanotify fds. To make fanotify fastpath handlers more
useful for non-CAP_SYS_ADMIN users, a fastpath handler can take arguments
at attach time.

The sysfs entry /sys/kernel/fanotify_fastpath is added to help users know
which fastpath handlers are available. At the moment, three files are
added for each fastpath handler: flags, desc, and init_args.

Signed-off-by: Song Liu
---
 fs/notify/fanotify/Kconfig             |  13 ++
 fs/notify/fanotify/Makefile            |   1 +
 fs/notify/fanotify/fanotify.c          |  29 +++
 fs/notify/fanotify/fanotify_fastpath.c | 282 +++++++++++++++++++++++++
 fs/notify/fanotify/fanotify_user.c     |   7 +
 include/linux/fanotify.h               | 131 ++++++++++++
 include/linux/fsnotify_backend.h       |   4 +
 include/uapi/linux/fanotify.h          |  25 +++
 8 files changed, 492 insertions(+)
 create mode 100644 fs/notify/fanotify/fanotify_fastpath.c

diff --git a/fs/notify/fanotify/Kconfig b/fs/notify/fanotify/Kconfig
index 0e36aaf379b7..74677d3699a3 100644
--- a/fs/notify/fanotify/Kconfig
+++ b/fs/notify/fanotify/Kconfig
@@ -24,3 +24,16 @@ config FANOTIFY_ACCESS_PERMISSIONS
 	   hierarchical storage management systems.
 
 	   If unsure, say N.
+
+config FANOTIFY_FASTPATH
+	bool "fanotify fastpath handler"
+	depends on FANOTIFY
+	default y
+	help
+	   Say Y here if you want to use fanotify in-kernel fastpath handler.
+	   The fastpath handler can be implemented in a kernel module or a
+	   BPF program. The fastpath handler can speed up fanotify in many
+	   use cases. For example, when the listener is only interested in
+	   a subset of events.
+
+	   If unsure, say Y.
\ No newline at end of file diff --git a/fs/notify/fanotify/Makefile b/fs/notify/fanotify/Makefile index 25ef222915e5..543cb7aa08fc 100644 --- a/fs/notify/fanotify/Makefile +++ b/fs/notify/fanotify/Makefile @@ -1,2 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only obj-$(CONFIG_FANOTIFY) += fanotify.o fanotify_user.o +obj-$(CONFIG_FANOTIFY_FASTPATH) += fanotify_fastpath.o diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c index 224bccaab4cc..b395b628a58b 100644 --- a/fs/notify/fanotify/fanotify.c +++ b/fs/notify/fanotify/fanotify.c @@ -18,6 +18,8 @@ #include "fanotify.h" +extern struct srcu_struct fsnotify_mark_srcu; + static bool fanotify_path_equal(const struct path *p1, const struct path *p2) { return p1->mnt == p2->mnt && p1->dentry == p2->dentry; @@ -888,6 +890,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask, struct fsnotify_event *fsn_event; __kernel_fsid_t fsid = {}; u32 match_mask = 0; + struct fanotify_fastpath_hook *fp_hook __maybe_unused; BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS); BUILD_BUG_ON(FAN_MODIFY != FS_MODIFY); @@ -933,6 +936,27 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask, if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS)) fsid = fanotify_get_fsid(iter_info); +#ifdef CONFIG_FANOTIFY_FASTPATH + fp_hook = srcu_dereference(group->fanotify_data.fp_hook, &fsnotify_mark_srcu); + if (fp_hook) { + struct fanotify_fastpath_event fp_event = { + .mask = mask, + .data = data, + .data_type = data_type, + .dir = dir, + .file_name = file_name, + .fsid = &fsid, + .match_mask = match_mask, + }; + + ret = fp_hook->ops->fp_handler(group, fp_hook, &fp_event); + if (ret == FAN_FP_RET_SKIP_EVENT) { + ret = 0; + goto finish; + } + } +#endif + event = fanotify_alloc_event(group, mask, data, data_type, dir, file_name, &fsid, match_mask); ret = -ENOMEM; @@ -976,6 +1000,11 @@ static void fanotify_free_group_priv(struct fsnotify_group *group) if 
(mempool_initialized(&group->fanotify_data.error_events_pool)) mempool_exit(&group->fanotify_data.error_events_pool); + +#ifdef CONFIG_FANOTIFY_FASTPATH + if (group->fanotify_data.fp_hook) + fanotify_fastpath_hook_free(group->fanotify_data.fp_hook); +#endif } static void fanotify_free_path_event(struct fanotify_event *event) diff --git a/fs/notify/fanotify/fanotify_fastpath.c b/fs/notify/fanotify/fanotify_fastpath.c new file mode 100644 index 000000000000..f2aefcf0ca6a --- /dev/null +++ b/fs/notify/fanotify/fanotify_fastpath.c @@ -0,0 +1,282 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include + +#include "fanotify.h" + +extern struct srcu_struct fsnotify_mark_srcu; + +static DEFINE_SPINLOCK(fp_list_lock); +static LIST_HEAD(fp_list); + +static struct kobject *fan_fp_root_kobj; + +static struct { + enum fanotify_fastpath_flags flag; + const char *name; +} fanotify_fastpath_flags_names[] = { + { + .flag = FAN_FP_F_SYS_ADMIN_ONLY, + .name = "SYS_ADMIN_ONLY", + } +}; + +static ssize_t flags_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + struct fanotify_fastpath_ops *ops; + ssize_t len = 0; + int i; + + ops = container_of(kobj, struct fanotify_fastpath_ops, kobj); + for (i = 0; i < ARRAY_SIZE(fanotify_fastpath_flags_names); i++) { + if (ops->flags & fanotify_fastpath_flags_names[i].flag) { + len += sysfs_emit_at(buf, len, "%s%s", len ? 
" " : "", + fanotify_fastpath_flags_names[i].name); + } + } + len += sysfs_emit_at(buf, len, "\n"); + return len; +} + +static ssize_t desc_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + struct fanotify_fastpath_ops *ops; + + ops = container_of(kobj, struct fanotify_fastpath_ops, kobj); + + return sysfs_emit(buf, "%s\n", ops->desc ?: "N/A"); +} + +static ssize_t init_args_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + struct fanotify_fastpath_ops *ops; + + ops = container_of(kobj, struct fanotify_fastpath_ops, kobj); + + return sysfs_emit(buf, "%s\n", ops->init_args ?: "N/A"); +} + +static struct kobj_attribute flags_kobj_attr = __ATTR_RO(flags); +static struct kobj_attribute desc_kobj_attr = __ATTR_RO(desc); +static struct kobj_attribute init_args_kobj_attr = __ATTR_RO(init_args); + +static struct attribute *fan_fp_attrs[] = { + &flags_kobj_attr.attr, + &desc_kobj_attr.attr, + &init_args_kobj_attr.attr, + NULL, +}; +ATTRIBUTE_GROUPS(fan_fp); + +static void fan_fp_kobj_release(struct kobject *kobj) +{ + +} + +static const struct kobj_type fan_fp_ktype = { + .release = fan_fp_kobj_release, + .sysfs_ops = &kobj_sysfs_ops, + .default_groups = fan_fp_groups, +}; + +static struct fanotify_fastpath_ops *fanotify_fastpath_find(const char *name) +{ + struct fanotify_fastpath_ops *ops; + + list_for_each_entry(ops, &fp_list, list) { + if (!strcmp(ops->name, name)) + return ops; + } + return NULL; +} + +static void __fanotify_fastpath_unregister(struct fanotify_fastpath_ops *ops) +{ + spin_lock(&fp_list_lock); + list_del_init(&ops->list); + spin_unlock(&fp_list_lock); +} + +/* + * fanotify_fastpath_register - Register a new fastpath handler. + * + * Add a fastpath handler to the fp_list. These fastpath handlers are + * available for all users in the system. + * + * @ops: pointer to fanotify_fastpath_ops to add. + * + * Returns: + * 0 - on success; + * -EEXIST - fastpath handler of the same name already exists. 
+ */ +int fanotify_fastpath_register(struct fanotify_fastpath_ops *ops) +{ + int ret; + + spin_lock(&fp_list_lock); + if (fanotify_fastpath_find(ops->name)) { + /* cannot register two handlers with the same name */ + spin_unlock(&fp_list_lock); + return -EEXIST; + } + list_add_tail(&ops->list, &fp_list); + spin_unlock(&fp_list_lock); + + + kobject_init(&ops->kobj, &fan_fp_ktype); + ret = kobject_add(&ops->kobj, fan_fp_root_kobj, "%s", ops->name); + if (ret) { + __fanotify_fastpath_unregister(ops); + return ret; + } + return 0; +} +EXPORT_SYMBOL_GPL(fanotify_fastpath_register); + +/* + * fanotify_fastpath_unregister - Unregister a fastpath handler. + * + * Remove a fastpath handler from fp_list. + * + * @ops: pointer to fanotify_fastpath_ops to remove. + */ +void fanotify_fastpath_unregister(struct fanotify_fastpath_ops *ops) +{ + kobject_put(&ops->kobj); + __fanotify_fastpath_unregister(ops); +} +EXPORT_SYMBOL_GPL(fanotify_fastpath_unregister); + +/* + * fanotify_fastpath_add - Add a fastpath handler to fsnotify_group. + * + * Add a fastpath handler from fp_list to a fsnotify_group. + * + * @group: fsnotify_group to attach the fastpath handler to + * @argp: fanotify_fastpath_args that specifies the fastpath handler + * and the init arguments of the fastpath handler. + * + * Returns: + * 0 - on success; + * -EBUSY - a fastpath handler is already attached to @group.
+ */ +int fanotify_fastpath_add(struct fsnotify_group *group, + struct fanotify_fastpath_args __user *argp) +{ + struct fanotify_fastpath_hook *fp_hook; + struct fanotify_fastpath_ops *fp_ops; + struct fanotify_fastpath_args args; + void *init_args = NULL; + int ret = 0; + + ret = copy_from_user(&args, argp, sizeof(args)); + if (ret) + return -EFAULT; + + if (args.version != 1 || args.flags || args.init_args_size > FAN_FP_ARGS_MAX) + return -EINVAL; + + args.name[FAN_FP_NAME_MAX - 1] = '\0'; + + fsnotify_group_lock(group); + + if (rcu_access_pointer(group->fanotify_data.fp_hook)) { + fsnotify_group_unlock(group); + return -EBUSY; + } + + fp_hook = kzalloc(sizeof(*fp_hook), GFP_KERNEL); + if (!fp_hook) { + ret = -ENOMEM; + goto out; + } + + spin_lock(&fp_list_lock); + fp_ops = fanotify_fastpath_find(args.name); + if (!fp_ops || !try_module_get(fp_ops->owner)) { + spin_unlock(&fp_list_lock); + ret = -ENOENT; + goto err_free_hook; + } + spin_unlock(&fp_list_lock); + + if (!capable(CAP_SYS_ADMIN) && (fp_ops->flags & FAN_FP_F_SYS_ADMIN_ONLY)) { + ret = -EPERM; + goto err_module_put; + } + + if (fp_ops->fp_init) { + if (args.init_args_size) { + init_args = kzalloc(args.init_args_size, GFP_KERNEL); + if (!init_args) { + ret = -ENOMEM; + goto err_module_put; + } + if (copy_from_user(init_args, (void __user *)args.init_args, + args.init_args_size)) { + ret = -EFAULT; + goto err_free_args; + } + + } + ret = fp_ops->fp_init(fp_hook, init_args); + if (ret) + goto err_free_args; + kfree(init_args); + } + fp_hook->ops = fp_ops; + rcu_assign_pointer(group->fanotify_data.fp_hook, fp_hook); + +out: + fsnotify_group_unlock(group); + return ret; + +err_free_args: + kfree(init_args); +err_module_put: + module_put(fp_ops->owner); +err_free_hook: + kfree(fp_hook); + goto out; +} + +void fanotify_fastpath_hook_free(struct fanotify_fastpath_hook *fp_hook) +{ + if (fp_hook->ops->fp_free) + fp_hook->ops->fp_free(fp_hook); + + module_put(fp_hook->ops->owner); + kfree(fp_hook); +} + +/* + * 
fanotify_fastpath_del - Delete a fastpath handler from fsnotify_group. + */ +void fanotify_fastpath_del(struct fsnotify_group *group) +{ + struct fanotify_fastpath_hook *fp_hook; + + fsnotify_group_lock(group); + fp_hook = group->fanotify_data.fp_hook; + if (!fp_hook) + goto out; + + rcu_assign_pointer(group->fanotify_data.fp_hook, NULL); + fanotify_fastpath_hook_free(fp_hook); + +out: + fsnotify_group_unlock(group); +} + +static int __init fanotify_fastpath_init(void) +{ + fan_fp_root_kobj = kobject_create_and_add("fanotify_fastpath", kernel_kobj); + if (!fan_fp_root_kobj) + return -ENOMEM; + return 0; +} +device_initcall(fanotify_fastpath_init); diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c index 8e2d43fc6f7c..e96cb83f8409 100644 --- a/fs/notify/fanotify/fanotify_user.c +++ b/fs/notify/fanotify/fanotify_user.c @@ -987,6 +987,13 @@ static long fanotify_ioctl(struct file *file, unsigned int cmd, unsigned long ar spin_unlock(&group->notification_lock); ret = put_user(send_len, (int __user *) p); break; + case FAN_IOC_ADD_FP: + ret = fanotify_fastpath_add(group, p); + break; + case FAN_IOC_DEL_FP: + fanotify_fastpath_del(group); + ret = 0; + break; } return ret; diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h index 89ff45bd6f01..8645d0b29e9d 100644 --- a/include/linux/fanotify.h +++ b/include/linux/fanotify.h @@ -2,6 +2,7 @@ #ifndef _LINUX_FANOTIFY_H #define _LINUX_FANOTIFY_H +#include #include #include @@ -136,4 +137,134 @@ #undef FAN_ALL_PERM_EVENTS #undef FAN_ALL_OUTGOING_EVENTS +struct fsnotify_group; +struct qstr; +struct inode; +struct fanotify_fastpath_hook; + +/* + * Event passed to fanotify fastpath handler + * + * @mask: event type and flags + * @data: object that event happened on + * @data_type: type of object for fanotify_data_XXX() accessors + * @dir: optional directory associated with event - + * if @file_name is not NULL, this is the directory that + * @file_name is relative to + * @file_name:
optional file name associated with event + * @match_mask: mark types of this group that matched the event + */ +struct fanotify_fastpath_event { + u32 mask; + const void *data; + int data_type; + struct inode *dir; + const struct qstr *file_name; + __kernel_fsid_t *fsid; + u32 match_mask; +}; + +/* + * fanotify fastpath handler should implement these ops. + * + * fp_handler - Main call for the fastpath handler. + * @group: The group being notified + * @fp_hook: fanotify_fastpath_hook for the attach on @group. + * Returns: enum fanotify_fastpath_return. + * + * fp_init - Initialize the fanotify_fastpath_hook. + * @hook: fanotify_fastpath_hook to be initialized + * @args: Arguments used to initialize @hook + * + * fp_free - Free the fanotify_fastpath_hook. + * @hook: fanotify_fastpath_hook to be freed. + * + * @name: Name of the fanotify_fastpath_ops. This need to be unique + * in the system + * @owner: Owner module of this fanotify_fastpath_ops + * @list: Attach to global list of fanotify_fastpath_ops + * @flags: Flags for the fanotify_fastpath_ops + * @desc: Description of what this fastpath handler do (optional) + * @init_args: Description of the init_args in a string (optional) + */ +struct fanotify_fastpath_ops { + int (*fp_handler)(struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event); + int (*fp_init)(struct fanotify_fastpath_hook *hook, void *args); + void (*fp_free)(struct fanotify_fastpath_hook *hook); + + char name[FAN_FP_NAME_MAX]; + struct module *owner; + struct list_head list; + u32 flags; + const char *desc; + const char *init_args; + + /* internal */ + struct kobject kobj; +}; + +/* Flags for fanotify_fastpath_ops->flags */ +enum fanotify_fastpath_flags { + /* CAP_SYS_ADMIN is required to use this fastpath handler */ + FAN_FP_F_SYS_ADMIN_ONLY = BIT(0), + + FAN_FP_F_ALL = FAN_FP_F_SYS_ADMIN_ONLY, +}; + +/* Return value of fp_handler */ +enum fanotify_fastpath_return { + /* The event should 
be sent to user space */ + FAN_FP_RET_SEND_TO_USERSPACE = 0, + /* The event should NOT be sent to user space */ + FAN_FP_RET_SKIP_EVENT = 1, +}; + +/* + * Hook that attaches fanotify_fastpath_ops to a group. + * @ops: the ops + * @data: per group data used by the ops + */ +struct fanotify_fastpath_hook { + struct fanotify_fastpath_ops *ops; + void *data; +}; + +#ifdef CONFIG_FANOTIFY_FASTPATH + +int fanotify_fastpath_register(struct fanotify_fastpath_ops *ops); +void fanotify_fastpath_unregister(struct fanotify_fastpath_ops *ops); +int fanotify_fastpath_add(struct fsnotify_group *group, + struct fanotify_fastpath_args __user *args); +void fanotify_fastpath_del(struct fsnotify_group *group); +void fanotify_fastpath_hook_free(struct fanotify_fastpath_hook *fp_hook); + +#else /* CONFIG_FANOTIFY_FASTPATH */ + +static inline int fanotify_fastpath_register(struct fanotify_fastpath_ops *ops) +{ + return -EOPNOTSUPP; +} + +static inline void fanotify_fastpath_unregister(struct fanotify_fastpath_ops *ops) +{ +} + +static inline int fanotify_fastpath_add(struct fsnotify_group *group, + struct fanotify_fastpath_args __user *args) +{ + return -ENOENT; +} + +static inline void fanotify_fastpath_del(struct fsnotify_group *group) +{ +} + +static inline void fanotify_fastpath_hook_free(struct fanotify_fastpath_hook *fp_hook) +{ +} + +#endif /* CONFIG_FANOTIFY_FASTPATH */ + #endif /* _LINUX_FANOTIFY_H */ diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h index 3ecf7768e577..9b22d9b9d0bb 100644 --- a/include/linux/fsnotify_backend.h +++ b/include/linux/fsnotify_backend.h @@ -117,6 +117,7 @@ struct fsnotify_fname; struct fsnotify_iter_info; struct mem_cgroup; +struct fanotify_fastpath_hook; /* * Each group much define these ops. 
The fsnotify infrastructure will call @@ -255,6 +256,9 @@ struct fsnotify_group { int f_flags; /* event_f_flags from fanotify_init() */ struct ucounts *ucounts; mempool_t error_events_pool; +#ifdef CONFIG_FANOTIFY_FASTPATH + struct fanotify_fastpath_hook __rcu *fp_hook; +#endif /* CONFIG_FANOTIFY_FASTPATH */ } fanotify_data; #endif /* CONFIG_FANOTIFY */ }; diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h index 34f221d3a1b9..654d5ab44143 100644 --- a/include/uapi/linux/fanotify.h +++ b/include/uapi/linux/fanotify.h @@ -3,6 +3,7 @@ #define _UAPI_LINUX_FANOTIFY_H #include +#include /* the following events that user-space can register for */ #define FAN_ACCESS 0x00000001 /* File was accessed */ @@ -243,4 +244,28 @@ struct fanotify_response_info_audit_rule { (long)(meta)->event_len >= (long)FAN_EVENT_METADATA_LEN && \ (long)(meta)->event_len <= (long)(len)) +#define FAN_FP_NAME_MAX 64 +#define FAN_FP_ARGS_MAX 64 + +/* These are the arguments used to add a fastpath handler to a group. */ +struct fanotify_fastpath_args { + char name[FAN_FP_NAME_MAX]; + + __u32 version; + __u32 flags; + + /* + * user space pointer to the init args of fastpath handler, + * up to init_args_size (<= FAN_FP_ARGS_MAX).
+ */ + __u64 init_args; + /* size of init_args */ + __u32 init_args_size; +} __attribute__((__packed__)); + +#define FAN_IOC_MAGIC 'F' + +#define FAN_IOC_ADD_FP _IOW(FAN_IOC_MAGIC, 0, struct fanotify_fastpath_args) +#define FAN_IOC_DEL_FP _IOW(FAN_IOC_MAGIC, 1, char[FAN_FP_NAME_MAX]) + #endif /* _UAPI_LINUX_FANOTIFY_H */ From patchwork Thu Nov 14 08:43:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13874731 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 44D3F1F76A8; Thu, 14 Nov 2024 08:44:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573860; cv=none; b=Y/zWrU/1RRerHXOPuJo7lOhNGmuDxh8+32yvMEs+F2QyoCqWWz+bvDh5Z7OTfzHrDRCTcLCrg4OJ+fSyo8iNxLZUPjdwHFmEElp9/e0atqoxYa7oYGpZixcdRyPsSZmZmj8mh4+uuacR1Bpyn6u2Z8WBw3DFC40aB4oEjLYjuwM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573860; c=relaxed/simple; bh=HK4iU8O22NadzfqP2gnHIsvdLo88uc0RAYaz8YOCi1c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=SWQlEIg8sdad2o+GT2XbicwT9/rVL8uDJGEh/qR5rckYlg/mukh1CY9k5uMP8K+kcrBu428clFm7/Lrj3IAbzGdeqh15bOWwu333Gt9+mBbLShzPFhDotA7vy0kZL/vSiH4klBI2z6ELlKXlcjgMXOMamK9Ku7/kSqjV0u1z3RU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=VIkqJ0x1; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="VIkqJ0x1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 50CEDC4CECD; Thu, 14 Nov 2024 08:44:14 +0000 
(UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1731573857; bh=HK4iU8O22NadzfqP2gnHIsvdLo88uc0RAYaz8YOCi1c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VIkqJ0x1Cdhxh2ZeHDRWqKxo6mm5f28jQTlfQKXY+nF1LtB9NRRrPzYUBB+K/rro0 ASgvIB9zTtRwtD7h15vw1Eo2uJ5LytxdNohe6KrS1sIrn2NyhYQja2mR9UWcW29AGA yPEX7zN41Den97QAS0e8O/hMlk7nj8ZbbHStx4J6fUD85mq9YsLaMfrPoqTmBmNJ2b jLYDK5XCsZCEVv6g4nes5IErI2R1mL2aIUZZjnagFk53SLUc+mz6rHiGXyY3U5VZGn ta7bfVBjrBRjVxKjklIWuT5bN5NQXlpJq5JjyrLmD5sBeA2zFT/qz358x/kM8aFPBv EfHn3eVzXMEFQ== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, mic@digikod.net, gnoack@google.com, Song Liu Subject: [RFC/PATCH v2 bpf-next fanotify 2/7] samples/fanotify: Add a sample fanotify fastpath handler Date: Thu, 14 Nov 2024 00:43:40 -0800 Message-ID: <20241114084345.1564165-3-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241114084345.1564165-1-song@kernel.org> References: <20241114084345.1564165-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This fastpath handler monitors a subtree inside a mount point. 
To use it:

  [root]# insmod ./fastpath-mod.ko
  [root]# mkdir -p /tmp/a/b/c/d
  [root]# ./fastpath-user /tmp/ /tmp/a/b &
  [root]# touch /tmp/xx           # Doesn't generate event
  [root]# touch /tmp/a/xxa        # Doesn't generate event
  [root]# touch /tmp/a/b/xxab     # Generates an event
  Accessing file xxab             # this is the output from fastpath-user
  [root]# touch /tmp/a/b/c/xxabc  # Generates an event
  Accessing file xxabc            # this is the output from fastpath-user

Signed-off-by: Song Liu
---
 MAINTAINERS                      |   1 +
 samples/Kconfig                  |  20 +++++-
 samples/Makefile                 |   2 +-
 samples/fanotify/.gitignore      |   1 +
 samples/fanotify/Makefile        |   5 +-
 samples/fanotify/fastpath-mod.c  |  82 +++++++++++++++++++++++
 samples/fanotify/fastpath-user.c | 111 +++++++++++++++++++++++++++++++
 7 files changed, 219 insertions(+), 3 deletions(-)
 create mode 100644 samples/fanotify/fastpath-mod.c
 create mode 100644 samples/fanotify/fastpath-user.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7ad507f49324..8939a48b2d99 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8658,6 +8658,7 @@ S: Maintained
 F:	fs/notify/fanotify/
 F:	include/linux/fanotify.h
 F:	include/uapi/linux/fanotify.h
+F:	samples/fanotify/
 
 FARADAY FOTG210 USB2 DUAL-ROLE CONTROLLER
 M:	Linus Walleij
diff --git a/samples/Kconfig b/samples/Kconfig
index b288d9991d27..b0d3dff48bb0 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -149,15 +149,33 @@ config SAMPLE_CONNECTOR
 	  with it.
 	  See also Documentation/driver-api/connector.rst
 
+config SAMPLE_FANOTIFY
+	bool "Build fanotify monitoring sample"
+	depends on FANOTIFY && CC_CAN_LINK && HEADERS_INSTALL
+	help
+	  When enabled, this builds samples for fanotify.
+	  There are multiple samples for fanotify. Please see the
+	  following configs for more details of these
+	  samples.
+ config SAMPLE_FANOTIFY_ERROR bool "Build fanotify error monitoring sample" - depends on FANOTIFY && CC_CAN_LINK && HEADERS_INSTALL + depends on SAMPLE_FANOTIFY help When enabled, this builds an example code that uses the FAN_FS_ERROR fanotify mechanism to monitor filesystem errors. See also Documentation/admin-guide/filesystem-monitoring.rst. +config SAMPLE_FANOTIFY_FASTPATH + tristate "Build fanotify fastpath sample" + depends on SAMPLE_FANOTIFY && m + help + When enabled, this builds kernel module that contains a + fanotify fastpath handler. + The fastpath handler filters out certain filename + prefixes for the fanotify user. + config SAMPLE_HIDRAW bool "hidraw sample" depends on CC_CAN_LINK && HEADERS_INSTALL diff --git a/samples/Makefile b/samples/Makefile index b85fa64390c5..108360972626 100644 --- a/samples/Makefile +++ b/samples/Makefile @@ -6,7 +6,7 @@ subdir-$(CONFIG_SAMPLE_ANDROID_BINDERFS) += binderfs subdir-$(CONFIG_SAMPLE_CGROUP) += cgroup obj-$(CONFIG_SAMPLE_CONFIGFS) += configfs/ obj-$(CONFIG_SAMPLE_CONNECTOR) += connector/ -obj-$(CONFIG_SAMPLE_FANOTIFY_ERROR) += fanotify/ +obj-$(CONFIG_SAMPLE_FANOTIFY) += fanotify/ subdir-$(CONFIG_SAMPLE_HIDRAW) += hidraw obj-$(CONFIG_SAMPLE_HW_BREAKPOINT) += hw_breakpoint/ obj-$(CONFIG_SAMPLE_KDB) += kdb/ diff --git a/samples/fanotify/.gitignore b/samples/fanotify/.gitignore index d74593e8b2de..306e1ddec4e0 100644 --- a/samples/fanotify/.gitignore +++ b/samples/fanotify/.gitignore @@ -1 +1,2 @@ fs-monitor +fastpath-user diff --git a/samples/fanotify/Makefile b/samples/fanotify/Makefile index e20db1bdde3b..f5bbd7380104 100644 --- a/samples/fanotify/Makefile +++ b/samples/fanotify/Makefile @@ -1,5 +1,8 @@ # SPDX-License-Identifier: GPL-2.0-only -userprogs-always-y += fs-monitor +userprogs-always-$(CONFIG_SAMPLE_FANOTIFY_ERROR) += fs-monitor userccflags += -I usr/include -Wall +obj-$(CONFIG_SAMPLE_FANOTIFY_FASTPATH) += fastpath-mod.o + +userprogs-always-$(CONFIG_SAMPLE_FANOTIFY_FASTPATH) += fastpath-user diff --git 
a/samples/fanotify/fastpath-mod.c b/samples/fanotify/fastpath-mod.c new file mode 100644 index 000000000000..7e2e1878f8e7 --- /dev/null +++ b/samples/fanotify/fastpath-mod.c @@ -0,0 +1,82 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include + +static int sample_fp_handler(struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event) +{ + struct dentry *dentry; + struct path *subtree; + + dentry = fsnotify_data_dentry(fp_event->data, fp_event->data_type); + if (!dentry) + return FAN_FP_RET_SEND_TO_USERSPACE; + + subtree = fp_hook->data; + + if (is_subdir(dentry, subtree->dentry)) + return FAN_FP_RET_SEND_TO_USERSPACE; + return FAN_FP_RET_SKIP_EVENT; +} + +static int sample_fp_init(struct fanotify_fastpath_hook *fp_hook, void *args) +{ + struct path *subtree; + struct file *file; + int fd; + + fd = *(int *)args; + + file = fget(fd); + if (!file) + return -EBADF; + subtree = kzalloc(sizeof(struct path), GFP_KERNEL); + if (!subtree) { + fput(file); + return -ENOMEM; + } + path_get(&file->f_path); + *subtree = file->f_path; + fput(file); + fp_hook->data = subtree; + return 0; +} + +static void sample_fp_free(struct fanotify_fastpath_hook *fp_hook) +{ + struct path *subtree = fp_hook->data; + + path_put(subtree); + kfree(subtree); +} + +static struct fanotify_fastpath_ops fan_fp_ignore_a_ops = { + .fp_handler = sample_fp_handler, + .fp_init = sample_fp_init, + .fp_free = sample_fp_free, + .name = "monitor-subtree", + .owner = THIS_MODULE, + .flags = FAN_FP_F_SYS_ADMIN_ONLY, + .desc = "only emit events under a subtree", + .init_args = "struct {\n\tint subtree_fd;\n};", +}; + +static int __init fanotify_fastpath_sample_init(void) +{ + return fanotify_fastpath_register(&fan_fp_ignore_a_ops); +} +static void __exit fanotify_fastpath_sample_exit(void) +{ + fanotify_fastpath_unregister(&fan_fp_ignore_a_ops); +} + +module_init(fanotify_fastpath_sample_init); 
+module_exit(fanotify_fastpath_sample_exit); + +MODULE_AUTHOR("Song Liu"); +MODULE_DESCRIPTION("Example fanotify fastpath handler"); +MODULE_LICENSE("GPL"); diff --git a/samples/fanotify/fastpath-user.c b/samples/fanotify/fastpath-user.c new file mode 100644 index 000000000000..abe33a6b6b41 --- /dev/null +++ b/samples/fanotify/fastpath-user.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include + +static int total_event_cnt; + +static void handle_notifications(char *buffer, int len) +{ + struct fanotify_event_metadata *event = + (struct fanotify_event_metadata *) buffer; + struct fanotify_event_info_header *info; + struct fanotify_event_info_fid *fid; + struct file_handle *handle; + char *name; + int off; + + for (; FAN_EVENT_OK(event, len); event = FAN_EVENT_NEXT(event, len)) { + for (off = sizeof(*event) ; off < event->event_len; + off += info->len) { + info = (struct fanotify_event_info_header *) + ((char *) event + off); + switch (info->info_type) { + case FAN_EVENT_INFO_TYPE_DFID_NAME: + fid = (struct fanotify_event_info_fid *) info; + handle = (struct file_handle *)&fid->handle; + name = (char *)handle + sizeof(*handle) + handle->handle_bytes; + + printf("Accessing file %s\n", name); + total_event_cnt++; + break; + default: + break; + } + } + } +} + +int main(int argc, char **argv) +{ + struct fanotify_fastpath_args args = { + .name = "monitor-subtree", + .version = 1, + .flags = 0, + }; + char buffer[BUFSIZ]; + const char *msg; + int fanotify_fd; + int subtree_fd; + + if (argc < 3) { + printf("Usage:\n" + "\t %s \n", + argv[0]); + return 1; + } + + subtree_fd = open(argv[2], O_RDONLY | O_CLOEXEC); + + if (subtree_fd < 0) + errx(1, "open subtree_fd"); + + args.init_args = (__u64)&subtree_fd; + args.init_args_size = sizeof(int); + + fanotify_fd = fanotify_init(FAN_CLASS_NOTIF | FAN_REPORT_NAME | FAN_REPORT_DIR_FID, + O_RDONLY); + if (fanotify_fd < 0) { + 
close(subtree_fd); + errx(1, "fanotify_init"); + } + + if (fanotify_mark(fanotify_fd, FAN_MARK_ADD | FAN_MARK_FILESYSTEM, + FAN_OPEN | FAN_ONDIR | FAN_EVENT_ON_CHILD, + AT_FDCWD, argv[1])) { + msg = "fanotify_mark"; + goto err_out; + } + + if (ioctl(fanotify_fd, FAN_IOC_ADD_FP, &args)) { + msg = "ioctl"; + goto err_out; + } + + while (total_event_cnt < 10) { + int n = read(fanotify_fd, buffer, BUFSIZ); + + if (n < 0) { + msg = "read"; + goto err_out; + } + + handle_notifications(buffer, n); + } + + ioctl(fanotify_fd, FAN_IOC_DEL_FP); + close(fanotify_fd); + close(subtree_fd); + return 0; + +err_out: + close(fanotify_fd); + close(subtree_fd); + errx(1, msg); + return 0; +} From patchwork Thu Nov 14 08:43:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13874732 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 825D61F76A8; Thu, 14 Nov 2024 08:44:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573868; cv=none; b=TN3/OptGyJ4Bsl2sZoR5QLM0akWyDl4cBI4pq4U75BnkRbX55hHIiiViQJ/sH3h5z34m/xdNu6dOIpCWAFJSFnCfp3c98AVcjlxvsZooH/M4E1bIL77J/eULxV/AK0166M5LB4jmIZQdZlGoHmn+4662M5LytlerxUv+AglQPtM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573868; c=relaxed/simple; bh=S7xDbIph8VcpsMfUCa0ILfHb706b9RjGxs+MZvWRZXA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=S9mA1xFM2cmfSSYytxpu4Z3jTpymjSADeQofDxW1KaXsC2DLJ/zb1hDdUD6VoMMXGZqpB0wVHfJ4cNPnzoHve5wzD6qWZ6yNGZXBgVzxV2adhI4WRxKRvz1SVlgdUIzO/eX3zpTv+XNcr4+3/yY0HNzTxNK5yjTB3i4P8f2o9wg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; 
dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=bGRjNOqW; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="bGRjNOqW" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 606C9C4CECD; Thu, 14 Nov 2024 08:44:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1731573866; bh=S7xDbIph8VcpsMfUCa0ILfHb706b9RjGxs+MZvWRZXA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bGRjNOqWB9zGl8VzoPYtqpDMNWzk4/it4beKpBtApasjKJ01uQj9/yQCJHAcYPJA7 SX+qk4SPHHCz2Nseu2D6up8m0gsYet/SDcApOjWq87+9f84OUQ+uU2fTm/IIkbk59P J5wsD1HK/vMcxtTftwJsualVLrELT6yCFWrZfG6LgUOziEKds0mIZEthzITPIjGZQk a5QDnyxAbw3NoxMEhf+vVx3BWRp8QNOn89+545bqdbDOReFWCdUX86Mdtg/a3JwQU+ xypoz6FzFb7ZnC09MvzQg11hOd1tBVzPS2upZyPiNBmjP+ganjR5tus0jY4XITgUyF Ewhe5wNn2OXcg== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, mic@digikod.net, gnoack@google.com, Song Liu Subject: [RFC/PATCH v2 bpf-next fanotify 3/7] bpf: Make bpf inode storage available to tracing programs Date: Thu, 14 Nov 2024 00:43:41 -0800 Message-ID: <20241114084345.1564165-4-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241114084345.1564165-1-song@kernel.org> References: <20241114084345.1564165-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Use the same recursion avoidance mechanism as task local storage. Signed-off-by: Song Liu --- Do not apply this.
Another version of this patch, with proper selftests, is submitted separately. --- fs/inode.c | 2 + include/linux/bpf.h | 9 ++ include/linux/bpf_lsm.h | 29 ------ include/linux/fs.h | 4 + kernel/bpf/Makefile | 3 +- kernel/bpf/bpf_inode_storage.c | 176 +++++++++++++++++++++++++-------- kernel/bpf/bpf_lsm.c | 4 - kernel/trace/bpf_trace.c | 8 ++ security/bpf/hooks.c | 7 -- 9 files changed, 159 insertions(+), 83 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index 8dabb224f941..c196a62bd48f 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -250,6 +250,8 @@ EXPORT_SYMBOL(free_inode_nonrcu); static void i_callback(struct rcu_head *head) { struct inode *inode = container_of(head, struct inode, i_rcu); + + bpf_inode_storage_free(inode); if (inode->free_inode) inode->free_inode(inode); else diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 19d8ca8ac960..863cb972d1fa 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2630,6 +2630,7 @@ struct bpf_link *bpf_link_by_id(u32 id); const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog); void bpf_task_storage_free(struct task_struct *task); +void bpf_inode_storage_free(struct inode *inode); void bpf_cgrp_storage_free(struct cgroup *cgroup); bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog); const struct btf_func_model * @@ -2900,6 +2901,10 @@ static inline void bpf_task_storage_free(struct task_struct *task) { } +static inline void bpf_inode_storage_free(struct inode *inode) +{ +} + static inline bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog) { return false; @@ -3263,6 +3268,10 @@ extern const struct bpf_func_proto bpf_task_storage_get_recur_proto; extern const struct bpf_func_proto bpf_task_storage_get_proto; extern const struct bpf_func_proto bpf_task_storage_delete_recur_proto; extern const struct bpf_func_proto bpf_task_storage_delete_proto; +extern const struct bpf_func_proto bpf_inode_storage_get_proto; +extern const struct
bpf_func_proto bpf_inode_storage_get_recur_proto; +extern const struct bpf_func_proto bpf_inode_storage_delete_proto; +extern const struct bpf_func_proto bpf_inode_storage_delete_recur_proto; extern const struct bpf_func_proto bpf_for_each_map_elem_proto; extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto; extern const struct bpf_func_proto bpf_sk_setsockopt_proto; diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h index aefcd6564251..a819c2f0a062 100644 --- a/include/linux/bpf_lsm.h +++ b/include/linux/bpf_lsm.h @@ -19,31 +19,12 @@ #include #undef LSM_HOOK -struct bpf_storage_blob { - struct bpf_local_storage __rcu *storage; -}; - -extern struct lsm_blob_sizes bpf_lsm_blob_sizes; - int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog, const struct bpf_prog *prog); bool bpf_lsm_is_sleepable_hook(u32 btf_id); bool bpf_lsm_is_trusted(const struct bpf_prog *prog); -static inline struct bpf_storage_blob *bpf_inode( - const struct inode *inode) -{ - if (unlikely(!inode->i_security)) - return NULL; - - return inode->i_security + bpf_lsm_blob_sizes.lbs_inode; -} - -extern const struct bpf_func_proto bpf_inode_storage_get_proto; -extern const struct bpf_func_proto bpf_inode_storage_delete_proto; -void bpf_inode_storage_free(struct inode *inode); - void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func); int bpf_lsm_get_retval_range(const struct bpf_prog *prog, @@ -66,16 +47,6 @@ static inline int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog, return -EOPNOTSUPP; } -static inline struct bpf_storage_blob *bpf_inode( - const struct inode *inode) -{ - return NULL; -} - -static inline void bpf_inode_storage_free(struct inode *inode) -{ -} - static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 3559446279c1..479097e4dd5b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -79,6 +79,7 @@ struct fs_context; struct 
fs_parameter_spec; struct fileattr; struct iomap_ops; +struct bpf_local_storage; extern void __init inode_init(void); extern void __init inode_init_early(void); @@ -648,6 +649,9 @@ struct inode { #ifdef CONFIG_SECURITY void *i_security; #endif +#ifdef CONFIG_BPF_SYSCALL + struct bpf_local_storage __rcu *i_bpf_storage; +#endif /* Stat data, not accessed from path walking */ unsigned long i_ino; diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 9b9c151b5c82..a5b7136b4884 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -10,8 +10,7 @@ obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o -obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o -obj-${CONFIG_BPF_LSM} += bpf_inode_storage.o +obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o bpf_inode_storage.o obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o obj-$(CONFIG_BPF_JIT) += trampoline.o obj-$(CONFIG_BPF_SYSCALL) += btf.o memalloc.o diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c index 29da6d3838f6..50c082da8dc5 100644 --- a/kernel/bpf/bpf_inode_storage.c +++ b/kernel/bpf/bpf_inode_storage.c @@ -21,16 +21,36 @@ DEFINE_BPF_STORAGE_CACHE(inode_cache); -static struct bpf_local_storage __rcu ** -inode_storage_ptr(void *owner) +static DEFINE_PER_CPU(int, bpf_inode_storage_busy); + +static void bpf_inode_storage_lock(void) +{ + migrate_disable(); + this_cpu_inc(bpf_inode_storage_busy); +} + +static void bpf_inode_storage_unlock(void) +{ + this_cpu_dec(bpf_inode_storage_busy); + migrate_enable(); +} + +static bool bpf_inode_storage_trylock(void) +{ + migrate_disable(); + if (unlikely(this_cpu_inc_return(bpf_inode_storage_busy) != 1)) { 
+ this_cpu_dec(bpf_inode_storage_busy); + migrate_enable(); + return false; + } + return true; +} + +static struct bpf_local_storage __rcu **inode_storage_ptr(void *owner) { struct inode *inode = owner; - struct bpf_storage_blob *bsb; - bsb = bpf_inode(inode); - if (!bsb) - return NULL; - return &bsb->storage; + return &inode->i_bpf_storage; } static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode, @@ -39,14 +59,9 @@ static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode, { struct bpf_local_storage *inode_storage; struct bpf_local_storage_map *smap; - struct bpf_storage_blob *bsb; - - bsb = bpf_inode(inode); - if (!bsb) - return NULL; inode_storage = - rcu_dereference_check(bsb->storage, bpf_rcu_lock_held()); + rcu_dereference_check(inode->i_bpf_storage, bpf_rcu_lock_held()); if (!inode_storage) return NULL; @@ -57,21 +72,18 @@ static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode, void bpf_inode_storage_free(struct inode *inode) { struct bpf_local_storage *local_storage; - struct bpf_storage_blob *bsb; - - bsb = bpf_inode(inode); - if (!bsb) - return; rcu_read_lock(); - local_storage = rcu_dereference(bsb->storage); + local_storage = rcu_dereference(inode->i_bpf_storage); if (!local_storage) { rcu_read_unlock(); return; } + bpf_inode_storage_lock(); bpf_local_storage_destroy(local_storage); + bpf_inode_storage_unlock(); rcu_read_unlock(); } @@ -83,7 +95,9 @@ static void *bpf_fd_inode_storage_lookup_elem(struct bpf_map *map, void *key) if (fd_empty(f)) return ERR_PTR(-EBADF); + bpf_inode_storage_lock(); sdata = inode_storage_lookup(file_inode(fd_file(f)), map, true); + bpf_inode_storage_unlock(); return sdata ? 
sdata->data : NULL; } @@ -98,13 +112,16 @@ static long bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key, if (!inode_storage_ptr(file_inode(fd_file(f)))) return -EBADF; + bpf_inode_storage_lock(); sdata = bpf_local_storage_update(file_inode(fd_file(f)), (struct bpf_local_storage_map *)map, value, map_flags, GFP_ATOMIC); + bpf_inode_storage_unlock(); return PTR_ERR_OR_ZERO(sdata); } -static int inode_storage_delete(struct inode *inode, struct bpf_map *map) +static int inode_storage_delete(struct inode *inode, struct bpf_map *map, + bool nobusy) { struct bpf_local_storage_data *sdata; @@ -112,6 +129,9 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map) if (!sdata) return -ENOENT; + if (!nobusy) + return -EBUSY; + bpf_selem_unlink(SELEM(sdata), false); return 0; @@ -119,60 +139,114 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map) static long bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key) { + int err; + CLASS(fd_raw, f)(*(int *)key); if (fd_empty(f)) return -EBADF; - return inode_storage_delete(file_inode(fd_file(f)), map); + bpf_inode_storage_lock(); + err = inode_storage_delete(file_inode(fd_file(f)), map, true); + bpf_inode_storage_unlock(); + return err; } -/* *gfp_flags* is a hidden argument provided by the verifier */ -BPF_CALL_5(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode, - void *, value, u64, flags, gfp_t, gfp_flags) +static void *__bpf_inode_storage_get(struct bpf_map *map, struct inode *inode, + void *value, u64 flags, gfp_t gfp_flags, bool nobusy) { struct bpf_local_storage_data *sdata; - WARN_ON_ONCE(!bpf_rcu_lock_held()); - if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) - return (unsigned long)NULL; - /* explicitly check that the inode_storage_ptr is not * NULL as inode_storage_lookup returns NULL in this case and * bpf_local_storage_update expects the owner to have a * valid storage pointer. 
*/ if (!inode || !inode_storage_ptr(inode)) - return (unsigned long)NULL; + return NULL; - sdata = inode_storage_lookup(inode, map, true); + sdata = inode_storage_lookup(inode, map, nobusy); if (sdata) - return (unsigned long)sdata->data; + return sdata->data; - /* This helper must only called from where the inode is guaranteed - * to have a refcount and cannot be freed. - */ - if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) { + /* only allocate new storage, when the inode is refcounted */ + if (atomic_read(&inode->i_count) && + flags & BPF_LOCAL_STORAGE_GET_F_CREATE && nobusy) { sdata = bpf_local_storage_update( inode, (struct bpf_local_storage_map *)map, value, BPF_NOEXIST, gfp_flags); - return IS_ERR(sdata) ? (unsigned long)NULL : - (unsigned long)sdata->data; + return IS_ERR(sdata) ? NULL : sdata->data; } - return (unsigned long)NULL; + return NULL; } -BPF_CALL_2(bpf_inode_storage_delete, - struct bpf_map *, map, struct inode *, inode) +/* *gfp_flags* is a hidden argument provided by the verifier */ +BPF_CALL_5(bpf_inode_storage_get_recur, struct bpf_map *, map, struct inode *, inode, + void *, value, u64, flags, gfp_t, gfp_flags) { + bool nobusy; + void *data; + + WARN_ON_ONCE(!bpf_rcu_lock_held()); + if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) + return (unsigned long)NULL; + + nobusy = bpf_inode_storage_trylock(); + data = __bpf_inode_storage_get(map, inode, value, flags, gfp_flags, nobusy); + if (nobusy) + bpf_inode_storage_unlock(); + return (unsigned long)data; +} + +/* *gfp_flags* is a hidden argument provided by the verifier */ +BPF_CALL_5(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode, + void *, value, u64, flags, gfp_t, gfp_flags) +{ + void *data; + + WARN_ON_ONCE(!bpf_rcu_lock_held()); + if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) + return (unsigned long)NULL; + + bpf_inode_storage_lock(); + data = __bpf_inode_storage_get(map, inode, value, flags, gfp_flags, true); + bpf_inode_storage_unlock(); + return (unsigned long)data; +} + 
+BPF_CALL_2(bpf_inode_storage_delete_recur, struct bpf_map *, map, struct inode *, inode) +{ + bool nobusy; + int ret; + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (!inode) return -EINVAL; + nobusy = bpf_inode_storage_trylock(); /* This helper must only called from where the inode is guaranteed * to have a refcount and cannot be freed. */ - return inode_storage_delete(inode, map); + ret = inode_storage_delete(inode, map, nobusy); + bpf_inode_storage_unlock(); + return ret; +} + +BPF_CALL_2(bpf_inode_storage_delete, struct bpf_map *, map, struct inode *, inode) +{ + int ret; + + WARN_ON_ONCE(!bpf_rcu_lock_held()); + if (!inode) + return -EINVAL; + + bpf_inode_storage_lock(); + /* This helper must only called from where the inode is guaranteed + * to have a refcount and cannot be freed. + */ + ret = inode_storage_delete(inode, map, true); + bpf_inode_storage_unlock(); + return ret; } static int notsupp_get_next_key(struct bpf_map *map, void *key, @@ -208,6 +282,17 @@ const struct bpf_map_ops inode_storage_map_ops = { BTF_ID_LIST_SINGLE(bpf_inode_storage_btf_ids, struct, inode) +const struct bpf_func_proto bpf_inode_storage_get_recur_proto = { + .func = bpf_inode_storage_get_recur, + .gpl_only = false, + .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL, + .arg1_type = ARG_CONST_MAP_PTR, + .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL, + .arg2_btf_id = &bpf_inode_storage_btf_ids[0], + .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL, + .arg4_type = ARG_ANYTHING, +}; + const struct bpf_func_proto bpf_inode_storage_get_proto = { .func = bpf_inode_storage_get, .gpl_only = false, @@ -219,6 +304,15 @@ const struct bpf_func_proto bpf_inode_storage_get_proto = { .arg4_type = ARG_ANYTHING, }; +const struct bpf_func_proto bpf_inode_storage_delete_recur_proto = { + .func = bpf_inode_storage_delete_recur, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_CONST_MAP_PTR, + .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL, + .arg2_btf_id = &bpf_inode_storage_btf_ids[0], +}; + const struct 
bpf_func_proto bpf_inode_storage_delete_proto = { .func = bpf_inode_storage_delete, .gpl_only = false, diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index 6292ac5f9bd1..51e2de17325a 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -231,10 +231,6 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) } switch (func_id) { - case BPF_FUNC_inode_storage_get: - return &bpf_inode_storage_get_proto; - case BPF_FUNC_inode_storage_delete: - return &bpf_inode_storage_delete_proto; #ifdef CONFIG_NET case BPF_FUNC_sk_storage_get: return &bpf_sk_storage_get_proto; diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index a582cd25ca87..3ec39e6704e2 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1529,6 +1529,14 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) if (bpf_prog_check_recur(prog)) return &bpf_task_storage_delete_recur_proto; return &bpf_task_storage_delete_proto; + case BPF_FUNC_inode_storage_get: + if (bpf_prog_check_recur(prog)) + return &bpf_inode_storage_get_recur_proto; + return &bpf_inode_storage_get_proto; + case BPF_FUNC_inode_storage_delete: + if (bpf_prog_check_recur(prog)) + return &bpf_inode_storage_delete_recur_proto; + return &bpf_inode_storage_delete_proto; case BPF_FUNC_for_each_map_elem: return &bpf_for_each_map_elem_proto; case BPF_FUNC_snprintf: diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c index 3663aec7bcbd..67719a04bb0b 100644 --- a/security/bpf/hooks.c +++ b/security/bpf/hooks.c @@ -12,8 +12,6 @@ static struct security_hook_list bpf_lsm_hooks[] __ro_after_init = { LSM_HOOK_INIT(NAME, bpf_lsm_##NAME), #include #undef LSM_HOOK - LSM_HOOK_INIT(inode_free_security, bpf_inode_storage_free), - LSM_HOOK_INIT(task_free, bpf_task_storage_free), }; static const struct lsm_id bpf_lsmid = { @@ -29,12 +27,7 @@ static int __init bpf_lsm_init(void) return 0; } -struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = { - .lbs_inode 
= sizeof(struct bpf_storage_blob), -}; - DEFINE_LSM(bpf) = { .name = "bpf", .init = bpf_lsm_init, - .blobs = &bpf_lsm_blob_sizes }; From patchwork Thu Nov 14 08:43:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13874733 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 818A21F76D5; Thu, 14 Nov 2024 08:44:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573875; cv=none; b=QuVZNdJ2JgF8NWDcp4YKMEMztE0hC3mKBk5prKDSJWeuajVqA6cZ0qI5v7LHnGA32sZrDW1gOAWJv9koTgMbzapHDfgfb/sE+teNIA6KaxaCLzybTaUGwUURx+oFjR7NSmwbt39NcM+o5YaMabpKBmWTg1E4LHW4CdFSUefsAu0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573875; c=relaxed/simple; bh=1ovS1bjpZU/Le8HjGKou5cNnZ+kNg/ks1kxh0OmdiCQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LIXMnRk+EFjqGmRynZ4KrlFP8Mk8ar0eGdSuI8MxBzbuuNyJxh4DJ/Te0SAuJcriSt6n5gSHtqHnJ66RSbXyWvGk4U75W9cz1K9q651SxxoMoT8empJUU3IsghsBuVraTF/ytpahuc0PqTZp9Re+1hjl363aTFSwJZi+b8aB7K0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=B+OaYKzY; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="B+OaYKzY" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 79B5BC4CECD; Thu, 14 Nov 2024 08:44:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1731573874; bh=1ovS1bjpZU/Le8HjGKou5cNnZ+kNg/ks1kxh0OmdiCQ=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=B+OaYKzYiTQ6uglk//x3iQAkbheec6Y3WdJ2Ifzb/jmzMupLSUvWvQSjkqJo1Ov4Y X7+6Dlp0QL9sOGXzw0sQSVWTvqBUtuUsspYE8Da0IrzIifhefeEgGDMJlLDeidj348 kIq91Cx8nJFofyR+HNWs8uJxdy6gQVB8aM5cVVDJHjuabtjLDAm00EBpmta178/Ee7 ktuaBpvuwBg+LgpN01eHqDy4vwX9hhzEAKJvCqclIdc07AIesknLmqABYAzr6Tzd/R 1l3ZSxJ57qWVl9bdnUtRQ/IQCcYU9IQOxswJ0xKGR+ceEnOdnWKAmtwKdBS5jkHIoa 6Qxk6u4hhLH4A== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, mic@digikod.net, gnoack@google.com, Song Liu Subject: [RFC/PATCH v2 bpf-next fanotify 4/7] bpf: fs: Add three kfuncs Date: Thu, 14 Nov 2024 00:43:42 -0800 Message-ID: <20241114084345.1564165-5-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241114084345.1564165-1-song@kernel.org> References: <20241114084345.1564165-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Add the following kfuncs: - bpf_iput - bpf_dput - bpf_is_subdir These kfuncs can be used by the bpf fanotify fastpath. Both bpf_iput and bpf_dput are marked as KF_SLEEPABLE | KF_RELEASE. They release references on an inode and a dentry, respectively. bpf_is_subdir is marked as KF_RCU, so it accepts RCU-protected pointers, for example a kptr saved in a bpf map.
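For illustration only, here is a userspace sketch of the containment check that bpf_is_subdir() exposes: walk parent links from the candidate until the ancestor or the root is reached. struct node and node_is_subdir() are hypothetical stand-ins and not kernel API. The real is_subdir() also has to cope with RCU and concurrent renames, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dentry: only the parent link matters. */
struct node {
	const char *name;
	struct node *parent;	/* the root points to itself, like a root dentry */
};

/*
 * Return true if 'child' is 'ancestor' itself or lies below it at any depth.
 * This models only the parent walk; it takes no locks and no references.
 */
static bool node_is_subdir(const struct node *child, const struct node *ancestor)
{
	const struct node *cur = child;

	assert(child != NULL && ancestor != NULL);

	for (;;) {
		if (cur == ancestor)
			return true;
		if (cur->parent == cur)	/* reached the root without a match */
			return false;
		cur = cur->parent;
	}
}
```

A fastpath handler would use the real kfunc the same way the sample module uses is_subdir(): events whose dentry falls under the monitored subtree go to user space, everything else is skipped.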
Signed-off-by: Song Liu --- fs/bpf_fs_kfuncs.c | 41 +++++++++++++++++++++++++++++++++++++++++ kernel/bpf/verifier.c | 1 + 2 files changed, 42 insertions(+) diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c index 3fe9f59ef867..03ad3a2faec8 100644 --- a/fs/bpf_fs_kfuncs.c +++ b/fs/bpf_fs_kfuncs.c @@ -152,6 +152,44 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str, return bpf_get_dentry_xattr(dentry, name__str, value_p); } +/** + * bpf_iput - Drop a reference on the inode + * + * @inode: inode to drop reference. + * + * Drop a refcount on inode. + */ +__bpf_kfunc void bpf_iput(struct inode *inode) +{ + iput(inode); +} + +/** + * bpf_dput - Drop a reference on the dentry + * + * @dentry: dentry to drop reference. + * + * Drop a refcount on dentry. + */ +__bpf_kfunc void bpf_dput(struct dentry *dentry) +{ + dput(dentry); +} + +/** + * bpf_is_subdir - is new dentry a subdirectory of old_dentry + * @new_dentry: new dentry + * @old_dentry: old dentry + * + * Returns true if new_dentry is a subdirectory of the parent (at any depth). + * Returns false otherwise. 
+ * Caller must ensure that "new_dentry" is pinned before calling is_subdir() + */ +__bpf_kfunc bool bpf_is_subdir(struct dentry *new_dentry, struct dentry *old_dentry) +{ + return is_subdir(new_dentry, old_dentry); +} + __bpf_kfunc_end_defs(); BTF_KFUNCS_START(bpf_fs_kfunc_set_ids) @@ -161,6 +199,9 @@ BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE) BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_iput, KF_SLEEPABLE | KF_RELEASE) +BTF_ID_FLAGS(func, bpf_dput, KF_SLEEPABLE | KF_RELEASE) +BTF_ID_FLAGS(func, bpf_is_subdir, KF_RCU) BTF_KFUNCS_END(bpf_fs_kfunc_set_ids) static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9a7ed527e47e..65abb2d74ee5 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5432,6 +5432,7 @@ BTF_ID(struct, bpf_cpumask) #endif BTF_ID(struct, task_struct) BTF_ID(struct, bpf_crypto_ctx) +BTF_ID(struct, dentry) BTF_SET_END(rcu_protected_types) static bool rcu_protected_object(const struct btf *btf, u32 btf_id) From patchwork Thu Nov 14 08:43:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13874734 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9B1E81F7570; Thu, 14 Nov 2024 08:44:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573882; cv=none; 
b=m9kEul/kFrvozOfUDNGgQIzeWBmAWYpkXDWTAWvGSxMtyqniEqxUl57ZrrkJC1N+xUjfsPi+6O6DUKHP2UrfJUL3KNBucSLnc30G5y1g7tQ9JanoXYOKlNpy3abUFgUh0LITUGY+yutHi3chptMYcziqW7Iznsh+NEAkhzS5iko= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573882; c=relaxed/simple; bh=vqv7nsoFBXs3JlMYaca6LaUZbsVJm6kN4tNuO7C36Xg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pTMOaD4uefmCSI57zT5Err2I/O8dea+twEpfAGSFbx91KOnnfOCRKNn/2ioJ//y9EIwph18BMR/8lYxcdJWU4y2YBLTw3oNFon0a0aOehuWLPGEztm8zaCsd1EWslR+Yj4bWcFMBD8kwxcNJ6ygxkcgb6llwwmv0s8qx10AYqlU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=OMiAW2jZ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OMiAW2jZ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8E191C4CED6; Thu, 14 Nov 2024 08:44:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1731573882; bh=vqv7nsoFBXs3JlMYaca6LaUZbsVJm6kN4tNuO7C36Xg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OMiAW2jZ6sNdS7VmecsJKbiYAglOcU6SWUX5b+8sGA09GyeHdwreENhpB8dpBe5QC zbhoOH8VcqVbNqqU1PLyMWFMtnDCqiDtG4yTCuBJSshEpg52aunoBZfoTvY4No7C+O ifLK/nKc+GH4TPg/LWoIEsZdfTLm3ereJ6hlY9uS4UCWOPYIeWVHvdDamEnVLfnxi3 h1j6Hg/0aWEAktkB8ELaT2SiOXs4X9WCEDxO7Ixwg72p4fvVvlY64gTl/+Ex3w5v3U fj2nodsaB9FHgg+qNgGmDxs6vBipu1ZNzR1t6drCLbrnxEb0qBSJrFgbZ9270ToE0V /o8Z6455xWctQ== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, 
josef@toxicpanda.com, mic@digikod.net, gnoack@google.com, Song Liu Subject: [RFC/PATCH v2 bpf-next fanotify 5/7] bpf: Allow bpf map hold reference on dentry Date: Thu, 14 Nov 2024 00:43:43 -0800 Message-ID: <20241114084345.1564165-6-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241114084345.1564165-1-song@kernel.org> References: <20241114084345.1564165-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To save a dentry in a bpf map, a destructor is needed to drop the dentry reference on map termination. Signed-off-by: Song Liu --- kernel/bpf/helpers.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 1a43d06eab28..5e3bf2c188a8 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -3004,6 +3004,12 @@ __bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void __user return ret + 1; } +__bpf_kfunc void bpf_dentry_release_dtor(struct dentry *dentry) +{ + dput(dentry); +} +CFI_NOSEAL(bpf_dentry_release_dtor); + __bpf_kfunc_end_defs(); BTF_KFUNCS_START(generic_btf_ids) @@ -3046,6 +3052,8 @@ static const struct btf_kfunc_id_set generic_kfunc_set = { BTF_ID_LIST(generic_dtor_ids) BTF_ID(struct, task_struct) BTF_ID(func, bpf_task_release_dtor) +BTF_ID(struct, dentry) +BTF_ID(func, bpf_dentry_release_dtor) #ifdef CONFIG_CGROUPS BTF_ID(struct, cgroup) BTF_ID(func, bpf_cgroup_release_dtor) @@ -3105,11 +3113,15 @@ static int __init kfunc_init(void) .btf_id = generic_dtor_ids[0], .kfunc_btf_id = generic_dtor_ids[1] }, -#ifdef CONFIG_CGROUPS { .btf_id = generic_dtor_ids[2], .kfunc_btf_id = generic_dtor_ids[3] }, +#ifdef CONFIG_CGROUPS + { + .btf_id = generic_dtor_ids[4], + .kfunc_btf_id = generic_dtor_ids[5] + }, #endif }; From patchwork Thu Nov 14 08:43:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13874735 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 68D5B1F757F; Thu, 14 Nov 2024 08:44:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573890; cv=none; b=liGVbZyZV8FXLn1V3E5qza1+Z7Qov1ULcdjbxOE5S95WJweQyms1QpxD2ApZgbMLj7AAZqyFXv8AE0DzQoIFS9Afm95zelcHad99g3zfJJ/+c1EWUVEJLKYEouzEkZGpMg2WrEQzyNMRs02jHdT01+hYlMiEzhouQIaiQ39aBWA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573890; c=relaxed/simple; bh=M4IAn4WGrIBdCXICRh4T7eOrsB4F7fqrCuH6SgxNCdo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uPddGnAbhCc9vfatJzZBknCH8T/5TbGYsjFwW8RIWM8RfIXMNOQqmSYOLK6yN8p4zVvDYap+D3ArShu8niU6B739TaoWXWf9pCoyiYfIuiG48ndgIQ+GjYTFptBh8Wl5bcVHWmY5gXRh/ZqGklihXxGI45OGpmA1XP/kdysjpfA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=l+VV6NpJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="l+VV6NpJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A498AC4CECD; Thu, 14 Nov 2024 08:44:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1731573890; bh=M4IAn4WGrIBdCXICRh4T7eOrsB4F7fqrCuH6SgxNCdo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=l+VV6NpJ+BTuYHnCogP9sQWd9zYhmFh2cCuj+nqxFuqMT2YSRz8fk7kGU9kTRPsWI i3KpUWdgRKewQekyWMu3PN0KAVnk8ETvoacXtJ8bi9FXUoHMI8NClO7ZXsCvtJ1nsB rXtHcmJz0eTErRdpFAGGJ6bUaNDLQZu2gHDvVLTuGCY34TF1NUO/wr/y19/Rtbu61i 
3uILEgfAnEwm5bcUZd/DcnXGSnnqsn4HQCuDVMhYpMAjDoF5+z9lmlNeoSTGHb/Que 6Pin2DnOjjG72kkp4lppQ8ftWQKutcn3+7nET2EyFUc6mpNL+CuaBJtGrL3Wv4DTYK mHwrZDkDsx/BA== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, mic@digikod.net, gnoack@google.com, Song Liu Subject: [RFC/PATCH v2 bpf-next fanotify 6/7] fanotify: Enable bpf based fanotify fastpath handler Date: Thu, 14 Nov 2024 00:43:44 -0800 Message-ID: <20241114084345.1564165-7-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241114084345.1564165-1-song@kernel.org> References: <20241114084345.1564165-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0

Allow users to write fanotify fastpath handlers with bpf programs.

Major changes:
1. Make kfuncs in fs/bpf_fs_kfuncs.c available to STRUCT_OPS programs.
2. Add kfuncs bpf_fanotify_data_inode and bpf_fanotify_data_dentry.
3. Add struct_ops bpf_fanotify_fastpath_ops.

TODO:
1. With the current logic, the bpf based fastpath handler is added to the global list, and is thus available to all users. This is similar to bpf based tcp congestion algorithms. It is possible to add an API so that the bpf based handler is not added to the global list, which is similar to hid-bpf. I plan to add that API later.

Signed-off-by: Song Liu --- fs/Makefile | 2 +- fs/bpf_fs_kfuncs.c | 10 +- fs/notify/fanotify/fanotify_fastpath.c | 172 ++++++++++++++++++++++++- kernel/bpf/verifier.c | 5 + 4 files changed, 183 insertions(+), 6 deletions(-) diff --git a/fs/Makefile b/fs/Makefile index 61679fd587b7..1043d999262d 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -129,4 +129,4 @@ obj-$(CONFIG_EFIVAR_FS) += efivarfs/ obj-$(CONFIG_EROFS_FS) += erofs/ obj-$(CONFIG_VBOXSF_FS) += vboxsf/ obj-$(CONFIG_ZONEFS_FS) += zonefs/ -obj-$(CONFIG_BPF_LSM) += bpf_fs_kfuncs.o +obj-$(CONFIG_BPF_SYSCALL) += bpf_fs_kfuncs.o diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c index 03ad3a2faec8..2f7b91f10175 100644 --- a/fs/bpf_fs_kfuncs.c +++ b/fs/bpf_fs_kfuncs.c @@ -207,7 +207,8 @@ BTF_KFUNCS_END(bpf_fs_kfunc_set_ids) static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id) { if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) || - prog->type == BPF_PROG_TYPE_LSM) + prog->type == BPF_PROG_TYPE_LSM || + prog->type == BPF_PROG_TYPE_STRUCT_OPS) return 0; return -EACCES; } @@ -220,7 +221,12 @@ static const struct btf_kfunc_id_set bpf_fs_kfunc_set = { static int __init bpf_fs_kfuncs_init(void) { - return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set); + int ret; + + ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set); + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_fs_kfunc_set); + + return ret; } late_initcall(bpf_fs_kfuncs_init); diff --git a/fs/notify/fanotify/fanotify_fastpath.c b/fs/notify/fanotify/fanotify_fastpath.c index f2aefcf0ca6a..ec7a1143d687 100644 --- a/fs/notify/fanotify/fanotify_fastpath.c +++ b/fs/notify/fanotify/fanotify_fastpath.c @@ -2,6 +2,7 @@ #include #include #include +#include #include "fanotify.h" @@ -197,7 +198,7 @@ int fanotify_fastpath_add(struct fsnotify_group *group, spin_lock(&fp_list_lock); fp_ops = fanotify_fastpath_find(args.name); - if (!fp_ops || !try_module_get(fp_ops->owner)) { + if 
(!fp_ops || !bpf_try_module_get(fp_ops, fp_ops->owner)) { spin_unlock(&fp_list_lock); ret = -ENOENT; goto err_free_hook; @@ -238,7 +239,7 @@ int fanotify_fastpath_add(struct fsnotify_group *group, err_free_args: kfree(init_args); err_module_put: - module_put(fp_ops->owner); + bpf_module_put(fp_ops, fp_ops->owner); err_free_hook: kfree(fp_hook); goto out; @@ -249,7 +250,7 @@ void fanotify_fastpath_hook_free(struct fanotify_fastpath_hook *fp_hook) if (fp_hook->ops->fp_free) fp_hook->ops->fp_free(fp_hook); - module_put(fp_hook->ops->owner); + bpf_module_put(fp_hook->ops, fp_hook->ops->owner); kfree(fp_hook); } @@ -280,3 +281,168 @@ static int __init fanotify_fastpath_init(void) return 0; } device_initcall(fanotify_fastpath_init); + +__bpf_kfunc_start_defs(); + +/** + * bpf_fanotify_data_inode - get inode from fanotify_fastpath_event + * + * @event: fanotify_fastpath_event to get inode from + * + * Get referenced inode from fanotify_fastpath_event. + * + * Return: A refcounted inode or NULL. + * + */ +__bpf_kfunc struct inode *bpf_fanotify_data_inode(struct fanotify_fastpath_event *event) +{ + struct inode *inode = fsnotify_data_inode(event->data, event->data_type); + + return inode ? igrab(inode) : NULL; +} + +/** + * bpf_fanotify_data_dentry - get dentry from fanotify_fastpath_event + * + * @event: fanotify_fastpath_event to get dentry from + * + * Get referenced dentry from fanotify_fastpath_event. + * + * Return: A refcounted dentry or NULL. 
+ * + */ +__bpf_kfunc struct dentry *bpf_fanotify_data_dentry(struct fanotify_fastpath_event *event) +{ + struct dentry *dentry = fsnotify_data_dentry(event->data, event->data_type); + + return dget(dentry); +} + +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(bpf_fanotify_kfunc_set_ids) +BTF_ID_FLAGS(func, bpf_fanotify_data_inode, + KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_fanotify_data_dentry, + KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL) +BTF_KFUNCS_END(bpf_fanotify_kfunc_set_ids) + +static const struct btf_kfunc_id_set bpf_fanotify_kfunc_set = { + .owner = THIS_MODULE, + .set = &bpf_fanotify_kfunc_set_ids, +}; + +static const struct bpf_func_proto * +bpf_fanotify_fastpath_get_func_proto(enum bpf_func_id func_id, + const struct bpf_prog *prog) +{ + return tracing_prog_func_proto(func_id, prog); +} + +static bool bpf_fanotify_fastpath_is_valid_access(int off, int size, + enum bpf_access_type type, + const struct bpf_prog *prog, + struct bpf_insn_access_aux *info) +{ + if (!bpf_tracing_btf_ctx_access(off, size, type, prog, info)) + return false; + + return true; +} + +static int bpf_fanotify_fastpath_btf_struct_access(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size) +{ + return 0; +} + +static const struct bpf_verifier_ops bpf_fanotify_fastpath_verifier_ops = { + .get_func_proto = bpf_fanotify_fastpath_get_func_proto, + .is_valid_access = bpf_fanotify_fastpath_is_valid_access, + .btf_struct_access = bpf_fanotify_fastpath_btf_struct_access, +}; + +static int bpf_fanotify_fastpath_reg(void *kdata, struct bpf_link *link) +{ + return fanotify_fastpath_register(kdata); +} + +static void bpf_fanotify_fastpath_unreg(void *kdata, struct bpf_link *link) +{ + fanotify_fastpath_unregister(kdata); +} + +static int bpf_fanotify_fastpath_init(struct btf *btf) +{ + return 0; +} + +static int bpf_fanotify_fastpath_init_member(const struct btf_type *t, + const struct btf_member *member, + void *kdata, const void *udata) 
+{ + const struct fanotify_fastpath_ops *uops; + struct fanotify_fastpath_ops *ops; + u32 moff; + int ret; + + uops = (const struct fanotify_fastpath_ops *)udata; + ops = (struct fanotify_fastpath_ops *)kdata; + + moff = __btf_member_bit_offset(t, member) / 8; + switch (moff) { + case offsetof(struct fanotify_fastpath_ops, name): + ret = bpf_obj_name_cpy(ops->name, uops->name, + sizeof(ops->name)); + if (ret <= 0) + return -EINVAL; + return 1; + } + + return 0; +} + +static int __bpf_fan_fp_handler(struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event) +{ + return 0; +} + +static int __bpf_fan_fp_init(struct fanotify_fastpath_hook *hook, void *args) +{ + return 0; +} + +static void __bpf_fan_fp_free(struct fanotify_fastpath_hook *hook) +{ +} + +/* For bpf_struct_ops->cfi_stubs */ +static struct fanotify_fastpath_ops __bpf_fanotify_fastpath_ops = { + .fp_handler = __bpf_fan_fp_handler, + .fp_init = __bpf_fan_fp_init, + .fp_free = __bpf_fan_fp_free, +}; + +static struct bpf_struct_ops bpf_fanotify_fastpath_ops = { + .verifier_ops = &bpf_fanotify_fastpath_verifier_ops, + .reg = bpf_fanotify_fastpath_reg, + .unreg = bpf_fanotify_fastpath_unreg, + .init = bpf_fanotify_fastpath_init, + .init_member = bpf_fanotify_fastpath_init_member, + .name = "fanotify_fastpath_ops", + .cfi_stubs = &__bpf_fanotify_fastpath_ops, + .owner = THIS_MODULE, +}; + +static int __init bpf_fanotify_fastpath_struct_ops_init(void) +{ + int ret; + + ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_fanotify_kfunc_set); + ret = ret ?: register_bpf_struct_ops(&bpf_fanotify_fastpath_ops, fanotify_fastpath_ops); + return ret; +} +late_initcall(bpf_fanotify_fastpath_struct_ops_init); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 65abb2d74ee5..cf7af86118fe 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -6529,6 +6529,10 @@ BTF_TYPE_SAFE_TRUSTED(struct dentry) { struct inode *d_inode; }; 
+BTF_TYPE_SAFE_TRUSTED(struct fanotify_fastpath_event) { + struct inode *dir; +}; + BTF_TYPE_SAFE_TRUSTED_OR_NULL(struct socket) { struct sock *sk; }; @@ -6564,6 +6568,7 @@ static bool type_is_trusted(struct bpf_verifier_env *env, BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct linux_binprm)); BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct file)); BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct dentry)); + BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct fanotify_fastpath_event)); return btf_nested_type_is_trusted(&env->log, reg, field_name, btf_id, "__safe_trusted"); } From patchwork Thu Nov 14 08:43:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 13874736 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA3F71F76AC; Thu, 14 Nov 2024 08:44:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573898; cv=none; b=VYu5q0QVbRObGVkoY4Z+IZ4k7xYBFEla+LS6IDH1/FtY76nWp3JbhvzAaJIYFbM2oT5GPExDqBNl1CdO/5fkxpLyqulbF87xKaJdSqTlHH3a5j0B7XVQUAYz08Qj2VU3JOhqV4OFWR9QL9T0zzA6sZVhyvtcgVOMNEc9J0uTIsE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731573898; c=relaxed/simple; bh=28xo/GcqXu7Noex43IdCT3nlsYOlGRCmbcULqotFvGU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tRNt5e4MLFRmXzBTqa4AefNEX1ju6U3G22UnDOvntle+32COSBrabD5YnZSsbWf2WOzopBDcDrbk45P+xYoaWc7ae/VMSkLwPtgh6AM6fgzoe5B/rRc+GoLe74Ou41a6rTnR0TSgmsT0CrAJzbFqJAh9kMyC0zoiqhe7XzqYk8U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=GI3AJpw4; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="GI3AJpw4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C055DC4CECD; Thu, 14 Nov 2024 08:44:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1731573898; bh=28xo/GcqXu7Noex43IdCT3nlsYOlGRCmbcULqotFvGU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GI3AJpw4RFNXyKxSygIHCED2Kcme73S6J79kzh92/qStihHmAyX4Imu63z8x4UFeC mE817jtgRr4FjyaOyNHZnna9X69S1QYMsO+zDmaz7myxEQGU+pJwvf1CIRBWbinL8e yUgqeFZqYtCGphTg8k2s2UM48aW+bU0RCTPUEvo1RgONoVKWRRSok9116miCA5gbVK cBvGj8cPD8+zvcq5dfA9tojignWFQNzXnrzyOp2GdjKv05ktI/201cnV7QZZ2Vj9ZB X8WKJTpD/Fp9j+powCd7jhXziX0blEZUZutNng6/zbSpC4WN11HOd2WuvZXlLy6AMk aGVE8AufzG8yg== From: Song Liu To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org Cc: kernel-team@meta.com, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com, repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com, mic@digikod.net, gnoack@google.com, Song Liu Subject: [RFC/PATCH v2 bpf-next fanotify 7/7] selftests/bpf: Add test for BPF based fanotify fastpath handler Date: Thu, 14 Nov 2024 00:43:45 -0800 Message-ID: <20241114084345.1564165-8-song@kernel.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241114084345.1564165-1-song@kernel.org> References: <20241114084345.1564165-1-song@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This test shows a simplified logic that monitors a subtree. 
This is simplified as it doesn't handle all the scenarios, such as:
1) moving a subsubtree into/out of the monitored subtree;
2) a mount point inside the monitored subtree.

Therefore, this is not meant to show a way to reliably monitor a subtree. Instead, it tests the functionality of the bpf based fastpath. To really monitor a subtree reliably, we will need more complex logic.

Overview of the logic:
1. fanotify is created for the whole file system (/tmp).
2. The dentry of the subtree root is saved in map subdir_root.
3. bpf_is_subdir() is used to check whether a fanotify event happens inside the subtree. Only events that happen in the subtree are passed to userspace.
4. A bpf map (inode_storage_map) is used to cache the result from bpf_is_subdir().
5. Moving a subsubtree is not handled. This is because we don't yet have a good way to walk a subtree from BPF (something similar to d_walk).

Signed-off-by: Song Liu --- tools/testing/selftests/bpf/bpf_kfuncs.h | 5 + tools/testing/selftests/bpf/config | 2 + .../testing/selftests/bpf/prog_tests/fan_fp.c | 264 ++++++++++++++++++ tools/testing/selftests/bpf/progs/fan_fp.c | 154 ++++++++++ 4 files changed, 425 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/fan_fp.c create mode 100644 tools/testing/selftests/bpf/progs/fan_fp.c diff --git a/tools/testing/selftests/bpf/bpf_kfuncs.h b/tools/testing/selftests/bpf/bpf_kfuncs.h index 2eb3483f2fb0..6ccfef9685e1 100644 --- a/tools/testing/selftests/bpf/bpf_kfuncs.h +++ b/tools/testing/selftests/bpf/bpf_kfuncs.h @@ -87,4 +87,9 @@ struct dentry; */ extern int bpf_get_dentry_xattr(struct dentry *dentry, const char *name, struct bpf_dynptr *value_ptr) __ksym __weak; + +struct fanotify_fastpath_event; +extern struct inode *bpf_fanotify_data_inode(struct fanotify_fastpath_event *event) __ksym __weak; +extern void bpf_iput(struct inode *inode) __ksym __weak; +extern bool bpf_is_subdir(struct dentry *new_dentry, struct dentry *old_dentry) __ksym __weak; #endif diff
--git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index 4ca84c8d9116..505327f53f07 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -24,6 +24,8 @@ CONFIG_DEBUG_INFO_BTF=y CONFIG_DEBUG_INFO_DWARF4=y CONFIG_DUMMY=y CONFIG_DYNAMIC_FTRACE=y +CONFIG_FANOTIFY=y +CONFIG_FANOTIFY_FASTPATH=y CONFIG_FPROBE=y CONFIG_FTRACE_SYSCALLS=y CONFIG_FUNCTION_ERROR_INJECTION=y diff --git a/tools/testing/selftests/bpf/prog_tests/fan_fp.c b/tools/testing/selftests/bpf/prog_tests/fan_fp.c new file mode 100644 index 000000000000..92929b811282 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/fan_fp.c @@ -0,0 +1,264 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "fan_fp.skel.h" + +#define TEST_FS "/tmp/" +#define TEST_DIR "/tmp/fanotify_test/" + +static int create_test_subtree(void) +{ + int err; + + err = mkdir(TEST_DIR, 0777); + if (err && errno != EEXIST) + return err; + + return open(TEST_DIR, O_RDONLY); +} + +static int create_fanotify_fd(void) +{ + int fanotify_fd, err; + + fanotify_fd = fanotify_init(FAN_CLASS_NOTIF | FAN_REPORT_NAME | FAN_REPORT_DIR_FID, + O_RDONLY); + + if (!ASSERT_OK_FD(fanotify_fd, "fanotify_init")) + return -1; + + err = fanotify_mark(fanotify_fd, FAN_MARK_ADD | FAN_MARK_FILESYSTEM, + FAN_CREATE | FAN_OPEN | FAN_ONDIR | FAN_EVENT_ON_CHILD, + AT_FDCWD, TEST_FS); + if (!ASSERT_OK(err, "fanotify_mark")) { + close(fanotify_fd); + return -1; + } + + return fanotify_fd; +} + +static int attach_global_fastpath(int fanotify_fd) +{ + struct fanotify_fastpath_args args = { + .name = "_tmp_test_sub_tree", + .version = 1, + .flags = 0, + }; + + if (ioctl(fanotify_fd, FAN_IOC_ADD_FP, &args)) + return -1; + + return 0; +} + +#define EVENT_BUFFER_SIZE 4096 +struct file_access_result { + char name_prefix[16]; + bool 
accessed; +} access_results[3] = { + {"aa", false}, + {"bb", false}, + {"cc", false}, +}; + +static void update_access_results(char *name) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(access_results); i++) { + if (strcmp(name, access_results[i].name_prefix) == 0) + access_results[i].accessed = true; + } +} + +static void parse_event(char *buffer, int len) +{ + struct fanotify_event_metadata *event = + (struct fanotify_event_metadata *) buffer; + struct fanotify_event_info_header *info; + struct fanotify_event_info_fid *fid; + struct file_handle *handle; + char *name; + int off; + + for (; FAN_EVENT_OK(event, len); event = FAN_EVENT_NEXT(event, len)) { + for (off = sizeof(*event) ; off < event->event_len; + off += info->len) { + info = (struct fanotify_event_info_header *) + ((char *) event + off); + switch (info->info_type) { + case FAN_EVENT_INFO_TYPE_DFID_NAME: + fid = (struct fanotify_event_info_fid *) info; + handle = (struct file_handle *)&fid->handle; + name = (char *)handle + sizeof(*handle) + handle->handle_bytes; + update_access_results(name); + break; + default: + break; + } + } + } +} + +static void touch_file(const char *path) +{ + int fd; + + fd = open(path, O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666); + if (!ASSERT_OK_FD(fd, "open")) + goto cleanup; + close(fd); +cleanup: + unlink(path); +} + +static void generate_and_test_event(int fanotify_fd, struct fan_fp *skel) +{ + char buffer[EVENT_BUFFER_SIZE]; + int len, err, fd; + + /* Open the dir, so initialize_subdir_root can work */ + fd = open(TEST_DIR, O_RDONLY); + close(fd); + + if (!ASSERT_EQ(skel->bss->initialized, true, "initialized")) + goto cleanup; + + /* access /tmp/fanotify_test/aa, this will generate event */ + touch_file(TEST_DIR "aa"); + + /* create /tmp/fanotify_test/subdir, this will get tag from the + * parent directory (added in the bpf program on fsnotify_mkdir) + */ + err = mkdir(TEST_DIR "subdir", 0777); + ASSERT_OK(err, "mkdir"); + + /* access /tmp/fanotify_test/subdir/bb, this will 
generate event */ + touch_file(TEST_DIR "subdir/bb"); + + /* access /tmp/cc, this will NOT generate event, as the BPF + * fastpath filtered this event out. (Because /tmp doesn't have + * the tag.) + */ + touch_file(TEST_FS "cc"); + + /* read and parse the events */ + len = read(fanotify_fd, buffer, EVENT_BUFFER_SIZE); + if (!ASSERT_GE(len, 0, "read event")) + goto cleanup; + parse_event(buffer, len); + + /* verify we generated events for aa and bb, but filtered out the + * event for cc. + */ + ASSERT_TRUE(access_results[0].accessed, "access aa"); + ASSERT_TRUE(access_results[1].accessed, "access bb"); + ASSERT_FALSE(access_results[2].accessed, "access cc"); + + /* Each touch_file() generates two events: FAN_CREATE then + * FAN_OPEN. The second event will hit the cache. + * open(TEST_DIR) also hits the cache, as we updated the cache for + * TEST_DIR from userspace. + * Therefore, we expect 4 cache hits: aa, bb, cc, and TEST_DIR. + */ + ASSERT_EQ(skel->bss->cache_hit, 4, "cache_hit"); + +cleanup: + rmdir(TEST_DIR "subdir"); + rmdir(TEST_DIR); +} + +/* This test shows a simplified logic that monitors a subtree. This is + * simplified as it doesn't handle all the scenarios, such as: + * + * 1) moving a subsubtree into/out of the monitored subtree; + * 2) a mount point inside the monitored subtree + * + * Therefore, this is not to show a way to reliably monitor a subtree. + * Instead, this is to test the functionalities of bpf based fastpath. + * + * Overview of the logic: + * 1. fanotify is created for the whole file system (/tmp); + * 2. A bpf map (inode_storage_map) is used to tag directories to + * monitor (starting from /tmp/fanotify_test); + * 3. On fsnotify_mkdir, the tag is propagated to newly created sub + * directories (/tmp/fanotify_test/subdir); + * 4. The bpf fastpath checks whether the event happens in a directory + * with the tag. If yes, the event is sent to user space; otherwise, + * the event is dropped. 
+ */ +static void test_monitor_subtree(void) +{ + struct bpf_link *link; + struct fan_fp *skel; + int test_root_fd; + int zero = 0; + int err, fanotify_fd; + struct stat st; + + test_root_fd = create_test_subtree(); + + if (!ASSERT_OK_FD(test_root_fd, "create_test_subtree")) + return; + + err = fstat(test_root_fd, &st); + if (!ASSERT_OK(err, "fstat test_root_fd")) + goto close_test_root_fd; + + skel = fan_fp__open_and_load(); + + if (!ASSERT_OK_PTR(skel, "fan_fp__open_and_load")) + goto close_test_root_fd; + + skel->bss->root_ino = st.st_ino; + + /* Add tag to /tmp/fanotify_test/ */ + err = bpf_map_update_elem(bpf_map__fd(skel->maps.inode_storage_map), + &test_root_fd, &zero, BPF_ANY); + if (!ASSERT_OK(err, "bpf_map_update_elem")) + goto destroy_skel; + link = bpf_map__attach_struct_ops(skel->maps.bpf_fanotify_fastpath_ops); + if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) + goto destroy_skel; + + fanotify_fd = create_fanotify_fd(); + if (!ASSERT_OK_FD(fanotify_fd, "create_fanotify_fd")) + goto destroy_link; + + err = attach_global_fastpath(fanotify_fd); + if (!ASSERT_OK(err, "attach_global_fastpath")) + goto close_fanotify_fd; + + generate_and_test_event(fanotify_fd, skel); + +close_fanotify_fd: + close(fanotify_fd); + +destroy_link: + bpf_link__destroy(link); +destroy_skel: + fan_fp__destroy(skel); + +close_test_root_fd: + close(test_root_fd); + rmdir(TEST_DIR); +} + +void test_bpf_fanotify_fastpath(void) +{ + if (test__start_subtest("subtree")) + test_monitor_subtree(); +} diff --git a/tools/testing/selftests/bpf/progs/fan_fp.c b/tools/testing/selftests/bpf/progs/fan_fp.c new file mode 100644 index 000000000000..97e7d0b9e644 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/fan_fp.c @@ -0,0 +1,154 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. 
*/ + +#include "vmlinux.h" +#include +#include +#include "bpf_kfuncs.h" + +struct __dentry_kptr_value { + struct dentry __kptr * dentry; +}; + +/* subdir_root map holds a single dentry pointer to the subtree root. + * This pointer is used to call bpf_is_subdir(). + */ +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, int); + __type(value, struct __dentry_kptr_value); + __uint(max_entries, 1); +} subdir_root SEC(".maps"); + +/* inode_storage_map serves as cache for bpf_is_subdir(). inode local + * storage has O(1) access time. So this is preferred over calling + * bpf_is_subdir(). + */ +struct { + __uint(type, BPF_MAP_TYPE_INODE_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, int); +} inode_storage_map SEC(".maps"); + +unsigned long root_ino; +bool initialized; + +/* This function initializes map subdir_root. The logic is a bit ugly. + * First, user space sets root_ino. Then a fanotify event is triggered. + * If the event dentry matches root_ino, we take a reference on the + * dentry and save it in subdir_root map. The reference will be freed on + * the termination of subdir_root map. 
+ */ +static void initialize_subdir_root(struct fanotify_fastpath_event *fp_event) +{ + struct __dentry_kptr_value *v; + struct dentry *dentry, *old; + int zero = 0; + + if (initialized) + return; + + dentry = bpf_fanotify_data_dentry(fp_event); + if (!dentry) + return; + + if (dentry->d_inode->i_ino != root_ino) { + bpf_dput(dentry); + return; + } + + v = bpf_map_lookup_elem(&subdir_root, &zero); + if (!v) { + bpf_dput(dentry); + return; + } + + old = bpf_kptr_xchg(&v->dentry, dentry); + if (old) + bpf_dput(old); + initialized = true; +} + +int cache_hit; + +/* bpf_fp_handler is sleepable, as it calls bpf_dput() */ +SEC("struct_ops.s") +int BPF_PROG(bpf_fp_handler, + struct fsnotify_group *group, + struct fanotify_fastpath_hook *fp_hook, + struct fanotify_fastpath_event *fp_event) +{ + struct __dentry_kptr_value *v; + struct dentry *dentry; + int zero = 0; + int *value; + int ret; + + initialize_subdir_root(fp_event); + + /* Before the subdir_root map is initialized, send all events to + * user space. + */ + if (!initialized) + return FAN_FP_RET_SEND_TO_USERSPACE; + + dentry = bpf_fanotify_data_dentry(fp_event); + if (!dentry) + return FAN_FP_RET_SEND_TO_USERSPACE; + + /* If inode_storage_map has cached value, just return it */ + value = bpf_inode_storage_get(&inode_storage_map, dentry->d_inode, 0, 0); + if (value) { + bpf_dput(dentry); + cache_hit++; + return *value; + } + + /* Hold rcu read lock for bpf_is_subdir */ + bpf_rcu_read_lock(); + v = bpf_map_lookup_elem(&subdir_root, &zero); + if (!v || !v->dentry) { + /* This shouldn't happen, but we need this to pass + * the verifier. 
+ */ + ret = FAN_FP_RET_SEND_TO_USERSPACE; + goto out; + } + + if (bpf_is_subdir(dentry, v->dentry)) + ret = FAN_FP_RET_SEND_TO_USERSPACE; + else + ret = FAN_FP_RET_SKIP_EVENT; +out: + bpf_rcu_read_unlock(); + + /* Save current result to the inode_storage_map */ + value = bpf_inode_storage_get(&inode_storage_map, dentry->d_inode, 0, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (value) + *value = ret; + bpf_dput(dentry); + return ret; +} + +SEC("struct_ops") +int BPF_PROG(bpf_fp_init, struct fanotify_fastpath_hook *hook, const char *args) +{ + return 0; +} + +SEC("struct_ops") +void BPF_PROG(bpf_fp_free, struct fanotify_fastpath_hook *hook) +{ +} + +SEC(".struct_ops.link") +struct fanotify_fastpath_ops bpf_fanotify_fastpath_ops = { + .fp_handler = (void *)bpf_fp_handler, + .fp_init = (void *)bpf_fp_init, + .fp_free = (void *)bpf_fp_free, + .name = "_tmp_test_sub_tree", +}; + +char _license[] SEC("license") = "GPL";
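
Note on the handler logic above: the decision flow in bpf_fp_handler (pass everything through until the subtree root is known, then consult the per-inode cache, then fall back to the ancestry check and store the result) can be modelled in plain userspace C. The sketch below is only an illustration under stated assumptions. Every name in it (model_dir, model_verdict, model_is_subdir, model_fp_handler, model_cache_hit) is hypothetical and is not part of the patch or of any kernel API, the parent-pointer walk merely stands in for bpf_is_subdir(), and the enum values are not the real FAN_FP_RET_* values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two fastpath return codes. */
enum model_verdict { MODEL_SEND_TO_USERSPACE, MODEL_SKIP_EVENT };

/* Hypothetical directory node: a parent pointer plus a cached verdict,
 * playing the role of the inode local storage entry.
 */
struct model_dir {
	const struct model_dir *parent;	/* NULL for the filesystem root */
	bool has_cached;		/* true once 'cached' is valid */
	enum model_verdict cached;
};

static int model_cache_hit;

/* Stand-in for bpf_is_subdir(): true if dir is root or one of its
 * descendants, found by walking parent pointers.
 */
static bool model_is_subdir(const struct model_dir *dir,
			    const struct model_dir *root)
{
	for (; dir; dir = dir->parent)
		if (dir == root)
			return true;
	return false;
}

/* Model of the handler: root == NULL means "not initialized yet". */
static enum model_verdict model_fp_handler(struct model_dir *dir,
					   const struct model_dir *root)
{
	assert(dir != NULL);

	/* Before the root is known, send all events to user space. */
	if (!root)
		return MODEL_SEND_TO_USERSPACE;

	/* If a verdict is cached for this directory, just return it. */
	if (dir->has_cached) {
		model_cache_hit++;
		return dir->cached;
	}

	/* Otherwise do the ancestry walk and remember the result. */
	dir->cached = model_is_subdir(dir, root) ?
		MODEL_SEND_TO_USERSPACE : MODEL_SKIP_EVENT;
	dir->has_cached = true;
	return dir->cached;
}
```

With a three-level tree (a filesystem root, the monitored directory below it, and a subdirectory below that), the first event on each directory takes the walk and the second one is served from the cached verdict. Note that a cached verdict goes stale if a directory is moved across the subtree boundary, which is the limitation the commit message already calls out.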