Message ID | cover.1741047969.git.m@maowtm.org (mailing list archive) |
---|---|
Headers | show |
Series | Landlock supervise: a mechanism for interactive permission requests | expand |
On Tue, Mar 04, 2025 at 01:12:56AM +0000, Tingmao Wang wrote: > Landlock supervise: a mechanism for interactive permission requests > > Hi, > > I would like to propose an extension to Landlock to support a "supervisor" > mode, which would enable a user program to sandbox applications (or > itself) in a dynamic, fine-grained, and potentially temporary way. > Practically, this makes it easy to give maximal control to the user, > perhaps in the form of a "just in time" permission prompt. Read on, or > check the sandboxer program in the last patch for a "demo". Thanks for this RFC, this is very promising! > > To Jan Kara and other fanotify reviewers, I've included you in this patch > as Mickaël suggested that we could potentially extend and re-use the > fanotify uapi and code instead of creating an entirely new representation > for permission requests and mechanism for passing it (as this patch > currently does). I've not really thought out how that would work (there > will probably have to be some extension of the fanotify-fd uapi since > landlock handles more than FS access), but I think it is a promising idea, > hence I would like to hear your thoughts if you could spare a moment to > look at this. A good outcome could also be that we add the necessary > hooks so that both this and fanotify (but really fsnotify?) can have _perm > events for create/delete/rename etc. > > FS mailing list - I've CC'd this patchset to you too - even though the > patch doesn't currently touch any FS code, this is very FS related, and > also, in order to address an inode lock related problem which I will > mention in patch 6 of this series, future versions of this patch will > likely need to add a few more LSM hooks. Especially for that part, but > also other bits of this project, a pair of eyes from the FS community > would be very helpful. > > To Tycho Andersen -- I'm CC'ing you as you've worked on the seccomp-unotify > feature which is also quite related, so if you could spare some time for a > quick review, or provide some suggestions, that would be very appreciated > :) > > I'm submitting this series as a non-production-ready, proof-of-concept > RFC, and I would appreciate feedback on any aspects of the design or > implementation. Note that due to the PoC nature of this, I have not > handled checkpatch.pl errors etc. I also welcome suggestions for > alternative names for this feature (e.g. landlock-unotify? > landlock-perm?). At this point I'm very keen to hear some initial > feedback from the community before investing further into polishing this > patch. > > (I've briefly pitched the overall idea to Mickaël, but he has not reviewed > the patch yet) > > > Why extend landlock? > -------------------- > > While this feature could be implemented as its own LSM, I feel like it is > a natural extension to landlock -- landlock has already defined a set of > fine-grained access requests with the intention to add more (and not just > for FS alone), is designed to be an unprivileged, stackable, > process-scoped, ad-hoc mechanism with no persistent state, which works > well as a generic API to support a dynamic sandbox, and landlock is > already doing the path traversal work to evaluate hierarchical filesystem > rules, which would also be useful for a performant dynamic sandbox > implementation. I agree, that would be a great Landlock feature. > > > Use cases > --------- > > I have several potential use cases in mind that will benefit from > landlock-supervise, for example: > > 1. A patch to firejail (I have not discussed with the firejail maintainers > on this yet - wanted to see the reception of this kernel patch first) > which can leverage landlock in a highly flexible way, prompting the user > for permission to access "extra" files after the sandbox has started > (without e.g. having to restart a very stateful GUI program). > > This way of using landlock can potentially replace its current approach of > using bind mounts (as it will allow implementing "blacklists"), allowing > unprivileged sandbox creation (although need to check with firejail if > there are other factors preventing this). This also allows editing > profiles "live" in a highly interactive way (i.e. the user can choose > "allow and remember" on a permission request which will also add the newly > allowed path to a local firejail profile, all automatically) > > 2. A "protected" mode for common development environments (e.g. VSCode or > a terminal can be launched "protected") that doesn't compromise on > ease-of-use. File access to $PWD at launch can be allowed, and access to > other places can be allowed ad-hoc by the developer with hopefully one UI > click. Since landlock can also be used to restrict network access, such a > protected mode can also restrict outgoing connections by default (but ask > the user if they allow it for all or certain processes, on the first > attempt to connect). > > Recently there has been incidents of secret-stealing malware targeting > developers (on Linux) by social engineering them to open and build/run a > project. [1] The hope is that landlock-supervise can drive adoption of > sandboxes for developers and others by making them more user-friendly. > > In addition to the above, I also hope that this would help with landlock > adoption even in non-interaction-heavy scenarios, by allowing application > developers the choice to gracefully recover from over-restrictive rulesets > and collect failure metrics, until they are confident that actually > blocking non-allowed accesses would not break their application or degrade > the user experience. Another interesting use case is to trace programs and get an unprivileged "permissive" mode to quickly create sandbox policies. > > I have more exploration to do regarding applying this to applications, but > I do have a working proof of concept already (implemented as an > enhancement to the sandboxer example). Here is a shortened output: > > bash # env LL_FS_RO=/usr:/lib:/bin:/etc:/dev:/proc LL_FS_RW= LL_SUPERVISE=1 ./sandboxer bash -i > bash # echo "Hi, $(whoami)!" > Hi, root! > bash # ls / > ------------- Sandboxer access request ------------- > Process ls[166] (/usr/bin/ls) wants to read > / > (y)es/(a)lways/(n)o > y > ---------------------------------------------------- > bin > boot > dev > ... > usr > var > bash # echo 'evil' >> /etc/profile > (a spurious create request due to current issue with dcache miss is omitted) > ------------- Sandboxer access request ------------- > Process bash[163] (/usr/bin/bash) wants to read/write > /etc/profile > (y)es/(a)lways/(n)o > n > ---------------------------------------------------- > bash: /etc/profile: Permission denied > bash # > > > Alternatives > ------------ > > I have looked for existing ways to implement the proposed use cases (at > least for FS access), and three main approaches stand out to me: > > 1. Fanotify: there is already FAM_OPEN_PERM which waits for an allow/deny > response from a fanotify listener. However, it does not currently have > the equivalent _PERM for file creation, deletion, rename and linking, and > it is also not designed for unprivileged, process-scoped use (unlike > landlock). As discussed, I was thinking about whether or not it would be possible to use the fanotify interface (e.g. fanotify_init(), fanotify FD...), but looking at your code, I think it would mostly increase complexity. There are also the issue with the Landlock semantic (e.g. access rights) which does not map 1:1 to the fanotify one. A last thing is that fanotify is deeply tied to the VFS. So, unless someone has a better idea, let's continue with your approach. > > 2. Seccomp-unotify: this can be used to trap all syscalls and give the > sandbox a chance to allow or deny any one of them. However, a correct, > TOCTOU-proof implementation will likely require handling a large number of > fs-related syscalls in user-space, with the sandboxer opening the file or > carrying out the operation on behalf of the sandboxee. This is probably > going to be extremely complex and makes everything less performant. We should get inspiration from the fanotify and seccomp-notify features (while implementing the minimum for now) but also identify their design issues and caveats. Tycho, Christian, Kees, any suggestion? > > 3. Using a FUSE filesystem which gates access. This is actually an > approach taken by an existing sandbox solution - flatpak [2], however it > requires either tight integration with the application (and thus doesn't > work well for the mentioned use cases), or if one wants to sandbox a > program "transparently", SYS_ADMIN to chroot. Android's SDCardFS is another example of such use. > > > I've tested that what I have here works with the enhanced sandboxer, but > have yet to write any self tests or do extensive testing or perf > measurements. I have also yet to implement support for supervising tcp > rules as well as FS refer operations. One of the main suggestion would be to align with the audit patch series semantic and the defined "blockers": https://lore.kernel.org/all/20250131163059.1139617-1-mic@digikod.net/ I'll send another series soon. > > Base commit: 78332fdb956f18accfbca5993b10c5ed69f00a2c (tag: > landlock-6.14-rc5, mic/next) > > > [1]: https://cybersecuritynews.com/beware-of-lazarus-linkedin-recruiting-scam/ > [2]: https://flatpak.github.io/xdg-desktop-portal/docs/documents-and-fuse.html > > > Tingmao Wang (9): > Define the supervisor and event structure > Refactor per-layer information in rulesets and rules > Adds a supervisor reference in the per-layer information > User-space API for creating a supervisor-fd > Define user structure for events and responses. > Creating supervisor events for filesystem operations > Implement fdinfo for ruleset and supervisor fd > Implement fops for supervisor-fd > Enhance the sandboxer example to support landlock-supervise > > include/uapi/linux/landlock.h | 119 ++++++ > samples/landlock/sandboxer.c | 759 +++++++++++++++++++++++++++++++++- > security/landlock/Makefile | 2 +- > security/landlock/fs.c | 134 +++++- > security/landlock/ruleset.c | 49 ++- > security/landlock/ruleset.h | 66 +-- > security/landlock/supervise.c | 194 +++++++++ > security/landlock/supervise.h | 171 ++++++++ > security/landlock/syscalls.c | 621 +++++++++++++++++++++++++++- > 9 files changed, 2036 insertions(+), 79 deletions(-) > create mode 100644 security/landlock/supervise.c > create mode 100644 security/landlock/supervise.h > > -- > 2.39.5 >
On 3/4/25 19:48, Mickaël Salaün wrote: > Thanks for this RFC, this is very promising! Hi Mickaël - thanks for the prompt review and for your support! I have read your replies and have some thoughts already, but I kept getting distracted by other stuff and so haven't had much chance to express them. I will address some first today and some more over the weekend. > Another interesting use case is to trace programs and get an > unprivileged "permissive" mode to quickly create sandbox policies. Yes that would also be a good use. I thought of this initially but was thinking "I guess you can always do that with audit" but if we have landlock supervise maybe that would be an easier thing for tools to build upon...? > As discussed, I was thinking about whether or not it would be possible > to use the fanotify interface (e.g. fanotify_init(), fanotify FD...), > but looking at your code, I think it would mostly increase complexity. > There are also the issue with the Landlock semantic (e.g. access rights) > which does not map 1:1 to the fanotify one. A last thing is that > fanotify is deeply tied to the VFS. So, unless someone has a better > idea, let's continue with your approach. That sounds sensible - I will keep going with the current direction of a landlock-specific uapi. (happy to revisit should other people have suggestions) > Android's SDCardFS is another example of such use. Interesting - seems like it was deprecated for reasons unrelated to security though. > One of the main suggestion would be to align with the audit patch series > semantic and the defined "blockers": > https://lore.kernel.org/all/20250131163059.1139617-1-mic@digikod.net/ > I'll send another series soon. I will have a read of the existing audit series - are you planning significant changes to it in the next one?
On Thu, Mar 6, 2025 at 3:57 AM Tingmao Wang <m@maowtm.org> wrote: > > On 3/4/25 19:48, Mickaël Salaün wrote: > > > Thanks for this RFC, this is very promising! > > Hi Mickaël - thanks for the prompt review and for your support! I have > read your replies and have some thoughts already, but I kept getting > distracted by other stuff and so haven't had much chance to express > them. I will address some first today and some more over the weekend. > > > Another interesting use case is to trace programs and get an > > unprivileged "permissive" mode to quickly create sandbox policies. > > Yes that would also be a good use. I thought of this initially but was > thinking "I guess you can always do that with audit" but if we have > landlock supervise maybe that would be an easier thing for tools to > build upon...? > > > As discussed, I was thinking about whether or not it would be possible > > to use the fanotify interface (e.g. fanotify_init(), fanotify FD...), > > but looking at your code, I think it would mostly increase complexity. > > There are also the issue with the Landlock semantic (e.g. access rights) > > which does not map 1:1 to the fanotify one. A last thing is that > > fanotify is deeply tied to the VFS. So, unless someone has a better > > idea, let's continue with your approach. > > That sounds sensible - I will keep going with the current direction of a > landlock-specific uapi. (happy to revisit should other people have > suggestions) > w.r.t sharing infrastructure with fanotify, I only looked briefly at your patches and I have only a vague familiarity with landlock, so I cannot yet form an opinion whether this is a good idea, but I wanted to give you a few more data points about fanotify that seem relevant. 1. There is already some intersection of fanotify and audit lsm via the fanotify_response_info_audit_rule extension for permission events, so it's kind of a precedent of using fanotify to aid an lsm 2. See this fan_pre_modify-wip branch [1] and specifically commit "fanotify: introduce directory entry pre-modify permission events" I do have an intention to add create/delete/rename permission events. Note that the new fsnotify hooks are added in to do_ vfs helpers, not very far from the security_path_ lsm hooks, but not exactly in the same place because we want to fsnotify hooks to be before taking vfs locks, to allow listener to write to filesystem from event context. There are different semantics than just ALLOW/DENY that you need, therefore, only if we move the security_path_ hooks outside the vfs locks, our use cases could use the same hooks 3. There is a recent attempt to add BPF filter to fanotify [2] which is driven among other things from the long standing requirement to add subtree filtering to fanotify watches. The challenge with all the attempt to implement a subtree filter so far, is that adding vfs performance overhead for all the users in the system is unacceptable. IIUC, landlock rule set can already express a subtree filter (?), so it is intriguing to know if there is room for some integration on this aspect, but my guess is that landlock mostly uses subtree filter after filtering by specific pids (?), so it can avoid the performance overhead of a subtree filter on most of the users in the system. Hope this information is useful. Thanks, Amir. [1] https://github.com/amir73il/linux/commits/fan_pre_modify-wip/ [2] https://lore.kernel.org/linux-fsdevel/20241122225958.1775625-1-song@kernel.org/
On Tue 04-03-25 01:12:56, Tingmao Wang wrote: > Alternatives > ------------ > > I have looked for existing ways to implement the proposed use cases (at > least for FS access), and three main approaches stand out to me: > > 1. Fanotify: there is already FAM_OPEN_PERM which waits for an allow/deny > response from a fanotify listener. However, it does not currently have > the equivalent _PERM for file creation, deletion, rename and linking, and > it is also not designed for unprivileged, process-scoped use (unlike > landlock). As Amir wrote, arbitration of creation / deletion / ... is not a principial problem for fanotify and we plan to go in that direction anyway for HSM usecase. However adjusting fanotify permission events for a per-process scope and for unpriviledged users is a fundamental difference to how fanotify is designed to work (it watches filesystem objects, not processes and actions they do) and so I don't think that would be a great fit. Also I don't see fanotify expanding in the networking area as the concepts are rather different there :). Honza
On Thu, Mar 06, 2025 at 02:57:13AM +0000, Tingmao Wang wrote: > On 3/4/25 19:48, Mickaël Salaün wrote: > > > Thanks for this RFC, this is very promising! > > Hi Mickaël - thanks for the prompt review and for your support! I have read > your replies and have some thoughts already, but I kept getting distracted > by other stuff and so haven't had much chance to express them. I will > address some first today and some more over the weekend. > > > Another interesting use case is to trace programs and get an > > unprivileged "permissive" mode to quickly create sandbox policies. > > Yes that would also be a good use. I thought of this initially but was > thinking "I guess you can always do that with audit" but if we have landlock > supervise maybe that would be an easier thing for tools to build upon...? Both approaches are valuable. The supervisor one would be unprivileged, could get access to more information including O_PATH FD's, but it is much slower and relies on user space monitoring code. > > > As discussed, I was thinking about whether or not it would be possible > > to use the fanotify interface (e.g. fanotify_init(), fanotify FD...), > > but looking at your code, I think it would mostly increase complexity. > > There are also the issue with the Landlock semantic (e.g. access rights) > > which does not map 1:1 to the fanotify one. A last thing is that > > fanotify is deeply tied to the VFS. So, unless someone has a better > > idea, let's continue with your approach. > > That sounds sensible - I will keep going with the current direction of a > landlock-specific uapi. (happy to revisit should other people have > suggestions) > > > Android's SDCardFS is another example of such use. > > Interesting - seems like it was deprecated for reasons unrelated to security > though. Yes, Android first used FUSE, then SDCardFS, then FUSE again, but the goal has been the same: https://source.android.com/docs/core/storage/scoped > > > One of the main suggestion would be to align with the audit patch series > > semantic and the defined "blockers": > > https://lore.kernel.org/all/20250131163059.1139617-1-mic@digikod.net/ > > I'll send another series soon. > > I will have a read of the existing audit series - are you planning > significant changes to it in the next one? Not significant changes but still some that hook changes that might require a rebase. I just sent v6, you'll find it applied here: https://web.git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
On Thu, Mar 06, 2025 at 06:07:35PM +0100, Amir Goldstein wrote: > On Thu, Mar 6, 2025 at 3:57 AM Tingmao Wang <m@maowtm.org> wrote: > > > > On 3/4/25 19:48, Mickaël Salaün wrote: > > > > > Thanks for this RFC, this is very promising! > > > > Hi Mickaël - thanks for the prompt review and for your support! I have > > read your replies and have some thoughts already, but I kept getting > > distracted by other stuff and so haven't had much chance to express > > them. I will address some first today and some more over the weekend. > > > > > Another interesting use case is to trace programs and get an > > > unprivileged "permissive" mode to quickly create sandbox policies. > > > > Yes that would also be a good use. I thought of this initially but was > > thinking "I guess you can always do that with audit" but if we have > > landlock supervise maybe that would be an easier thing for tools to > > build upon...? > > > > > As discussed, I was thinking about whether or not it would be possible > > > to use the fanotify interface (e.g. fanotify_init(), fanotify FD...), > > > but looking at your code, I think it would mostly increase complexity. > > > There are also the issue with the Landlock semantic (e.g. access rights) > > > which does not map 1:1 to the fanotify one. A last thing is that > > > fanotify is deeply tied to the VFS. So, unless someone has a better > > > idea, let's continue with your approach. > > > > That sounds sensible - I will keep going with the current direction of a > > landlock-specific uapi. (happy to revisit should other people have > > suggestions) > > > > w.r.t sharing infrastructure with fanotify, I only looked briefly at > your patches > and I have only a vague familiarity with landlock, so I cannot yet form an > opinion whether this is a good idea, but I wanted to give you a few more > data points about fanotify that seem relevant. > > 1. There is already some intersection of fanotify and audit lsm via the > fanotify_response_info_audit_rule extension for permission > events, so it's kind of a precedent of using fanotify to aid an lsm > > 2. See this fan_pre_modify-wip branch [1] and specifically commit > "fanotify: introduce directory entry pre-modify permission events" > I do have an intention to add create/delete/rename permission events. > Note that the new fsnotify hooks are added in to do_ vfs helpers, not very > far from the security_path_ lsm hooks, but not exactly in the same place > because we want to fsnotify hooks to be before taking vfs locks, to allow > listener to write to filesystem from event context. > There are different semantics than just ALLOW/DENY that you need, > therefore, only if we move the security_path_ hooks outside the > vfs locks, our use cases could use the same hooks > > 3. There is a recent attempt to add BPF filter to fanotify [2] > which is driven among other things from the long standing requirement > to add subtree filtering to fanotify watches. > The challenge with all the attempt to implement a subtree filter so far, > is that adding vfs performance overhead for all the users in the system > is unacceptable. > > IIUC, landlock rule set can already express a subtree filter (?), Yes, Landlock uses a set of inode tags and a path walk to identify hierarchies. > so it is intriguing to know if there is room for some integration on this > aspect, but my guess is that landlock mostly uses subtree filter > after filtering by specific pids (?), so it can avoid the performance > overhead of a subtree filter on most of the users in the system. Landlock domains are indeed enforced for a set of specific tasks. > > Hope this information is useful. Yes, thanks for the explanations. We should definitely take inspiration from fanotify but I don't think it would be a good fit for Landlock: the semantic of access rights is (and will) be different, and more importantly it is not only to supervise filesystem accesses. > > Thanks, > Amir. > > [1] https://github.com/amir73il/linux/commits/fan_pre_modify-wip/ > [2] https://lore.kernel.org/linux-fsdevel/20241122225958.1775625-1-song@kernel.org/
On Thu, Mar 06, 2025 at 10:04:54PM +0100, Jan Kara wrote: > On Tue 04-03-25 01:12:56, Tingmao Wang wrote: > > Alternatives > > ------------ > > > > I have looked for existing ways to implement the proposed use cases (at > > least for FS access), and three main approaches stand out to me: > > > > 1. Fanotify: there is already FAM_OPEN_PERM which waits for an allow/deny > > response from a fanotify listener. However, it does not currently have > > the equivalent _PERM for file creation, deletion, rename and linking, and > > it is also not designed for unprivileged, process-scoped use (unlike > > landlock). > > As Amir wrote, arbitration of creation / deletion / ... is not a principial > problem for fanotify and we plan to go in that direction anyway for HSM > usecase. However adjusting fanotify permission events for a per-process > scope and for unpriviledged users is a fundamental difference to how > fanotify is designed to work (it watches filesystem objects, not processes > and actions they do) and so I don't think that would be a great fit. Also I > don't see fanotify expanding in the networking area as the concepts are > rather different there :). Yes, I agree. We should take inspiration from the fanonify interface though. > > Honza > > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR >
On 3/6/25 17:07, Amir Goldstein wrote: [...] > > w.r.t sharing infrastructure with fanotify, I only looked briefly at > your patches > and I have only a vague familiarity with landlock, so I cannot yet form an > opinion whether this is a good idea, but I wanted to give you a few more > data points about fanotify that seem relevant. > > 1. There is already some intersection of fanotify and audit lsm via the > fanotify_response_info_audit_rule extension for permission > events, so it's kind of a precedent of using fanotify to aid an lsm > > 2. See this fan_pre_modify-wip branch [1] and specifically commit > "fanotify: introduce directory entry pre-modify permission events" > I do have an intention to add create/delete/rename permission events. > Note that the new fsnotify hooks are added in to do_ vfs helpers, not very > far from the security_path_ lsm hooks, but not exactly in the same place > because we want to fsnotify hooks to be before taking vfs locks, to allow > listener to write to filesystem from event context. > There are different semantics than just ALLOW/DENY that you need, > therefore, only if we move the security_path_ hooks outside the > vfs locks, our use cases could use the same hooks Hi Amir, (this is a slightly long message - feel free to respond at your convenience, thank you in advance!) Thanks a lot for mentioning this branch, and for the explanation! I've had a look and realized that the changes you have there will be very useful for this patch, and in fact, I've already tried a worse attempt of this (not included in this patch series yet) to create some security_pathname_ hooks that takes the parent struct path + last name as char*, that will be called before locking the parent. (We can't have an unprivileged supervisor cause a directory to be locked indefinitely, which will also block users outside of the landlock domain) I'm not sure if we can move security_path tho, because it takes the dentry of the child as an argument, and (I think at least for create / mknod / link) that dentry is only created after locking. Hence the proposal for separate security_pathname_ hooks. A search shows that currently AppArmor and TOMOYO (plus Landlock) uses the security_path_ hooks that would need changing, if we move it (and we will have to understand if the move is ok to do for the other two LSMs...) However, I think it would still make a lot of sense to align with fsnotify here, as you have already made the changes that I would need to do anyway should I implement the proposed new hooks. I think a sensible thing might be to have the extra LSM hooks be called alongside fsnotify_(re)name_perm - following the pattern of what currently happens with fsnotify_open_perm (i.e. security_file_open called first, then fsnotify_open_perm right after). What's your thought on this? Do you think it would be a good idea to have LSM hook equivalents of the fsnotify (re)name perm hooks / fanotify pre-modify events? Also, do you have a rough estimate of when you would upstream the fa/fsnotify changes? (asking just to get an idea of things, not trying to rush or anything :) I suspect this supervise patch would take a while anyway) If you think the general idea is right, here are some further questions I have: I think going by this approach any error return from security_pathname_mknod (or in fact, fsnotify_name_perm) when called in the open O_CREAT code path would end up becoming a -EROFS. Can we turn the bool got_write in open_last_lookups into an int to store any error from mnt_want_write_parent, and return it if lookup_open returns -EROFS? This is so that the user space still gets an -EACCESS on create denials by landlock (and in fact, if fanotify denies a create maybe we want it to return the correct errno also?). Maybe there is a better way, this is just my first though... I also noticed that you don't currently have fsnotify hook calls for link (although it does end up invoking the name_perm hook on the dest with MAY_CREATE). I want to propose also changing do_linkat to (pass the right flags to filename_create_srcu -> mnt_want_write_parent to) call the security_pathname_link hook (instead of the LSM hook it would normally call for a creation event in this proposal) that is basically like security_path_link, except passing the destination as a dir/name pair, and without holding vfs lock (still passing in the dentry of the source itself), to enable landlock to handle link requests separately. Do you think this is alright? (Maybe the code would be a bit convoluted if written verbatim from this logic, maybe there is a better way, but the general idea is hopefully right) btw, side question, I see that you added srcu read sections around the events - I'm not familiar with rcu/locking usage in vfs but is this for preventing e.g. changing the mount in some way (but still allowing access / changes to the directory)? I realize I'm asking you a lot of things - big thanks in advance! (also let me know if I should be pulling in other VFS maintainers) -- For Mickaël, Would you be on board with changing Landlock to use the new hooks as mentioned above? My thinking is that it shouldn't make any difference in terms of security - Landlock permissions for e.g. creating/deleting files are based on the parent, and in fact except for link and rename, the hook_path_ functions in Landlock don't even use the dentry argument. If you're happy with the general direction of this, I can investigate further and test it out etc. This change might also reduce the impact of Landlock on non-landlocked processes, if we avoid holding exclusive inode lock while evaluating rules / traversing paths...? (Just a thought, not measured) In terms of other aspects, ignoring supervisors for now, moving to these hooks: - Should make no difference in the "happy" (access allowed) case - Only when an access is disallowed, in order to know what error to return, we can check (within Landlock hook handler) if the target already exists - if yes, return -EEXIST, otherwise -EACCESS If this is too large of a change at this point and you see / would prefer another way we can progress this series (at least the initial version), let me know. Kind regards, Tingmao > > 3. There is a recent attempt to add BPF filter to fanotify [2] > which is driven among other things from the long standing requirement > to add subtree filtering to fanotify watches. > The challenge with all the attempt to implement a subtree filter so far, > is that adding vfs performance overhead for all the users in the system > is unacceptable. > > IIUC, landlock rule set can already express a subtree filter (?), > so it is intriguing to know if there is room for some integration on this > aspect, but my guess is that landlock mostly uses subtree filter > after filtering by specific pids (?), so it can avoid the performance > overhead of a subtree filter on most of the users in the system. > > Hope this information is useful. > > Thanks, > Amir. > > [1] https://github.com/amir73il/linux/commits/fan_pre_modify-wip/ > [2] https://lore.kernel.org/linux-fsdevel/20241122225958.1775625-1-song@kernel.org/
On Tue, Mar 11, 2025 at 12:42:05AM +0000, Tingmao Wang wrote: > On 3/6/25 17:07, Amir Goldstein wrote: > [...] > > > > w.r.t sharing infrastructure with fanotify, I only looked briefly at > > your patches > > and I have only a vague familiarity with landlock, so I cannot yet form an > > opinion whether this is a good idea, but I wanted to give you a few more > > data points about fanotify that seem relevant. > > > > 1. There is already some intersection of fanotify and audit lsm via the > > fanotify_response_info_audit_rule extension for permission > > events, so it's kind of a precedent of using fanotify to aid an lsm > > > > 2. See this fan_pre_modify-wip branch [1] and specifically commit > > "fanotify: introduce directory entry pre-modify permission events" > > I do have an intention to add create/delete/rename permission events. > > Note that the new fsnotify hooks are added in to do_ vfs helpers, not very > > far from the security_path_ lsm hooks, but not exactly in the same place > > because we want to fsnotify hooks to be before taking vfs locks, to allow > > listener to write to filesystem from event context. > > There are different semantics than just ALLOW/DENY that you need, > > therefore, only if we move the security_path_ hooks outside the > > vfs locks, our use cases could use the same hooks > > Hi Amir, > > (this is a slightly long message - feel free to respond at your convenience, > thank you in advance!) > > Thanks a lot for mentioning this branch, and for the explanation! I've had a > look and realized that the changes you have there will be very useful for > this patch, and in fact, I've already tried a worse attempt of this (not > included in this patch series yet) to create some security_pathname_ hooks > that takes the parent struct path + last name as char*, that will be called > before locking the parent. (We can't have an unprivileged supervisor cause > a directory to be locked indefinitely, which will also block users outside > of the landlock domain) > > I'm not sure if we can move security_path tho, because it takes the dentry > of the child as an argument, and (I think at least for create / mknod / > link) that dentry is only created after locking. Hence the proposal for > separate security_pathname_ hooks. A search shows that currently AppArmor > and TOMOYO (plus Landlock) uses the security_path_ hooks that would need > changing, if we move it (and we will have to understand if the move is ok to > do for the other two LSMs...) > > However, I think it would still make a lot of sense to align with fsnotify > here, as you have already made the changes that I would need to do anyway > should I implement the proposed new hooks. I think a sensible thing might > be to have the extra LSM hooks be called alongside fsnotify_(re)name_perm - > following the pattern of what currently happens with fsnotify_open_perm > (i.e. security_file_open called first, then fsnotify_open_perm right after). Yes, I think it would make sense to use the same hooks for fanotify and other security subsystems, or at least to share them. It would improve consistency across different Linux subsystems and simplify changes and maintenance where these hooks are called. > > What's your thought on this? Do you think it would be a good idea to have > LSM hook equivalents of the fsnotify (re)name perm hooks / fanotify > pre-modify events? > > Also, do you have a rough estimate of when you would upstream the > fa/fsnotify changes? (asking just to get an idea of things, not trying to > rush or anything :) I suspect this supervise patch would take a while > anyway) > > If you think the general idea is right, here are some further questions I > have: > > I think going by this approach any error return from security_pathname_mknod > (or in fact, fsnotify_name_perm) when called in the open O_CREAT code path > would end up becoming a -EROFS. Can we turn the bool got_write in > open_last_lookups into an int to store any error from mnt_want_write_parent, > and return it if lookup_open returns -EROFS? This is so that the user space > still gets an -EACCESS on create denials by landlock (and in fact, if > fanotify denies a create maybe we want it to return the correct errno > also?). Maybe there is a better way, this is just my first though... > > I also noticed that you don't currently have fsnotify hook calls for link > (although it does end up invoking the name_perm hook on the dest with > MAY_CREATE). I want to propose also changing do_linkat to (pass the right > flags to filename_create_srcu -> mnt_want_write_parent to) call the > security_pathname_link hook (instead of the LSM hook it would normally call > for a creation event in this proposal) that is basically like > security_path_link, except passing the destination as a dir/name pair, and > without holding vfs lock (still passing in the dentry of the source itself), > to enable landlock to handle link requests separately. Do you think this is > alright? (Maybe the code would be a bit convoluted if written verbatim from > this logic, maybe there is a better way, but the general idea is hopefully > right) > > btw, side question, I see that you added srcu read sections around the > events - I'm not familiar with rcu/locking usage in vfs but is this for > preventing e.g. changing the mount in some way (but still allowing access / > changes to the directory)? > > I realize I'm asking you a lot of things - big thanks in advance! (also let > me know if I should be pulling in other VFS maintainers) > > -- > > For Mickaël, > > Would you be on board with changing Landlock to use the new hooks as > mentioned above? My thinking is that it shouldn't make any difference in > terms of security - Landlock permissions for e.g. creating/deleting files > are based on the parent, and in fact except for link and rename, the > hook_path_ functions in Landlock don't even use the dentry argument. If > you're happy with the general direction of this, I can investigate further > and test it out etc. This change might also reduce the impact of Landlock > on non-landlocked processes, if we avoid holding exclusive inode lock while > evaluating rules / traversing paths...? (Just a thought, not measured) This looks reasonable. As long as the semantic does not change it should be good and Landlock tests should pass. That would also require other users of this hook to make sure it works for them too. If it is not the case, I guess we could add an alternative hooks with different properties. However, see the issue and the alternative approach below. > > In terms of other aspects, ignoring supervisors for now, moving to these > hooks: > > - Should make no difference in the "happy" (access allowed) case > > - Only when an access is disallowed, in order to know what error to > return, we can check (within Landlock hook handler) if the target > already exists - if yes, return -EEXIST, otherwise -EACCESS We should avoid as much as possible to reimplement the error types in fanotify/LSM hooks. This is partially done for the VFS, and completely duplicated for the network, which can lead to inconsistent errors. It would be good to only have one source of truth, but that might not be possible in all cases. > > If this is too large of a change at this point and you see / would prefer > another way we can progress this series (at least the initial version), let > me know. For this patch series to work, we need all (used) LSM hooks to be blockable (and interruptible). We should then investigate if this is possible, especially with the new fanotify hooks, but I don't think it would work for all hooks (already or that will potentially be used by Landlock). An alternative approach would be to add a task_work (executed before returning to user space) that will wait for the supervisor to take a decision, and in the meantime the LSM hook would return -ERESTARTNOINTR for the syscall to start again after the wait. However, because the request to the supervisor would be called outside of the hook, it should not be possible to directly allow the request (because of race condition) but to update the domain accordingly. The restarted syscall must not trigger a supervisor request though. > > Kind regards, > Tingmao > > > > > 3. There is a recent attempt to add BPF filter to fanotify [2] > > which is driven among other things from the long standing requirement > > to add subtree filtering to fanotify watches. > > The challenge with all the attempt to implement a subtree filter so far, > > is that adding vfs performance overhead for all the users in the system > > is unacceptable. > > > > IIUC, landlock rule set can already express a subtree filter (?), > > so it is intriguing to know if there is room for some integration on this > > aspect, but my guess is that landlock mostly uses subtree filter > > after filtering by specific pids (?), so it can avoid the performance > > overhead of a subtree filter on most of the users in the system. > > > > Hope this information is useful. > > > > Thanks, > > Amir. > > > > [1] https://github.com/amir73il/linux/commits/fan_pre_modify-wip/ > > [2] https://lore.kernel.org/linux-fsdevel/20241122225958.1775625-1-song@kernel.org/ > >
On Tue, Mar 11, 2025 at 12:28 PM Mickaël Salaün <mic@digikod.net> wrote: > > On Tue, Mar 11, 2025 at 12:42:05AM +0000, Tingmao Wang wrote: > > On 3/6/25 17:07, Amir Goldstein wrote: > > [...] > > > > > > w.r.t sharing infrastructure with fanotify, I only looked briefly at > > > your patches > > > and I have only a vague familiarity with landlock, so I cannot yet form an > > > opinion whether this is a good idea, but I wanted to give you a few more > > > data points about fanotify that seem relevant. > > > > > > 1. There is already some intersection of fanotify and audit lsm via the > > > fanotify_response_info_audit_rule extension for permission > > > events, so it's kind of a precedent of using fanotify to aid an lsm > > > > > > 2. See this fan_pre_modify-wip branch [1] and specifically commit > > > "fanotify: introduce directory entry pre-modify permission events" > > > I do have an intention to add create/delete/rename permission events. > > > Note that the new fsnotify hooks are added in to do_ vfs helpers, not very > > > far from the security_path_ lsm hooks, but not exactly in the same place > > > because we want to fsnotify hooks to be before taking vfs locks, to allow > > > listener to write to filesystem from event context. > > > There are different semantics than just ALLOW/DENY that you need, > > > therefore, only if we move the security_path_ hooks outside the > > > vfs locks, our use cases could use the same hooks > > > > Hi Amir, > > > > (this is a slightly long message - feel free to respond at your convenience, > > thank you in advance!) > > > > Thanks a lot for mentioning this branch, and for the explanation! I've had a > > look and realized that the changes you have there will be very useful for > > this patch, and in fact, I've already tried a worse attempt of this (not > > included in this patch series yet) to create some security_pathname_ hooks > > that takes the parent struct path + last name as char*, that will be called > > before locking the parent. (We can't have an unprivileged supervisor cause > > a directory to be locked indefinitely, which will also block users outside > > of the landlock domain) > > > > I'm not sure if we can move security_path tho, because it takes the dentry > > of the child as an argument, and (I think at least for create / mknod / > > link) that dentry is only created after locking. Hence the proposal for > > separate security_pathname_ hooks. A search shows that currently AppArmor > > and TOMOYO (plus Landlock) uses the security_path_ hooks that would need > > changing, if we move it (and we will have to understand if the move is ok to > > do for the other two LSMs...) > > > > However, I think it would still make a lot of sense to align with fsnotify > > here, as you have already made the changes that I would need to do anyway > > should I implement the proposed new hooks. I think a sensible thing might > > be to have the extra LSM hooks be called alongside fsnotify_(re)name_perm - > > following the pattern of what currently happens with fsnotify_open_perm > > (i.e. security_file_open called first, then fsnotify_open_perm right after). I think there is a fundamental difference between LSM hooks and fsnotify, so putting fsnotify behind some LSM hooks might be weird. Specifically, LSM hooks are always global. If a LSM attaches to a hook, say security_file_open, it will see all the file open calls in the system. On the other hand, each fsnotify rule only applies to a group, so that one fanotify handler doesn't touch files watched by another fanotify handler. Given this difference, I am not sure how fsnotify LSM hooks should look like. Does this make sense? > Yes, I think it would make sense to use the same hooks for fanotify and > other security subsystems, or at least to share them. It would improve > consistency across different Linux subsystems and simplify changes and > maintenance where these hooks are called. [...] > > -- > > > > For Mickaël, > > > > Would you be on board with changing Landlock to use the new hooks as > > mentioned above? My thinking is that it shouldn't make any difference in > > terms of security - Landlock permissions for e.g. creating/deleting files > > are based on the parent, and in fact except for link and rename, the > > hook_path_ functions in Landlock don't even use the dentry argument. If > > you're happy with the general direction of this, I can investigate further > > and test it out etc. This change might also reduce the impact of Landlock > > on non-landlocked processes, if we avoid holding exclusive inode lock while > > evaluating rules / traversing paths...? (Just a thought, not measured) I think the filter for process/thread is usually faster than the filter for file/path/subtree? Therefore, it is better for landlock to check the filter for process/thread first. Did I miss/misunderstand something? Thanks, Song > This looks reasonable. As long as the semantic does not change it > should be good and Landlock tests should pass. That would also require > other users of this hook to make sure it works for them too. If it is > not the case, I guess we could add an alternative hooks with different > properties. However, see the issue and the alternative approach below. >
On 3/11/25 20:58, Song Liu wrote: > On Tue, Mar 11, 2025 at 12:28 PM Mickaël Salaün <mic@digikod.net> wrote: >> >> On Tue, Mar 11, 2025 at 12:42:05AM +0000, Tingmao Wang wrote: >>> On 3/6/25 17:07, Amir Goldstein wrote: >>> [...] >>>> >>>> w.r.t sharing infrastructure with fanotify, I only looked briefly at >>>> your patches >>>> and I have only a vague familiarity with landlock, so I cannot yet form an >>>> opinion whether this is a good idea, but I wanted to give you a few more >>>> data points about fanotify that seem relevant. >>>> >>>> 1. There is already some intersection of fanotify and audit lsm via the >>>> fanotify_response_info_audit_rule extension for permission >>>> events, so it's kind of a precedent of using fanotify to aid an lsm >>>> >>>> 2. See this fan_pre_modify-wip branch [1] and specifically commit >>>> "fanotify: introduce directory entry pre-modify permission events" >>>> I do have an intention to add create/delete/rename permission events. >>>> Note that the new fsnotify hooks are added in to do_ vfs helpers, not very >>>> far from the security_path_ lsm hooks, but not exactly in the same place >>>> because we want to fsnotify hooks to be before taking vfs locks, to allow >>>> listener to write to filesystem from event context. >>>> There are different semantics than just ALLOW/DENY that you need, >>>> therefore, only if we move the security_path_ hooks outside the >>>> vfs locks, our use cases could use the same hooks >>> >>> Hi Amir, >>> >>> (this is a slightly long message - feel free to respond at your convenience, >>> thank you in advance!) >>> >>> Thanks a lot for mentioning this branch, and for the explanation! I've had a >>> look and realized that the changes you have there will be very useful for >>> this patch, and in fact, I've already tried a worse attempt of this (not >>> included in this patch series yet) to create some security_pathname_ hooks >>> that takes the parent struct path + last name as char*, that will be called >>> before locking the parent. (We can't have an unprivileged supervisor cause >>> a directory to be locked indefinitely, which will also block users outside >>> of the landlock domain) >>> >>> I'm not sure if we can move security_path tho, because it takes the dentry >>> of the child as an argument, and (I think at least for create / mknod / >>> link) that dentry is only created after locking. Hence the proposal for >>> separate security_pathname_ hooks. A search shows that currently AppArmor >>> and TOMOYO (plus Landlock) uses the security_path_ hooks that would need >>> changing, if we move it (and we will have to understand if the move is ok to >>> do for the other two LSMs...) >>> >>> However, I think it would still make a lot of sense to align with fsnotify >>> here, as you have already made the changes that I would need to do anyway >>> should I implement the proposed new hooks. I think a sensible thing might >>> be to have the extra LSM hooks be called alongside fsnotify_(re)name_perm - >>> following the pattern of what currently happens with fsnotify_open_perm >>> (i.e. security_file_open called first, then fsnotify_open_perm right after). > > I think there is a fundamental difference between LSM hooks and fsnotify, > so putting fsnotify behind some LSM hooks might be weird. Specifically, > LSM hooks are always global. If a LSM attaches to a hook, say > security_file_open, it will see all the file open calls in the system. On the > other hand, each fsnotify rule only applies to a group, so that one fanotify > handler doesn't touch files watched by another fanotify handler. Given this > difference, I am not sure how fsnotify LSM hooks should look like. > > Does this make sense? To clarify, I wasn't suggesting that we put one hook _behind_ another ("behind" in the sense of one calling the other), just that the place that calls the new fsnotify_name_perm/fsnotify_rename_perm hook (in Amir's WIP branch) could also be made to call some new LSM hooks in addition to fsnotify (i.e. security_pathname_create/delete/rename). My understanding of the current code is that VFS calls security_... and fsnotify_... unconditionally, and the fsnotify_... functions figure out who needs to be notified. > >> Yes, I think it would make sense to use the same hooks for fanotify and >> other security subsystems, or at least to share them. It would improve >> consistency across different Linux subsystems and simplify changes and >> maintenance where these hooks are called. Mickaël - I'm not sure what you mean by "the same hook" - do you mean the relevant VFS functions could call both fsnotify and LSM hooks? > > [...] > >>> -- >>> >>> For Mickaël, >>> >>> Would you be on board with changing Landlock to use the new hooks as >>> mentioned above? My thinking is that it shouldn't make any difference in >>> terms of security - Landlock permissions for e.g. creating/deleting files >>> are based on the parent, and in fact except for link and rename, the >>> hook_path_ functions in Landlock don't even use the dentry argument. If >>> you're happy with the general direction of this, I can investigate further >>> and test it out etc. This change might also reduce the impact of Landlock >>> on non-landlocked processes, if we avoid holding exclusive inode lock while >>> evaluating rules / traversing paths...? (Just a thought, not measured) > > I think the filter for process/thread is usually faster than the filter for > file/path/subtree? Therefore, it is better for landlock to check the filter for > process/thread first. Did I miss/misunderstand something? > Sorry, I should have clarified that the "impact" I'm talking about here isn't referring to directly the time it takes for landlock to decide if an access is allowed or not - in a non-landlocked process, the landlock hooks already returns really early and fast. However, I'm thinking of a situation where a landlocked process makes lots of create/delete etc requests on a directory, and landlock does need to do some work (e.g. path traversal) to decide those access. Because the security_path_mknod/unlink/... hooks are called in the VFS from a place where it is holding an exclusive lock on the directory (for O_CREAT'ing a child or other directory modification cases), when landlock is working out an access by the landlocked process, no other tasks will be able to read/write the directory (they will be blocked on the lock), even if their access have nothing to do with landlock. I should add that this is probably just a very minor impact: the user space can't cause the dir to be blocked for arbitrary amount of time, at worst slowing everyone else down by a bit if it deliberately creates lots of layers (max 16) each with lots of rules (the ruleset evaluation is O(log(#rules) * dir_depth)). I didn't measure it, it's just something that occurred to me that could be improved by using new hooks that aren't called with inode locks held. Kind regards, Tingmao > Thanks, > Song > > > > >> This looks reasonable. As long as the semantic does not change it >> should be good and Landlock tests should pass. That would also require >> other users of this hook to make sure it works for them too. If it is >> not the case, I guess we could add an alternative hooks with different >> properties. However, see the issue and the alternative approach below. >>
On Tue, Mar 11, 2025 at 3:03 PM Tingmao Wang <m@maowtm.org> wrote: [...] > > > > I think there is a fundamental difference between LSM hooks and fsnotify, > > so putting fsnotify behind some LSM hooks might be weird. Specifically, > > LSM hooks are always global. If a LSM attaches to a hook, say > > security_file_open, it will see all the file open calls in the system. On the > > other hand, each fsnotify rule only applies to a group, so that one fanotify > > handler doesn't touch files watched by another fanotify handler. Given this > > difference, I am not sure how fsnotify LSM hooks should look like. > > > > Does this make sense? > > To clarify, I wasn't suggesting that we put one hook _behind_ another > ("behind" in the sense of one calling the other), just that the place > that calls the new fsnotify_name_perm/fsnotify_rename_perm hook (in > Amir's WIP branch) could also be made to call some new LSM hooks in > addition to fsnotify (i.e. security_pathname_create/delete/rename). > > My understanding of the current code is that VFS calls security_... and > fsnotify_... unconditionally, and the fsnotify_... functions figure out > who needs to be notified. Yes, VFS calls security_* and fsnotify_* unconditionally. In this sense, fsnotify can be implemented as a LSM. But fsnotify also supports some non-security use cases. So it will be weird to implement it as a LSM. Thanks, Song [...]
On 2025/03/04 10:12, Tingmao Wang wrote: > bash # env LL_FS_RO=/usr:/lib:/bin:/etc:/dev:/proc LL_FS_RW= LL_SUPERVISE=1 ./sandboxer bash -i > bash # echo "Hi, $(whoami)!" > Hi, root! > bash # ls / > ------------- Sandboxer access request ------------- > Process ls[166] (/usr/bin/ls) wants to read > / > (y)es/(a)lways/(n)o > y > ---------------------------------------------------- > bin > boot > dev > ... > usr > var > bash # echo 'evil' >> /etc/profile > (a spurious create request due to current issue with dcache miss is omitted) > ------------- Sandboxer access request ------------- > Process bash[163] (/usr/bin/bash) wants to read/write > /etc/profile > (y)es/(a)lways/(n)o > n > ---------------------------------------------------- > bash: /etc/profile: Permission denied > bash # Please check TOMOYO, for TOMOYO is already doing it. https://tomoyo.sourceforge.net/2.6/chapter-7.html#7.3
On Tue 11-03-25 00:42:05, Tingmao Wang wrote: > On 3/6/25 17:07, Amir Goldstein wrote: > [...] > > > > w.r.t sharing infrastructure with fanotify, I only looked briefly at > > your patches > > and I have only a vague familiarity with landlock, so I cannot yet form an > > opinion whether this is a good idea, but I wanted to give you a few more > > data points about fanotify that seem relevant. > > > > 1. There is already some intersection of fanotify and audit lsm via the > > fanotify_response_info_audit_rule extension for permission > > events, so it's kind of a precedent of using fanotify to aid an lsm > > > > 2. See this fan_pre_modify-wip branch [1] and specifically commit > > "fanotify: introduce directory entry pre-modify permission events" > > I do have an intention to add create/delete/rename permission events. > > Note that the new fsnotify hooks are added in to do_ vfs helpers, not very > > far from the security_path_ lsm hooks, but not exactly in the same place > > because we want to fsnotify hooks to be before taking vfs locks, to allow > > listener to write to filesystem from event context. > > There are different semantics than just ALLOW/DENY that you need, > > therefore, only if we move the security_path_ hooks outside the > > vfs locks, our use cases could use the same hooks > > Hi Amir, > > (this is a slightly long message - feel free to respond at your convenience, > thank you in advance!) > > Thanks a lot for mentioning this branch, and for the explanation! I've had a > look and realized that the changes you have there will be very useful for > this patch, and in fact, I've already tried a worse attempt of this (not > included in this patch series yet) to create some security_pathname_ hooks > that takes the parent struct path + last name as char*, that will be called > before locking the parent. (We can't have an unprivileged supervisor cause > a directory to be locked indefinitely, which will also block users outside > of the landlock domain) Well, but if you call the hook before locking the parent isn't your hook prone to TOCTOU races? I mean you call the hook do your stuff in it and then before the parent is locked, the whole directory hierarchy can get reorganized (from another process) without you knowing... So I'm not sure which guarantees you can provide for such hooks. Honza
On Tue, Mar 11, 2025 at 01:58:57PM -0700, Song Liu wrote: > On Tue, Mar 11, 2025 at 12:28 PM Mickaël Salaün <mic@digikod.net> wrote: > > > > On Tue, Mar 11, 2025 at 12:42:05AM +0000, Tingmao Wang wrote: > > > On 3/6/25 17:07, Amir Goldstein wrote: > > > [...] > > > -- > > > > > > For Mickaël, > > > > > > Would you be on board with changing Landlock to use the new hooks as > > > mentioned above? My thinking is that it shouldn't make any difference in > > > terms of security - Landlock permissions for e.g. creating/deleting files > > > are based on the parent, and in fact except for link and rename, the > > > hook_path_ functions in Landlock don't even use the dentry argument. If > > > you're happy with the general direction of this, I can investigate further > > > and test it out etc. This change might also reduce the impact of Landlock > > > on non-landlocked processes, if we avoid holding exclusive inode lock while > > > evaluating rules / traversing paths...? (Just a thought, not measured) > > I think the filter for process/thread is usually faster than the filter for > file/path/subtree? Therefore, it is better for landlock to check the filter for > process/thread first. Did I miss/misunderstand something? The main reason is because only sandboxed processes should be impacted by Landlock. Similarly, only the security policies restricting a process impact this process. Using 16 layers would only impact the process that sandboxed itself (and BTW the impact of the number of layers would be negligible). There is not really process filters, only pointers set or not in tasks' credentials.
On Tue, Mar 11, 2025 at 1:42 AM Tingmao Wang <m@maowtm.org> wrote: > > On 3/6/25 17:07, Amir Goldstein wrote: > [...] > > > > w.r.t sharing infrastructure with fanotify, I only looked briefly at > > your patches > > and I have only a vague familiarity with landlock, so I cannot yet form an > > opinion whether this is a good idea, but I wanted to give you a few more > > data points about fanotify that seem relevant. > > > > 1. There is already some intersection of fanotify and audit lsm via the > > fanotify_response_info_audit_rule extension for permission > > events, so it's kind of a precedent of using fanotify to aid an lsm > > > > 2. See this fan_pre_modify-wip branch [1] and specifically commit > > "fanotify: introduce directory entry pre-modify permission events" > > I do have an intention to add create/delete/rename permission events. > > Note that the new fsnotify hooks are added in to do_ vfs helpers, not very > > far from the security_path_ lsm hooks, but not exactly in the same place > > because we want to fsnotify hooks to be before taking vfs locks, to allow > > listener to write to filesystem from event context. > > There are different semantics than just ALLOW/DENY that you need, > > therefore, only if we move the security_path_ hooks outside the > > vfs locks, our use cases could use the same hooks > > Hi Amir, > > (this is a slightly long message - feel free to respond at your > convenience, thank you in advance!) > > Thanks a lot for mentioning this branch, and for the explanation! I've > had a look and realized that the changes you have there will be very > useful for this patch, and in fact, I've already tried a worse attempt > of this (not included in this patch series yet) to create some > security_pathname_ hooks that takes the parent struct path + last name > as char*, that will be called before locking the parent. (We can't have > an unprivileged supervisor cause a directory to be locked indefinitely, > which will also block users outside of the landlock domain) > > I'm not sure if we can move security_path tho, because it takes the > dentry of the child as an argument, and (I think at least for create / > mknod / link) that dentry is only created after locking. Hence the > proposal for separate security_pathname_ hooks. A search shows that > currently AppArmor and TOMOYO (plus Landlock) uses the security_path_ > hooks that would need changing, if we move it (and we will have to > understand if the move is ok to do for the other two LSMs...) > > However, I think it would still make a lot of sense to align with > fsnotify here, as you have already made the changes that I would need to > do anyway should I implement the proposed new hooks. I think a sensible > thing might be to have the extra LSM hooks be called alongside > fsnotify_(re)name_perm - following the pattern of what currently happens > with fsnotify_open_perm (i.e. security_file_open called first, then > fsnotify_open_perm right after). > > What's your thought on this? Do you think it would be a good idea to > have LSM hook equivalents of the fsnotify (re)name perm hooks / fanotify > pre-modify events? > No clear answer but some data points: The fanotify permission hooks (formerly fsnotify_perm) used to be inside security_file_{open,permission} so when I started looking at dir modification permission events I started to try using the security_path_ hooks, but as the work progressed I found that fsnotify hooks have different requirements (no locks). Later, we found out that the existing fsnotify permission hooks have different needs than the existing security hooks (for pre-content events), so after: 1cda52f1b4611 fsnotify, lsm: Decouple fsnotify from lsm fanotify is not using any LSM hooks and not dependent on CONFIG_SECURITY. Mentally, I do find it easy for fsnotify and security hook to be next to each other, unless there is a reason to do it otherwise, because from vfs POV they are mostly the same, but note that my branch implements the new fsnotify hooks actually as scopes (for sb_write_srcu) and in some cases as other people have mentioned on this thread, the security hooks need to be inside the vfs locks, while the fsnotify hooks need to be outside of the locks. > Also, do you have a rough estimate of when you would upstream the > fa/fsnotify changes? (asking just to get an idea of things, not trying > to rush or anything :) I suspect this supervise patch would take a while > anyway) > Besides my time to work on this, these patches are waiting for some other things. One is that I was waiting with promoting those patches until pre-content patches got merged and that took longer than expected and even now there may need to be follow ups in the next cycle. Another thing is that these patches rely on the sb_write_srcu design concept which is pretty intrusive to vfs, so I still need to sell this to vfs people. I am going to make another shot of an elevator pitch at LSFMM in two weeks, If we get past this design hurdle, the rest of the work will depend on how much time I can spend on it. I do *want* to make the patches in time for the 2025 LTS kernel, but it may not be a realistic goal. One thing that helped a lot with pushing pre-content events is that Meta already had a production use case for it. I do not know of anyone else that requested the pre-directory-modify hooks (besides myself), so that may make the sale a bit harder. If there is someone out there that does need the pre-directory-modify hooks now would be a good time to speak up. > If you think the general idea is right, here are some further questions > I have: > > I think going by this approach any error return from > security_pathname_mknod (or in fact, fsnotify_name_perm) when called in > the open O_CREAT code path would end up becoming a -EROFS. Can we turn > the bool got_write in open_last_lookups into an int to store any error > from mnt_want_write_parent, and return it if lookup_open returns -EROFS? IIUC you mean like this: err = mnt_want_write_parent(&nd->path, MAY_CREATE, &res, &idx); if (err && err != -EROFS) return err; got_write = !err; /* * do _not_ fail yet - .... Yes, I think that is better, because the logic in the comment only applies to EROFS. > This is so that the user space still gets an -EACCESS on create > denials by landlock (and in fact, if fanotify denies a create maybe we > want it to return the correct errno also?). Maybe there is a better way, > this is just my first though... > > I also noticed that you don't currently have fsnotify hook calls for > link (although it does end up invoking the name_perm hook on the dest > with MAY_CREATE). I want to propose also changing do_linkat to (pass > the right flags to filename_create_srcu -> mnt_want_write_parent to) > call the security_pathname_link hook (instead of the LSM hook it would > normally call for a creation event in this proposal) that is basically > like security_path_link, except passing the destination as a dir/name > pair, and without holding vfs lock (still passing in the dentry of the > source itself), to enable landlock to handle link requests separately. > Do you think this is alright? (Maybe the code would be a bit convoluted > if written verbatim from this logic, maybe there is a better way, but > the general idea is hopefully right) I am not sure I understand your question. fsnotify does not need to know this is a LINK and not CREATE. I do not know what the requirements of other LSMs for those hooks, so hard to say if it is ok to move those hooks but my guess is not ok. > > btw, side question, I see that you added srcu read sections around the > events - I'm not familiar with rcu/locking usage in vfs but is this for > preventing e.g. changing the mount in some way (but still allowing > access / changes to the directory)? > No. this is meant to accommodate fsnotify_wait_handle_events() (see last patch) - wait for in-flight modifications to complete without blocking new modifications. That's the concept that I need to sell to vfs people. Thanks, Amir.