Message ID | 20240728002602.3198398-3-ivanov.mikhail1@huawei-partners.com (mailing list archive) |
---|---|
State | Handled Elsewhere |
Headers | show |
Series | Support TCP listen access-control | expand |
Hello! Thanks for sending these patches! Most comments are about documentation so far. In the implementation, I'm mostly unclear about the interaction with the uncommon Upper Layer Protocols. I'm also not very familiar with the socket state machines, maybe someone from the netdev list would have time to double check that aspect? On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: > LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" > ports to forbid a malicious sandboxed process to impersonate a legitimate > server process. However, bind(2) might be used by (TCP) clients to set the > source port to a (legitimate) value. Controlling the ports that can be > used for listening would allow (TCP) clients to explicitly bind to ports > that are forbidden for listening. > > Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP > access right that restricts listening on undesired ports with listen(2). Nit: I would turn around the first two commit message paragraphs and describe your changes first, before explaining the problems in the bind(2) support. I was initially a bit confused that the description started talking about LANDLOCK_ACCESS_NET_BIND_TCP. General recommendations at: https://www.kernel.org/doc/html/v6.10/process/submitting-patches.html#describe-your-changes > It's worth noticing that this access right doesn't affect changing > backlog value using listen(2) on already listening socket. > > * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. > * Add hook to socket_listen(), which checks whether the socket is allowed > to listen on a binded local port. > * Add check_tcp_socket_can_listen() helper, which validates socket > attributes before the actual access right check. > * Update `struct landlock_net_port_attr` documentation with control of > binding to ephemeral port with listen(2) description. > * Change ABI version to 6. > > Closes: https://github.com/landlock-lsm/linux/issues/15 > Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> > --- > include/uapi/linux/landlock.h | 23 +++-- > security/landlock/limits.h | 2 +- > security/landlock/net.c | 90 ++++++++++++++++++++ > security/landlock/syscalls.c | 2 +- > tools/testing/selftests/landlock/base_test.c | 2 +- > 5 files changed, 108 insertions(+), 11 deletions(-) > > diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h > index 68625e728f43..6b8df3293eee 100644 > --- a/include/uapi/linux/landlock.h > +++ b/include/uapi/linux/landlock.h > @@ -104,13 +104,16 @@ struct landlock_net_port_attr { > /** > * @port: Network port in host endianness. > * > - * It should be noted that port 0 passed to :manpage:`bind(2)` will > - * bind to an available port from a specific port range. This can be > - * configured thanks to the ``/proc/sys/net/ipv4/ip_local_port_range`` > - * sysctl (also used for IPv6). A Landlock rule with port 0 and the > - * ``LANDLOCK_ACCESS_NET_BIND_TCP`` right means that requesting to bind > - * on port 0 is allowed and it will automatically translate to binding > - * on the related port range. Please rebase on a more recent revision, we have changed this phrasing in the meantime: - s/a specific port range/the ephemeral port range/ - The paragraph was split in two. > + * It should be noted that some operations cause binding socket to a random > + * available port from a specific port range. This can be configured thanks > + * to the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for > + * IPv6). Following operation requests are automatically translate to > + * binding on the related port range: > + * > + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` > + * right means that binding on port 0 is allowed. > + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` > + * right means listening without an explicit binding is allowed. There are some grammatical problems in this documentation section. Can I suggest an alternative? Some socket operations will fall back to using a port from the ephemeral port range, if no specific port is requested by the caller. Among others, this happens in the following cases: - :manpage:`bind(2)` is invoked with a socket address that uses port 0. - :manpage:`listen(2)` is invoked on a socket without previously calling :manpage:`bind(2)`. These two actions, which implicitly use an ephemeral port, can be allowed with a Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` / ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` right. The ephemeral port range is configured in the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for IPv6). When we have the documentation wording finalized, please send an update to the man pages as well, for this and other documentation updates. Small remarks on what I've done here: * I am avoiding the word "binding" when referring to the automatic assignment to an ephemeral port - IMHO, this is potentially confusing, since bind(2) is not explicitly called. * I am also dropping the "It should be noted" / "Note that" phrase, which is frowned upon in man pages. > */ > __u64 port; > }; > @@ -251,7 +254,7 @@ struct landlock_net_port_attr { > * DOC: net_access > * > * Network flags > - * ~~~~~~~~~~~~~~~~ > + * ~~~~~~~~~~~~~ > * > * These flags enable to restrict a sandboxed process to a set of network > * actions. This is supported since the Landlock ABI version 4. > @@ -261,9 +264,13 @@ struct landlock_net_port_attr { > * - %LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. > * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to > * a remote port. > + * - %LANDLOCK_ACCESS_NET_LISTEN_TCP: Listen for TCP socket connections on > + * a local port. This access right is available since the sixth version > + * of the Landlock ABI. > */ > /* clang-format off */ > #define LANDLOCK_ACCESS_NET_BIND_TCP (1ULL << 0) > #define LANDLOCK_ACCESS_NET_CONNECT_TCP (1ULL << 1) > +#define LANDLOCK_ACCESS_NET_LISTEN_TCP (1ULL << 2) > /* clang-format on */ > #endif /* _UAPI_LINUX_LANDLOCK_H */ > diff --git a/security/landlock/limits.h b/security/landlock/limits.h > index 4eb643077a2a..2ef147389474 100644 > --- a/security/landlock/limits.h > +++ b/security/landlock/limits.h > @@ -22,7 +22,7 @@ > #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) > #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) > > -#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_CONNECT_TCP > +#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_LISTEN_TCP > #define LANDLOCK_MASK_ACCESS_NET ((LANDLOCK_LAST_ACCESS_NET << 1) - 1) > #define LANDLOCK_NUM_ACCESS_NET __const_hweight64(LANDLOCK_MASK_ACCESS_NET) > > diff --git a/security/landlock/net.c b/security/landlock/net.c > index 669ba260342f..a29cb27c3f14 100644 > --- a/security/landlock/net.c > +++ b/security/landlock/net.c > @@ -6,10 +6,12 @@ > * Copyright © 2022-2023 Microsoft Corporation > */ > > +#include "net/sock.h" > #include <linux/in.h> > #include <linux/net.h> > #include <linux/socket.h> > #include <net/ipv6.h> > +#include <net/tcp.h> > > #include "common.h" > #include "cred.h" > @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, > LANDLOCK_ACCESS_NET_CONNECT_TCP); > } > > +/* > + * Checks that socket state and attributes are correct for listen. > + * It is required to not wrongfully return -EACCES instead of -EINVAL. ^^^^^^^^^^^^^^ Doc nit: I would just document that this function returns -EINVAL on failure? In this place, I would expect that the function interface is documented for callers. (From that perspective, this is not a requirement, but a guarantee that the function gives.) > + * > + * This checker requires sock->sk to be locked. > + */ > +static int check_tcp_socket_can_listen(struct socket *const sock) > +{ > + struct sock *sk = sock->sk; > + unsigned char cur_sk_state = sk->sk_state; > + const struct tcp_ulp_ops *icsk_ulp_ops; > + > + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ > + if (sock->state != SS_UNCONNECTED) > + return -EINVAL; > + > + /* > + * Checks sock state. This is needed to ensure consistency with inet stack > + * error handling (cf. __inet_listen_sk). > + */ > + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) > + return -EINVAL; > + > + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; > + > + /* > + * ULP (Upper Layer Protocol) stands for protocols which are higher than > + * transport protocol in OSI model. Linux has an infrastructure that > + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). > + * > + * Sockets can listen only if ULP control hook has clone method. > + */ > + if (icsk_ulp_ops && !icsk_ulp_ops->clone) > + return -EINVAL; This seems like an implementation detail in the networking subsystem? If I understand correctly, these are cases where we use TCP on top of protocols that are not IP (or have an additional layer in the middle, like TLS?). This can not be recognized through the socket family or type? Do we have cases where we can run TCP on top of something else than plain IPv4 or IPv6, where the clone method exists? > + return 0; > +} > + > +static int hook_socket_listen(struct socket *const sock, const int backlog) > +{ > + int err = 0; > + int family; > + __be16 port; > + struct sock *sk; > + const struct landlock_ruleset *const dom = get_current_net_domain(); > + > + if (!dom) > + return 0; > + if (WARN_ON_ONCE(dom->num_layers < 1)) > + return -EACCES; > + > + /* Checks if it's a (potential) TCP socket. */ > + if (sock->type != SOCK_STREAM) > + return 0; > + > + sk = sock->sk; > + family = sk->__sk_common.skc_family; > + /* > + * Socket cannot be assigned AF_UNSPEC because this type is used only > + * in the context of addresses. > + * > + * Doesn't restrict listening for non-TCP sockets. > + */ > + if (family != AF_INET && family != AF_INET6) > + return 0; Aren't the socket type and family checks duplicated with existing logic that we have for the connect(2) and bind(2) support? Should it be deduplicated, or is that too messy? > + > + lock_sock(sk); > + /* > + * Calling listen(2) for a listening socket does nothing with its state and > + * only changes backlog value (cf. __inet_listen_sk). Checking of listen > + * access right is not required. > + */ > + if (sk->sk_state == TCP_LISTEN) > + goto release_nocheck; > + > + err = check_tcp_socket_can_listen(sock); > + if (unlikely(err)) > + goto release_nocheck; > + > + port = htons(inet_sk(sk)->inet_num); > + release_sock(sk); > + return check_access_socket(dom, port, LANDLOCK_ACCESS_NET_LISTEN_TCP); > + > +release_nocheck: > + release_sock(sk); > + return err; > +} Thanks for sending these patches! —Günther
7/30/2024 11:24 AM, Günther Noack wrote: > Hello! > > Thanks for sending these patches! > > Most comments are about documentation so far. > > In the implementation, I'm mostly unclear about the interaction with the > uncommon Upper Layer Protocols. I'm also not very familiar with the socket > state machines, maybe someone from the netdev list would have time to double > check that aspect? > > > On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: >> LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" >> ports to forbid a malicious sandboxed process to impersonate a legitimate >> server process. However, bind(2) might be used by (TCP) clients to set the >> source port to a (legitimate) value. Controlling the ports that can be >> used for listening would allow (TCP) clients to explicitly bind to ports >> that are forbidden for listening. >> >> Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP >> access right that restricts listening on undesired ports with listen(2). > > Nit: I would turn around the first two commit message paragraphs and describe > your changes first, before explaining the problems in the bind(2) support. I > was initially a bit confused that the description started talking about > LANDLOCK_ACCESS_NET_BIND_TCP. > > General recommendations at: > https://www.kernel.org/doc/html/v6.10/process/submitting-patches.html#describe-your-changes I consider the first paragraph as a problem statement for this patch. According to linux recommendations problem should be established before the description of changes. Do you think that the changes part should stand before the problem anyway? > > >> It's worth noticing that this access right doesn't affect changing >> backlog value using listen(2) on already listening socket. >> >> * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. >> * Add hook to socket_listen(), which checks whether the socket is allowed >> to listen on a binded local port. >> * Add check_tcp_socket_can_listen() helper, which validates socket >> attributes before the actual access right check. >> * Update `struct landlock_net_port_attr` documentation with control of >> binding to ephemeral port with listen(2) description. >> * Change ABI version to 6. >> >> Closes: https://github.com/landlock-lsm/linux/issues/15 >> Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> >> --- >> include/uapi/linux/landlock.h | 23 +++-- >> security/landlock/limits.h | 2 +- >> security/landlock/net.c | 90 ++++++++++++++++++++ >> security/landlock/syscalls.c | 2 +- >> tools/testing/selftests/landlock/base_test.c | 2 +- >> 5 files changed, 108 insertions(+), 11 deletions(-) >> >> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h >> index 68625e728f43..6b8df3293eee 100644 >> --- a/include/uapi/linux/landlock.h >> +++ b/include/uapi/linux/landlock.h >> @@ -104,13 +104,16 @@ struct landlock_net_port_attr { >> /** >> * @port: Network port in host endianness. >> * >> - * It should be noted that port 0 passed to :manpage:`bind(2)` will >> - * bind to an available port from a specific port range. This can be >> - * configured thanks to the ``/proc/sys/net/ipv4/ip_local_port_range`` >> - * sysctl (also used for IPv6). A Landlock rule with port 0 and the >> - * ``LANDLOCK_ACCESS_NET_BIND_TCP`` right means that requesting to bind >> - * on port 0 is allowed and it will automatically translate to binding >> - * on the related port range. > > Please rebase on a more recent revision, we have changed this phrasing in the meantime: > > - s/a specific port range/the ephemeral port range/ > - The paragraph was split in two. ok > >> + * It should be noted that some operations cause binding socket to a random >> + * available port from a specific port range. This can be configured thanks >> + * to the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for >> + * IPv6). Following operation requests are automatically translate to >> + * binding on the related port range: >> + * >> + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` >> + * right means that binding on port 0 is allowed. >> + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` >> + * right means listening without an explicit binding is allowed. > > There are some grammatical problems in this documentation section. > > Can I suggest an alternative? > > Some socket operations will fall back to using a port from the ephemeral port > range, if no specific port is requested by the caller. Among others, this > happens in the following cases: > > - :manpage:`bind(2)` is invoked with a socket address that uses port 0. > - :manpage:`listen(2)` is invoked on a socket without previously calling > :manpage:`bind(2)`. > > These two actions, which implicitly use an ephemeral port, can be allowed with > a Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` / > ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` right. > > The ephemeral port range is configured in the > ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for IPv6). Thanks, lets take this one. > > When we have the documentation wording finalized, > please send an update to the man pages as well, > for this and other documentation updates. Should I send it after this patchset would be accepted? > > Small remarks on what I've done here: > > * I am avoiding the word "binding" when referring to the automatic assignment to > an ephemeral port - IMHO, this is potentially confusing, since bind(2) is not > explicitly called. > * I am also dropping the "It should be noted" / "Note that" phrase, which is > frowned upon in man pages. Didn't know that, thanks > >> */ >> __u64 port; >> }; >> @@ -251,7 +254,7 @@ struct landlock_net_port_attr { >> * DOC: net_access >> * >> * Network flags >> - * ~~~~~~~~~~~~~~~~ >> + * ~~~~~~~~~~~~~ >> * >> * These flags enable to restrict a sandboxed process to a set of network >> * actions. This is supported since the Landlock ABI version 4. >> @@ -261,9 +264,13 @@ struct landlock_net_port_attr { >> * - %LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. >> * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to >> * a remote port. >> + * - %LANDLOCK_ACCESS_NET_LISTEN_TCP: Listen for TCP socket connections on >> + * a local port. This access right is available since the sixth version >> + * of the Landlock ABI. >> */ >> /* clang-format off */ >> #define LANDLOCK_ACCESS_NET_BIND_TCP (1ULL << 0) >> #define LANDLOCK_ACCESS_NET_CONNECT_TCP (1ULL << 1) >> +#define LANDLOCK_ACCESS_NET_LISTEN_TCP (1ULL << 2) >> /* clang-format on */ >> #endif /* _UAPI_LINUX_LANDLOCK_H */ >> diff --git a/security/landlock/limits.h b/security/landlock/limits.h >> index 4eb643077a2a..2ef147389474 100644 >> --- a/security/landlock/limits.h >> +++ b/security/landlock/limits.h >> @@ -22,7 +22,7 @@ >> #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) >> #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) >> >> -#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_CONNECT_TCP >> +#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_LISTEN_TCP >> #define LANDLOCK_MASK_ACCESS_NET ((LANDLOCK_LAST_ACCESS_NET << 1) - 1) >> #define LANDLOCK_NUM_ACCESS_NET __const_hweight64(LANDLOCK_MASK_ACCESS_NET) >> >> diff --git a/security/landlock/net.c b/security/landlock/net.c >> index 669ba260342f..a29cb27c3f14 100644 >> --- a/security/landlock/net.c >> +++ b/security/landlock/net.c >> @@ -6,10 +6,12 @@ >> * Copyright © 2022-2023 Microsoft Corporation >> */ >> >> +#include "net/sock.h" >> #include <linux/in.h> >> #include <linux/net.h> >> #include <linux/socket.h> >> #include <net/ipv6.h> >> +#include <net/tcp.h> >> >> #include "common.h" >> #include "cred.h" >> @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, >> LANDLOCK_ACCESS_NET_CONNECT_TCP); >> } >> >> +/* >> + * Checks that socket state and attributes are correct for listen. >> + * It is required to not wrongfully return -EACCES instead of -EINVAL. > ^^^^^^^^^^^^^^ > > Doc nit: I would just document that this function returns -EINVAL on failure? > In this place, I would expect that the function interface is documented for > callers. (From that perspective, this is not a requirement, but a guarantee > that the function gives.) Agreed, thanks > >> + * >> + * This checker requires sock->sk to be locked. >> + */ >> +static int check_tcp_socket_can_listen(struct socket *const sock) >> +{ >> + struct sock *sk = sock->sk; >> + unsigned char cur_sk_state = sk->sk_state; >> + const struct tcp_ulp_ops *icsk_ulp_ops; >> + >> + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ >> + if (sock->state != SS_UNCONNECTED) >> + return -EINVAL; >> + >> + /* >> + * Checks sock state. This is needed to ensure consistency with inet stack >> + * error handling (cf. __inet_listen_sk). >> + */ >> + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) >> + return -EINVAL; >> + >> + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; >> + >> + /* >> + * ULP (Upper Layer Protocol) stands for protocols which are higher than >> + * transport protocol in OSI model. Linux has an infrastructure that >> + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). >> + * >> + * Sockets can listen only if ULP control hook has clone method. >> + */ >> + if (icsk_ulp_ops && !icsk_ulp_ops->clone) >> + return -EINVAL; > > This seems like an implementation detail in the networking subsystem? Yeap. There is probably a missing reference to the corresponding network stack code (inet_csk_listen_start()). I'll add it. > > If I understand correctly, these are cases where we use TCP on top of protocols > that are not IP (or have an additional layer in the middle, like TLS?). This > can not be recognized through the socket family or type? ULP can be used in the context of TCP protocols as an additional layer (currently supported only by IP and MPTCP), so it cannot be recognized with family or type. You can check this test [1] in which TCP IP socket is created with ULP control hook. [1] https://lore.kernel.org/all/20240728002602.3198398-8-ivanov.mikhail1@huawei-partners.com/ > > Do we have cases where we can run TCP on top of something else than plain IPv4 > or IPv6, where the clone method exists? Yeah, MPTCP protocol for example (see net/mptcp/subflow.c). ULP control hook is supported only by IP and MPTCP, and in both cases clone method is checked during listen(2) execution. > >> + return 0; >> +} >> + >> +static int hook_socket_listen(struct socket *const sock, const int backlog) >> +{ >> + int err = 0; >> + int family; >> + __be16 port; >> + struct sock *sk; >> + const struct landlock_ruleset *const dom = get_current_net_domain(); >> + >> + if (!dom) >> + return 0; >> + if (WARN_ON_ONCE(dom->num_layers < 1)) >> + return -EACCES; >> + >> + /* Checks if it's a (potential) TCP socket. */ >> + if (sock->type != SOCK_STREAM) >> + return 0; >> + >> + sk = sock->sk; >> + family = sk->__sk_common.skc_family; >> + /* >> + * Socket cannot be assigned AF_UNSPEC because this type is used only >> + * in the context of addresses. >> + * >> + * Doesn't restrict listening for non-TCP sockets. >> + */ >> + if (family != AF_INET && family != AF_INET6) >> + return 0; > > Aren't the socket type and family checks duplicated with existing logic that we > have for the connect(2) and bind(2) support? Should it be deduplicated, or is > that too messy? bind(2) and connect(2) hooks also support AF_UNSPEC family, so I think such helper is gonna complicate code a little bit. Also it can complicate switch in current_check_access_socket(). > >> + >> + lock_sock(sk); >> + /* >> + * Calling listen(2) for a listening socket does nothing with its state and >> + * only changes backlog value (cf. __inet_listen_sk). Checking of listen >> + * access right is not required. >> + */ >> + if (sk->sk_state == TCP_LISTEN) >> + goto release_nocheck; >> + >> + err = check_tcp_socket_can_listen(sock); >> + if (unlikely(err)) >> + goto release_nocheck; >> + >> + port = htons(inet_sk(sk)->inet_num); >> + release_sock(sk); >> + return check_access_socket(dom, port, LANDLOCK_ACCESS_NET_LISTEN_TCP); >> + >> +release_nocheck: >> + release_sock(sk); >> + return err; >> +} > > Thanks for sending these patches! > > —Günther
On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: > LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" > ports to forbid a malicious sandboxed process to impersonate a legitimate > server process. However, bind(2) might be used by (TCP) clients to set the > source port to a (legitimate) value. Controlling the ports that can be > used for listening would allow (TCP) clients to explicitly bind to ports > that are forbidden for listening. > > Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP > access right that restricts listening on undesired ports with listen(2). > > It's worth noticing that this access right doesn't affect changing > backlog value using listen(2) on already listening socket. > > * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. > * Add hook to socket_listen(), which checks whether the socket is allowed > to listen on a binded local port. > * Add check_tcp_socket_can_listen() helper, which validates socket > attributes before the actual access right check. > * Update `struct landlock_net_port_attr` documentation with control of > binding to ephemeral port with listen(2) description. > * Change ABI version to 6. > > Closes: https://github.com/landlock-lsm/linux/issues/15 > Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Thanks for this series! I cannot apply this patch series though, could you please provide the base commit? BTW, this can be automatically put in the cover letter with the git format-patch's --base argument. > --- > include/uapi/linux/landlock.h | 23 +++-- > security/landlock/limits.h | 2 +- > security/landlock/net.c | 90 ++++++++++++++++++++ > security/landlock/syscalls.c | 2 +- > tools/testing/selftests/landlock/base_test.c | 2 +- > 5 files changed, 108 insertions(+), 11 deletions(-) > > diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h > index 68625e728f43..6b8df3293eee 100644 > --- a/include/uapi/linux/landlock.h > +++ b/include/uapi/linux/landlock.h > @@ -104,13 +104,16 @@ struct landlock_net_port_attr { > /** > * @port: Network port in host endianness. > * > - * It should be noted that port 0 passed to :manpage:`bind(2)` will > - * bind to an available port from a specific port range. This can be > - * configured thanks to the ``/proc/sys/net/ipv4/ip_local_port_range`` > - * sysctl (also used for IPv6). A Landlock rule with port 0 and the > - * ``LANDLOCK_ACCESS_NET_BIND_TCP`` right means that requesting to bind > - * on port 0 is allowed and it will automatically translate to binding > - * on the related port range. > + * It should be noted that some operations cause binding socket to a random > + * available port from a specific port range. This can be configured thanks > + * to the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for > + * IPv6). Following operation requests are automatically translate to > + * binding on the related port range: > + * > + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` > + * right means that binding on port 0 is allowed. > + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` > + * right means listening without an explicit binding is allowed. > */ > __u64 port; > }; > @@ -251,7 +254,7 @@ struct landlock_net_port_attr { > * DOC: net_access > * > * Network flags > - * ~~~~~~~~~~~~~~~~ > + * ~~~~~~~~~~~~~ > * > * These flags enable to restrict a sandboxed process to a set of network > * actions. This is supported since the Landlock ABI version 4. > @@ -261,9 +264,13 @@ struct landlock_net_port_attr { > * - %LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. > * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to > * a remote port. > + * - %LANDLOCK_ACCESS_NET_LISTEN_TCP: Listen for TCP socket connections on > + * a local port. This access right is available since the sixth version > + * of the Landlock ABI. > */ > /* clang-format off */ > #define LANDLOCK_ACCESS_NET_BIND_TCP (1ULL << 0) > #define LANDLOCK_ACCESS_NET_CONNECT_TCP (1ULL << 1) > +#define LANDLOCK_ACCESS_NET_LISTEN_TCP (1ULL << 2) > /* clang-format on */ > #endif /* _UAPI_LINUX_LANDLOCK_H */ > diff --git a/security/landlock/limits.h b/security/landlock/limits.h > index 4eb643077a2a..2ef147389474 100644 > --- a/security/landlock/limits.h > +++ b/security/landlock/limits.h > @@ -22,7 +22,7 @@ > #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) > #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) > > -#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_CONNECT_TCP > +#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_LISTEN_TCP > #define LANDLOCK_MASK_ACCESS_NET ((LANDLOCK_LAST_ACCESS_NET << 1) - 1) > #define LANDLOCK_NUM_ACCESS_NET __const_hweight64(LANDLOCK_MASK_ACCESS_NET) > > diff --git a/security/landlock/net.c b/security/landlock/net.c > index 669ba260342f..a29cb27c3f14 100644 > --- a/security/landlock/net.c > +++ b/security/landlock/net.c > @@ -6,10 +6,12 @@ > * Copyright © 2022-2023 Microsoft Corporation > */ > > +#include "net/sock.h" These should not be quotes. > #include <linux/in.h> > #include <linux/net.h> > #include <linux/socket.h> > #include <net/ipv6.h> > +#include <net/tcp.h> > > #include "common.h" > #include "cred.h" > @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, > LANDLOCK_ACCESS_NET_CONNECT_TCP); > } > > +/* > + * Checks that socket state and attributes are correct for listen. > + * It is required to not wrongfully return -EACCES instead of -EINVAL. > + * > + * This checker requires sock->sk to be locked. > + */ > +static int check_tcp_socket_can_listen(struct socket *const sock) Is this function still useful with the listen LSM hook? > +{ > + struct sock *sk = sock->sk; > + unsigned char cur_sk_state = sk->sk_state; > + const struct tcp_ulp_ops *icsk_ulp_ops; > + > + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ > + if (sock->state != SS_UNCONNECTED) > + return -EINVAL; > + > + /* > + * Checks sock state. This is needed to ensure consistency with inet stack > + * error handling (cf. __inet_listen_sk). > + */ > + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) > + return -EINVAL; > + > + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; > + > + /* > + * ULP (Upper Layer Protocol) stands for protocols which are higher than > + * transport protocol in OSI model. Linux has an infrastructure that > + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). > + * > + * Sockets can listen only if ULP control hook has clone method. > + */ > + if (icsk_ulp_ops && !icsk_ulp_ops->clone) > + return -EINVAL; > + return 0; > +} > + > +static int hook_socket_listen(struct socket *const sock, const int backlog) > +{ Why can't we just call current_check_access_socket()? > + int err = 0; > + int family; > + __be16 port; > + struct sock *sk; > + const struct landlock_ruleset *const dom = get_current_net_domain(); > + > + if (!dom) > + return 0; > + if (WARN_ON_ONCE(dom->num_layers < 1)) > + return -EACCES; > + > + /* Checks if it's a (potential) TCP socket. */ > + if (sock->type != SOCK_STREAM) > + return 0; > + > + sk = sock->sk; > + family = sk->__sk_common.skc_family; > + /* > + * Socket cannot be assigned AF_UNSPEC because this type is used only > + * in the context of addresses. > + * > + * Doesn't restrict listening for non-TCP sockets. > + */ > + if (family != AF_INET && family != AF_INET6) > + return 0; > + > + lock_sock(sk); > + /* > + * Calling listen(2) for a listening socket does nothing with its state and > + * only changes backlog value (cf. __inet_listen_sk). Checking of listen > + * access right is not required. > + */ > + if (sk->sk_state == TCP_LISTEN) > + goto release_nocheck; > + > + err = check_tcp_socket_can_listen(sock); > + if (unlikely(err)) > + goto release_nocheck; > + > + port = htons(inet_sk(sk)->inet_num); > + release_sock(sk); > + return check_access_socket(dom, port, LANDLOCK_ACCESS_NET_LISTEN_TCP); > + > +release_nocheck: > + release_sock(sk); > + return err; > +} > + > static struct security_hook_list landlock_hooks[] __ro_after_init = { > LSM_HOOK_INIT(socket_bind, hook_socket_bind), > LSM_HOOK_INIT(socket_connect, hook_socket_connect), > + LSM_HOOK_INIT(socket_listen, hook_socket_listen), > }; > > __init void landlock_add_net_hooks(void) > diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c > index 03b470f5a85a..3752bcc033d4 100644 > --- a/security/landlock/syscalls.c > +++ b/security/landlock/syscalls.c > @@ -149,7 +149,7 @@ static const struct file_operations ruleset_fops = { > .write = fop_dummy_write, > }; > > -#define LANDLOCK_ABI_VERSION 5 > +#define LANDLOCK_ABI_VERSION 6 > > /** > * sys_landlock_create_ruleset - Create a new ruleset > diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c > index 3c1e9f35b531..52b00472a487 100644 > --- a/tools/testing/selftests/landlock/base_test.c > +++ b/tools/testing/selftests/landlock/base_test.c > @@ -75,7 +75,7 @@ TEST(abi_version) > const struct landlock_ruleset_attr ruleset_attr = { > .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE, > }; > - ASSERT_EQ(5, landlock_create_ruleset(NULL, 0, > + ASSERT_EQ(6, landlock_create_ruleset(NULL, 0, > LANDLOCK_CREATE_RULESET_VERSION)); > > ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0, > -- > 2.34.1 > >
7/31/2024 9:30 PM, Mickaël Salaün wrote: > On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: >> LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" >> ports to forbid a malicious sandboxed process to impersonate a legitimate >> server process. However, bind(2) might be used by (TCP) clients to set the >> source port to a (legitimate) value. Controlling the ports that can be >> used for listening would allow (TCP) clients to explicitly bind to ports >> that are forbidden for listening. >> >> Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP >> access right that restricts listening on undesired ports with listen(2). >> >> It's worth noticing that this access right doesn't affect changing >> backlog value using listen(2) on already listening socket. >> >> * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. >> * Add hook to socket_listen(), which checks whether the socket is allowed >> to listen on a binded local port. >> * Add check_tcp_socket_can_listen() helper, which validates socket >> attributes before the actual access right check. >> * Update `struct landlock_net_port_attr` documentation with control of >> binding to ephemeral port with listen(2) description. >> * Change ABI version to 6. >> >> Closes: https://github.com/landlock-lsm/linux/issues/15 >> Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> > > Thanks for this series! > > I cannot apply this patch series though, could you please provide the > base commit? BTW, this can be automatically put in the cover letter > with the git format-patch's --base argument. base-commit: 591561c2b47b7e7225e229e844f5de75ce0c09ec Günther said that I should rebase to the latest commits, so I'll do it in the next version of this patchset. > >> --- >> include/uapi/linux/landlock.h | 23 +++-- >> security/landlock/limits.h | 2 +- >> security/landlock/net.c | 90 ++++++++++++++++++++ >> security/landlock/syscalls.c | 2 +- >> tools/testing/selftests/landlock/base_test.c | 2 +- >> 5 files changed, 108 insertions(+), 11 deletions(-) >> >> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h >> index 68625e728f43..6b8df3293eee 100644 >> --- a/include/uapi/linux/landlock.h >> +++ b/include/uapi/linux/landlock.h >> @@ -104,13 +104,16 @@ struct landlock_net_port_attr { >> /** >> * @port: Network port in host endianness. >> * >> - * It should be noted that port 0 passed to :manpage:`bind(2)` will >> - * bind to an available port from a specific port range. This can be >> - * configured thanks to the ``/proc/sys/net/ipv4/ip_local_port_range`` >> - * sysctl (also used for IPv6). A Landlock rule with port 0 and the >> - * ``LANDLOCK_ACCESS_NET_BIND_TCP`` right means that requesting to bind >> - * on port 0 is allowed and it will automatically translate to binding >> - * on the related port range. >> + * It should be noted that some operations cause binding socket to a random >> + * available port from a specific port range. This can be configured thanks >> + * to the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for >> + * IPv6). Following operation requests are automatically translate to >> + * binding on the related port range: >> + * >> + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` >> + * right means that binding on port 0 is allowed. >> + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` >> + * right means listening without an explicit binding is allowed. >> */ >> __u64 port; >> }; >> @@ -251,7 +254,7 @@ struct landlock_net_port_attr { >> * DOC: net_access >> * >> * Network flags >> - * ~~~~~~~~~~~~~~~~ >> + * ~~~~~~~~~~~~~ >> * >> * These flags enable to restrict a sandboxed process to a set of network >> * actions. This is supported since the Landlock ABI version 4. >> @@ -261,9 +264,13 @@ struct landlock_net_port_attr { >> * - %LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. >> * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to >> * a remote port. >> + * - %LANDLOCK_ACCESS_NET_LISTEN_TCP: Listen for TCP socket connections on >> + * a local port. This access right is available since the sixth version >> + * of the Landlock ABI. >> */ >> /* clang-format off */ >> #define LANDLOCK_ACCESS_NET_BIND_TCP (1ULL << 0) >> #define LANDLOCK_ACCESS_NET_CONNECT_TCP (1ULL << 1) >> +#define LANDLOCK_ACCESS_NET_LISTEN_TCP (1ULL << 2) >> /* clang-format on */ >> #endif /* _UAPI_LINUX_LANDLOCK_H */ >> diff --git a/security/landlock/limits.h b/security/landlock/limits.h >> index 4eb643077a2a..2ef147389474 100644 >> --- a/security/landlock/limits.h >> +++ b/security/landlock/limits.h >> @@ -22,7 +22,7 @@ >> #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) >> #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) >> >> -#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_CONNECT_TCP >> +#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_LISTEN_TCP >> #define LANDLOCK_MASK_ACCESS_NET ((LANDLOCK_LAST_ACCESS_NET << 1) - 1) >> #define LANDLOCK_NUM_ACCESS_NET __const_hweight64(LANDLOCK_MASK_ACCESS_NET) >> >> diff --git a/security/landlock/net.c b/security/landlock/net.c >> index 669ba260342f..a29cb27c3f14 100644 >> --- a/security/landlock/net.c >> +++ b/security/landlock/net.c >> @@ -6,10 +6,12 @@ >> * Copyright © 2022-2023 Microsoft Corporation >> */ >> >> +#include "net/sock.h" > > These should not be quotes. will be fixed, thanks > >> #include <linux/in.h> >> #include <linux/net.h> >> #include <linux/socket.h> >> #include <net/ipv6.h> >> +#include <net/tcp.h> >> >> #include "common.h" >> #include "cred.h" >> @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, >> LANDLOCK_ACCESS_NET_CONNECT_TCP); >> } >> >> +/* >> + * Checks that socket state and attributes are correct for listen. >> + * It is required to not wrongfully return -EACCES instead of -EINVAL. >> + * >> + * This checker requires sock->sk to be locked. >> + */ >> +static int check_tcp_socket_can_listen(struct socket *const sock) > > Is this function still useful with the listen LSM hook? Yeap, we need to validate socket structure before checking the access right. You can see [1] and [2] where the behavior of this function is tested. [1] https://lore.kernel.org/all/20240728002602.3198398-6-ivanov.mikhail1@huawei-partners.com/ [2] https://lore.kernel.org/all/20240728002602.3198398-8-ivanov.mikhail1@huawei-partners.com/ > >> +{ >> + struct sock *sk = sock->sk; >> + unsigned char cur_sk_state = sk->sk_state; >> + const struct tcp_ulp_ops *icsk_ulp_ops; >> + >> + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ >> + if (sock->state != SS_UNCONNECTED) >> + return -EINVAL; >> + >> + /* >> + * Checks sock state. This is needed to ensure consistency with inet stack >> + * error handling (cf. __inet_listen_sk). >> + */ >> + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) >> + return -EINVAL; >> + >> + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; >> + >> + /* >> + * ULP (Upper Layer Protocol) stands for protocols which are higher than >> + * transport protocol in OSI model. Linux has an infrastructure that >> + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). >> + * >> + * Sockets can listen only if ULP control hook has clone method. >> + */ >> + if (icsk_ulp_ops && !icsk_ulp_ops->clone) >> + return -EINVAL; >> + return 0; >> +} >> + >> +static int hook_socket_listen(struct socket *const sock, const int backlog) >> +{ > > Why can't we just call current_check_access_socket()? I've mentioned in the message of the previous commit that this method has address checks for bind(2) and connect(2). In the case of listen(2) port is extracted from the socket structure, so calling current_check_access_socket() would be pointless. > >> + int err = 0; >> + int family; >> + __be16 port; >> + struct sock *sk; >> + const struct landlock_ruleset *const dom = get_current_net_domain(); >> + >> + if (!dom) >> + return 0; >> + if (WARN_ON_ONCE(dom->num_layers < 1)) >> + return -EACCES; >> + >> + /* Checks if it's a (potential) TCP socket. */ >> + if (sock->type != SOCK_STREAM) >> + return 0; >> + >> + sk = sock->sk; >> + family = sk->__sk_common.skc_family; >> + /* >> + * Socket cannot be assigned AF_UNSPEC because this type is used only >> + * in the context of addresses. >> + * >> + * Doesn't restrict listening for non-TCP sockets. >> + */ >> + if (family != AF_INET && family != AF_INET6) >> + return 0; >> + >> + lock_sock(sk); >> + /* >> + * Calling listen(2) for a listening socket does nothing with its state and >> + * only changes backlog value (cf. __inet_listen_sk). Checking of listen >> + * access right is not required. >> + */ >> + if (sk->sk_state == TCP_LISTEN) >> + goto release_nocheck; >> + >> + err = check_tcp_socket_can_listen(sock); >> + if (unlikely(err)) >> + goto release_nocheck; >> + >> + port = htons(inet_sk(sk)->inet_num); >> + release_sock(sk); >> + return check_access_socket(dom, port, LANDLOCK_ACCESS_NET_LISTEN_TCP); >> + >> +release_nocheck: >> + release_sock(sk); >> + return err; >> +} >> + >> static struct security_hook_list landlock_hooks[] __ro_after_init = { >> LSM_HOOK_INIT(socket_bind, hook_socket_bind), >> LSM_HOOK_INIT(socket_connect, hook_socket_connect), >> + LSM_HOOK_INIT(socket_listen, hook_socket_listen), >> }; >> >> __init void landlock_add_net_hooks(void) >> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c >> index 03b470f5a85a..3752bcc033d4 100644 >> --- a/security/landlock/syscalls.c >> +++ b/security/landlock/syscalls.c >> @@ -149,7 +149,7 @@ static const struct file_operations ruleset_fops = { >> .write = fop_dummy_write, >> }; >> >> -#define LANDLOCK_ABI_VERSION 5 >> +#define LANDLOCK_ABI_VERSION 6 >> >> /** >> * sys_landlock_create_ruleset - Create a new ruleset >> diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c >> index 3c1e9f35b531..52b00472a487 100644 >> --- a/tools/testing/selftests/landlock/base_test.c >> +++ b/tools/testing/selftests/landlock/base_test.c >> @@ -75,7 +75,7 @@ TEST(abi_version) >> const struct landlock_ruleset_attr ruleset_attr = { >> .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE, >> }; >> - ASSERT_EQ(5, landlock_create_ruleset(NULL, 0, >> + ASSERT_EQ(6, landlock_create_ruleset(NULL, 0, >> LANDLOCK_CREATE_RULESET_VERSION)); >> >> ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0, >> -- >> 2.34.1 >> >>
On Wed, Jul 31, 2024 at 08:20:41PM +0300, Mikhail Ivanov wrote: > 7/30/2024 11:24 AM, Günther Noack wrote: > > On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: > > > LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" > > > ports to forbid a malicious sandboxed process to impersonate a legitimate > > > server process. However, bind(2) might be used by (TCP) clients to set the > > > source port to a (legitimate) value. Controlling the ports that can be > > > used for listening would allow (TCP) clients to explicitly bind to ports > > > that are forbidden for listening. > > > > > > Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP > > > access right that restricts listening on undesired ports with listen(2). > > > > Nit: I would turn around the first two commit message paragraphs and describe > > your changes first, before explaining the problems in the bind(2) support. I > > was initially a bit confused that the description started talking about > > LANDLOCK_ACCESS_NET_BIND_TCP. > > > > General recommendations at: > > https://www.kernel.org/doc/html/v6.10/process/submitting-patches.html#describe-your-changes > > I consider the first paragraph as a problem statement for this patch. > According to linux recommendations problem should be established before > the description of changes. Do you think that the changes part should > stand before the problem anyway? Up to you. To be fair, I'm sold on the approach in this patchset anyway :) > > When we have the documentation wording finalized, > > please send an update to the man pages as well, > > for this and other documentation updates. > > Should I send it after this patchset would be accepted? Yes, that would be the normal process which we have been following so far. (I don't like the process much either, because it decouples feature development so far from documentation writing, but it's what we have for now.) An example patch which does that for the network bind(2) and connect(2) features (and where I would still like a review from Konstantin) is: https://lore.kernel.org/all/20240723101917.90918-1-gnoack@google.com/ > > Small remarks on what I've done here: > > > > * I am avoiding the word "binding" when referring to the automatic assignment to > > an ephemeral port - IMHO, this is potentially confusing, since bind(2) is not > > explicitly called. > > * I am also dropping the "It should be noted" / "Note that" phrase, which is > > frowned upon in man pages. > > Didn't know that, thanks Regarding "note that", see https://lore.kernel.org/all/0aafcdd6-4ac7-8501-c607-9a24a98597d7@gmail.com/ https://lore.kernel.org/linux-man/20210729223535.qvyomfqvvahzmu5w@localhost.localdomain/ https://lore.kernel.org/linux-man/20230105225235.6cjtz6orjzxzvo6v@illithid/ (The "Kemper notectomy") This came up in man page reviews, but we'll have an easier time keeping the kernel and man page documentation in sync if we adhere to man page style directly. (The man page style is documented in man-pages(7) and contains some groff-independent wording advice as well.) > > If I understand correctly, these are cases where we use TCP on top of protocols > > that are not IP (or have an additional layer in the middle, like TLS?). This > > can not be recognized through the socket family or type? > > ULP can be used in the context of TCP protocols as an additional layer > (currently supported only by IP and MPTCP), so it cannot be recognized > with family or type. You can check this test [1] in which TCP IP socket > is created with ULP control hook. > > [1] https://lore.kernel.org/all/20240728002602.3198398-8-ivanov.mikhail1@huawei-partners.com/ Thanks, this is helpful. For reference, it seems that ULP were introduced in https://lore.kernel.org/all/20170614183714.GA80310@davejwatson-mba.dhcp.thefacebook.com/ > > Do we have cases where we can run TCP on top of something else than plain IPv4 > > or IPv6, where the clone method exists? > > Yeah, MPTCP protocol for example (see net/mptcp/subflow.c). ULP control > hook is supported only by IP and MPTCP, and in both cases > clone method is checked during listen(2) execution. > > Aren't the socket type and family checks duplicated with existing logic that we > > have for the connect(2) and bind(2) support? Should it be deduplicated, or is > > that too messy? > > bind(2) and connect(2) hooks also support AF_UNSPEC family, so I think > such helper is gonna complicate code a little bit. Also it can > complicate switch in current_check_access_socket(). OK, sounds good.
8/1/2024 1:36 PM, Günther Noack wrote: > On Wed, Jul 31, 2024 at 08:20:41PM +0300, Mikhail Ivanov wrote: >> 7/30/2024 11:24 AM, Günther Noack wrote: >>> On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: >>>> LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" >>>> ports to forbid a malicious sandboxed process to impersonate a legitimate >>>> server process. However, bind(2) might be used by (TCP) clients to set the >>>> source port to a (legitimate) value. Controlling the ports that can be >>>> used for listening would allow (TCP) clients to explicitly bind to ports >>>> that are forbidden for listening. >>>> >>>> Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP >>>> access right that restricts listening on undesired ports with listen(2). >>> >>> Nit: I would turn around the first two commit message paragraphs and describe >>> your changes first, before explaining the problems in the bind(2) support. I >>> was initially a bit confused that the description started talking about >>> LANDLOCK_ACCESS_NET_BIND_TCP. >>> >>> General recommendations at: >>> https://www.kernel.org/doc/html/v6.10/process/submitting-patches.html#describe-your-changes >> >> I consider the first paragraph as a problem statement for this patch. >> According to linux recommendations problem should be established before >> the description of changes. Do you think that the changes part should >> stand before the problem anyway? > > Up to you. To be fair, I'm sold on the approach in this patchset anyway :) Nice :) > > >>> When we have the documentation wording finalized, >>> please send an update to the man pages as well, >>> for this and other documentation updates. >> >> Should I send it after this patchset would be accepted? > > Yes, that would be the normal process which we have been following so far. > > (I don't like the process much either, because it decouples feature development > so far from documentation writing, but it's what we have for now.) > > An example patch which does that for the network bind(2) and connect(2) features > (and where I would still like a review from Konstantin) is: > https://lore.kernel.org/all/20240723101917.90918-1-gnoack@google.com/ got it > > >>> Small remarks on what I've done here: >>> >>> * I am avoiding the word "binding" when referring to the automatic assignment to >>> an ephemeral port - IMHO, this is potentially confusing, since bind(2) is not >>> explicitly called. >>> * I am also dropping the "It should be noted" / "Note that" phrase, which is >>> frowned upon in man pages. >> >> Didn't know that, thanks > > Regarding "note that", see > https://lore.kernel.org/all/0aafcdd6-4ac7-8501-c607-9a24a98597d7@gmail.com/ > https://lore.kernel.org/linux-man/20210729223535.qvyomfqvvahzmu5w@localhost.localdomain/ > https://lore.kernel.org/linux-man/20230105225235.6cjtz6orjzxzvo6v@illithid/ > (The "Kemper notectomy") > > This came up in man page reviews, but we'll have an easier time keeping the > kernel and man page documentation in sync if we adhere to man page style > directly. (The man page style is documented in man-pages(7) and contains some > groff-independent wording advice as well.) Ok, such phrases should be really avoided in kernel as well. > > >>> If I understand correctly, these are cases where we use TCP on top of protocols >>> that are not IP (or have an additional layer in the middle, like TLS?). This >>> can not be recognized through the socket family or type? >> >> ULP can be used in the context of TCP protocols as an additional layer >> (currently supported only by IP and MPTCP), so it cannot be recognized >> with family or type. You can check this test [1] in which TCP IP socket >> is created with ULP control hook. >> >> [1] https://lore.kernel.org/all/20240728002602.3198398-8-ivanov.mikhail1@huawei-partners.com/ > > Thanks, this is helpful. > > For reference, it seems that ULP were introduced in > https://lore.kernel.org/all/20170614183714.GA80310@davejwatson-mba.dhcp.thefacebook.com/ > > >>> Do we have cases where we can run TCP on top of something else than plain IPv4 >>> or IPv6, where the clone method exists? >> >> Yeah, MPTCP protocol for example (see net/mptcp/subflow.c). ULP control >> hook is supported only by IP and MPTCP, and in both cases >> clone method is checked during listen(2) execution. > > >>> Aren't the socket type and family checks duplicated with existing logic that we >>> have for the connect(2) and bind(2) support? Should it be deduplicated, or is >>> that too messy? >> >> bind(2) and connect(2) hooks also support AF_UNSPEC family, so I think >> such helper is gonna complicate code a little bit. Also it can >> complicate switch in current_check_access_socket(). > > OK, sounds good.
On Thu, Aug 01, 2024 at 10:52:25AM +0300, Mikhail Ivanov wrote: > 7/31/2024 9:30 PM, Mickaël Salaün wrote: > > On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: > > > LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" > > > ports to forbid a malicious sandboxed process to impersonate a legitimate > > > server process. However, bind(2) might be used by (TCP) clients to set the > > > source port to a (legitimate) value. Controlling the ports that can be > > > used for listening would allow (TCP) clients to explicitly bind to ports > > > that are forbidden for listening. > > > > > > Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP > > > access right that restricts listening on undesired ports with listen(2). > > > > > > It's worth noticing that this access right doesn't affect changing > > > backlog value using listen(2) on already listening socket. > > > > > > * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. > > > * Add hook to socket_listen(), which checks whether the socket is allowed > > > to listen on a binded local port. > > > * Add check_tcp_socket_can_listen() helper, which validates socket > > > attributes before the actual access right check. > > > * Update `struct landlock_net_port_attr` documentation with control of > > > binding to ephemeral port with listen(2) description. > > > * Change ABI version to 6. > > > > > > Closes: https://github.com/landlock-lsm/linux/issues/15 > > > Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> > > > > Thanks for this series! > > > > I cannot apply this patch series though, could you please provide the > > base commit? BTW, this can be automatically put in the cover letter > > with the git format-patch's --base argument. > > base-commit: 591561c2b47b7e7225e229e844f5de75ce0c09ec Thanks, the following commit makes this series to not apply. > > Günther said that I should rebase to the latest commits, so I'll do > it in the next version of this patchset. Yep, currently we're on v6.11-rc1, but please specify the base commit each time. > > > > > > --- > > > include/uapi/linux/landlock.h | 23 +++-- > > > security/landlock/limits.h | 2 +- > > > security/landlock/net.c | 90 ++++++++++++++++++++ > > > security/landlock/syscalls.c | 2 +- > > > tools/testing/selftests/landlock/base_test.c | 2 +- > > > 5 files changed, 108 insertions(+), 11 deletions(-) > > > > > > diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h > > > index 68625e728f43..6b8df3293eee 100644 > > > --- a/include/uapi/linux/landlock.h > > > +++ b/include/uapi/linux/landlock.h > > > @@ -104,13 +104,16 @@ struct landlock_net_port_attr { > > > /** > > > * @port: Network port in host endianness. > > > * > > > - * It should be noted that port 0 passed to :manpage:`bind(2)` will > > > - * bind to an available port from a specific port range. This can be > > > - * configured thanks to the ``/proc/sys/net/ipv4/ip_local_port_range`` > > > - * sysctl (also used for IPv6). A Landlock rule with port 0 and the > > > - * ``LANDLOCK_ACCESS_NET_BIND_TCP`` right means that requesting to bind > > > - * on port 0 is allowed and it will automatically translate to binding > > > - * on the related port range. > > > + * It should be noted that some operations cause binding socket to a random > > > + * available port from a specific port range. This can be configured thanks > > > + * to the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for > > > + * IPv6). Following operation requests are automatically translate to > > > + * binding on the related port range: > > > + * > > > + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` > > > + * right means that binding on port 0 is allowed. > > > + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` > > > + * right means listening without an explicit binding is allowed. > > > */ > > > __u64 port; > > > }; > > > @@ -251,7 +254,7 @@ struct landlock_net_port_attr { > > > * DOC: net_access > > > * > > > * Network flags > > > - * ~~~~~~~~~~~~~~~~ > > > + * ~~~~~~~~~~~~~ > > > * > > > * These flags enable to restrict a sandboxed process to a set of network > > > * actions. This is supported since the Landlock ABI version 4. > > > @@ -261,9 +264,13 @@ struct landlock_net_port_attr { > > > * - %LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. > > > * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to > > > * a remote port. > > > + * - %LANDLOCK_ACCESS_NET_LISTEN_TCP: Listen for TCP socket connections on > > > + * a local port. This access right is available since the sixth version > > > + * of the Landlock ABI. > > > */ > > > /* clang-format off */ > > > #define LANDLOCK_ACCESS_NET_BIND_TCP (1ULL << 0) > > > #define LANDLOCK_ACCESS_NET_CONNECT_TCP (1ULL << 1) > > > +#define LANDLOCK_ACCESS_NET_LISTEN_TCP (1ULL << 2) > > > /* clang-format on */ > > > #endif /* _UAPI_LINUX_LANDLOCK_H */ > > > diff --git a/security/landlock/limits.h b/security/landlock/limits.h > > > index 4eb643077a2a..2ef147389474 100644 > > > --- a/security/landlock/limits.h > > > +++ b/security/landlock/limits.h > > > @@ -22,7 +22,7 @@ > > > #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) > > > #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) > > > -#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_CONNECT_TCP > > > +#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_LISTEN_TCP > > > #define LANDLOCK_MASK_ACCESS_NET ((LANDLOCK_LAST_ACCESS_NET << 1) - 1) > > > #define LANDLOCK_NUM_ACCESS_NET __const_hweight64(LANDLOCK_MASK_ACCESS_NET) > > > diff --git a/security/landlock/net.c b/security/landlock/net.c > > > index 669ba260342f..a29cb27c3f14 100644 > > > --- a/security/landlock/net.c > > > +++ b/security/landlock/net.c > > > @@ -6,10 +6,12 @@ > > > * Copyright © 2022-2023 Microsoft Corporation > > > */ > > > +#include "net/sock.h" > > > > These should not be quotes. > > will be fixed, thanks > > > > > > #include <linux/in.h> > > > #include <linux/net.h> > > > #include <linux/socket.h> > > > #include <net/ipv6.h> > > > +#include <net/tcp.h> > > > #include "common.h" > > > #include "cred.h" > > > @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, > > > LANDLOCK_ACCESS_NET_CONNECT_TCP); > > > } > > > +/* > > > + * Checks that socket state and attributes are correct for listen. > > > + * It is required to not wrongfully return -EACCES instead of -EINVAL. > > > + * > > > + * This checker requires sock->sk to be locked. > > > + */ > > > +static int check_tcp_socket_can_listen(struct socket *const sock) > > > > Is this function still useful with the listen LSM hook? > > Yeap, we need to validate socket structure before checking the access > right. You can see [1] and [2] where the behavior of this function is > tested. > > [1] https://lore.kernel.org/all/20240728002602.3198398-6-ivanov.mikhail1@huawei-partners.com/ > [2] https://lore.kernel.org/all/20240728002602.3198398-8-ivanov.mikhail1@huawei-partners.com/ OK, that's good. > > > > > > +{ > > > + struct sock *sk = sock->sk; > > > + unsigned char cur_sk_state = sk->sk_state; > > > + const struct tcp_ulp_ops *icsk_ulp_ops; > > > + > > > + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ > > > + if (sock->state != SS_UNCONNECTED) > > > + return -EINVAL; > > > + > > > + /* > > > + * Checks sock state. This is needed to ensure consistency with inet stack > > > + * error handling (cf. __inet_listen_sk). > > > + */ > > > + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) > > > + return -EINVAL; > > > + > > > + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; > > > + > > > + /* > > > + * ULP (Upper Layer Protocol) stands for protocols which are higher than > > > + * transport protocol in OSI model. Linux has an infrastructure that > > > + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). > > > + * > > > + * Sockets can listen only if ULP control hook has clone method. > > > + */ > > > + if (icsk_ulp_ops && !icsk_ulp_ops->clone) > > > + return -EINVAL; > > > + return 0; > > > +} > > > + > > > +static int hook_socket_listen(struct socket *const sock, const int backlog) > > > +{ > > > > Why can't we just call current_check_access_socket()? > > I've mentioned in the message of the previous commit that this method > has address checks for bind(2) and connect(2). In the case of listen(2) > port is extracted from the socket structure, so calling > current_check_access_socket() would be pointless. Yep, I missed the check_access_socket() refactoring. > > > > > > + int err = 0; > > > + int family; > > > + __be16 port; > > > + struct sock *sk; > > > + const struct landlock_ruleset *const dom = get_current_net_domain(); > > > + > > > + if (!dom) > > > + return 0; > > > + if (WARN_ON_ONCE(dom->num_layers < 1)) > > > + return -EACCES; > > > + > > > + /* Checks if it's a (potential) TCP socket. */ > > > + if (sock->type != SOCK_STREAM) > > > + return 0; > > > + > > > + sk = sock->sk; > > > + family = sk->__sk_common.skc_family; > > > + /* > > > + * Socket cannot be assigned AF_UNSPEC because this type is used only > > > + * in the context of addresses. > > > + * > > > + * Doesn't restrict listening for non-TCP sockets. > > > + */ > > > + if (family != AF_INET && family != AF_INET6) > > > + return 0; > > > + > > > + lock_sock(sk); > > > + /* > > > + * Calling listen(2) for a listening socket does nothing with its state and > > > + * only changes backlog value (cf. __inet_listen_sk). Checking of listen > > > + * access right is not required. > > > + */ > > > + if (sk->sk_state == TCP_LISTEN) > > > + goto release_nocheck; > > > + > > > + err = check_tcp_socket_can_listen(sock); > > > + if (unlikely(err)) > > > + goto release_nocheck; > > > + > > > + port = htons(inet_sk(sk)->inet_num); > > > + release_sock(sk); > > > + return check_access_socket(dom, port, LANDLOCK_ACCESS_NET_LISTEN_TCP); > > > + > > > +release_nocheck: > > > + release_sock(sk); > > > + return err; > > > +} > > > + > > > static struct security_hook_list landlock_hooks[] __ro_after_init = { > > > LSM_HOOK_INIT(socket_bind, hook_socket_bind), > > > LSM_HOOK_INIT(socket_connect, hook_socket_connect), > > > + LSM_HOOK_INIT(socket_listen, hook_socket_listen), > > > }; > > > __init void landlock_add_net_hooks(void) > > > diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c > > > index 03b470f5a85a..3752bcc033d4 100644 > > > --- a/security/landlock/syscalls.c > > > +++ b/security/landlock/syscalls.c > > > @@ -149,7 +149,7 @@ static const struct file_operations ruleset_fops = { > > > .write = fop_dummy_write, > > > }; > > > -#define LANDLOCK_ABI_VERSION 5 > > > +#define LANDLOCK_ABI_VERSION 6 > > > /** > > > * sys_landlock_create_ruleset - Create a new ruleset > > > diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c > > > index 3c1e9f35b531..52b00472a487 100644 > > > --- a/tools/testing/selftests/landlock/base_test.c > > > +++ b/tools/testing/selftests/landlock/base_test.c > > > @@ -75,7 +75,7 @@ TEST(abi_version) > > > const struct landlock_ruleset_attr ruleset_attr = { > > > .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE, > > > }; > > > - ASSERT_EQ(5, landlock_create_ruleset(NULL, 0, > > > + ASSERT_EQ(6, landlock_create_ruleset(NULL, 0, > > > LANDLOCK_CREATE_RULESET_VERSION)); > > > ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0, > > > -- > > > 2.34.1 > > > > > > >
On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: > LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" > ports to forbid a malicious sandboxed process to impersonate a legitimate > server process. However, bind(2) might be used by (TCP) clients to set the > source port to a (legitimate) value. Controlling the ports that can be > used for listening would allow (TCP) clients to explicitly bind to ports > that are forbidden for listening. > > Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP > access right that restricts listening on undesired ports with listen(2). > > It's worth noticing that this access right doesn't affect changing > backlog value using listen(2) on already listening socket. > > * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. > * Add hook to socket_listen(), which checks whether the socket is allowed > to listen on a binded local port. > * Add check_tcp_socket_can_listen() helper, which validates socket > attributes before the actual access right check. > * Update `struct landlock_net_port_attr` documentation with control of > binding to ephemeral port with listen(2) description. > * Change ABI version to 6. > > Closes: https://github.com/landlock-lsm/linux/issues/15 > Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> > --- > include/uapi/linux/landlock.h | 23 +++-- > security/landlock/limits.h | 2 +- > security/landlock/net.c | 90 ++++++++++++++++++++ > security/landlock/syscalls.c | 2 +- > tools/testing/selftests/landlock/base_test.c | 2 +- > 5 files changed, 108 insertions(+), 11 deletions(-) > diff --git a/security/landlock/net.c b/security/landlock/net.c > index 669ba260342f..a29cb27c3f14 100644 > --- a/security/landlock/net.c > +++ b/security/landlock/net.c > @@ -6,10 +6,12 @@ > * Copyright © 2022-2023 Microsoft Corporation > */ > > +#include "net/sock.h" > #include <linux/in.h> > #include <linux/net.h> > #include <linux/socket.h> > #include <net/ipv6.h> > +#include <net/tcp.h> > > #include "common.h" > #include "cred.h" > @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, > LANDLOCK_ACCESS_NET_CONNECT_TCP); > } > > +/* > + * Checks that socket state and attributes are correct for listen. > + * It is required to not wrongfully return -EACCES instead of -EINVAL. > + * > + * This checker requires sock->sk to be locked. > + */ > +static int check_tcp_socket_can_listen(struct socket *const sock) > +{ > + struct sock *sk = sock->sk; > + unsigned char cur_sk_state = sk->sk_state; > + const struct tcp_ulp_ops *icsk_ulp_ops; > + I think we can add this assert: lockdep_assert_held(&sk->sk_lock.slock); > + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ > + if (sock->state != SS_UNCONNECTED) > + return -EINVAL; > + > + /* > + * Checks sock state. This is needed to ensure consistency with inet stack > + * error handling (cf. __inet_listen_sk). > + */ > + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) > + return -EINVAL; > + > + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; > + > + /* > + * ULP (Upper Layer Protocol) stands for protocols which are higher than > + * transport protocol in OSI model. Linux has an infrastructure that > + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). > + * > + * Sockets can listen only if ULP control hook has clone method. > + */ > + if (icsk_ulp_ops && !icsk_ulp_ops->clone) > + return -EINVAL; > + return 0; > +}
8/1/2024 5:45 PM, Mickaël Salaün wrote: > On Thu, Aug 01, 2024 at 10:52:25AM +0300, Mikhail Ivanov wrote: >> 7/31/2024 9:30 PM, Mickaël Salaün wrote: >>> On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: >>>> LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" >>>> ports to forbid a malicious sandboxed process to impersonate a legitimate >>>> server process. However, bind(2) might be used by (TCP) clients to set the >>>> source port to a (legitimate) value. Controlling the ports that can be >>>> used for listening would allow (TCP) clients to explicitly bind to ports >>>> that are forbidden for listening. >>>> >>>> Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP >>>> access right that restricts listening on undesired ports with listen(2). >>>> >>>> It's worth noticing that this access right doesn't affect changing >>>> backlog value using listen(2) on already listening socket. >>>> >>>> * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. >>>> * Add hook to socket_listen(), which checks whether the socket is allowed >>>> to listen on a binded local port. >>>> * Add check_tcp_socket_can_listen() helper, which validates socket >>>> attributes before the actual access right check. >>>> * Update `struct landlock_net_port_attr` documentation with control of >>>> binding to ephemeral port with listen(2) description. >>>> * Change ABI version to 6. >>>> >>>> Closes: https://github.com/landlock-lsm/linux/issues/15 >>>> Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> >>> >>> Thanks for this series! >>> >>> I cannot apply this patch series though, could you please provide the >>> base commit? BTW, this can be automatically put in the cover letter >>> with the git format-patch's --base argument. >> >> base-commit: 591561c2b47b7e7225e229e844f5de75ce0c09ec > > Thanks, the following commit makes this series to not apply. Sorry, you mean that the series are succesfully applied, right? > >> >> Günther said that I should rebase to the latest commits, so I'll do >> it in the next version of this patchset. > > Yep, currently we're on v6.11-rc1, but please specify the base commit > each time. ok > >> >>> >>>> --- >>>> include/uapi/linux/landlock.h | 23 +++-- >>>> security/landlock/limits.h | 2 +- >>>> security/landlock/net.c | 90 ++++++++++++++++++++ >>>> security/landlock/syscalls.c | 2 +- >>>> tools/testing/selftests/landlock/base_test.c | 2 +- >>>> 5 files changed, 108 insertions(+), 11 deletions(-) >>>> >>>> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h >>>> index 68625e728f43..6b8df3293eee 100644 >>>> --- a/include/uapi/linux/landlock.h >>>> +++ b/include/uapi/linux/landlock.h >>>> @@ -104,13 +104,16 @@ struct landlock_net_port_attr { >>>> /** >>>> * @port: Network port in host endianness. >>>> * >>>> - * It should be noted that port 0 passed to :manpage:`bind(2)` will >>>> - * bind to an available port from a specific port range. This can be >>>> - * configured thanks to the ``/proc/sys/net/ipv4/ip_local_port_range`` >>>> - * sysctl (also used for IPv6). A Landlock rule with port 0 and the >>>> - * ``LANDLOCK_ACCESS_NET_BIND_TCP`` right means that requesting to bind >>>> - * on port 0 is allowed and it will automatically translate to binding >>>> - * on the related port range. >>>> + * It should be noted that some operations cause binding socket to a random >>>> + * available port from a specific port range. This can be configured thanks >>>> + * to the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for >>>> + * IPv6). Following operation requests are automatically translate to >>>> + * binding on the related port range: >>>> + * >>>> + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` >>>> + * right means that binding on port 0 is allowed. >>>> + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` >>>> + * right means listening without an explicit binding is allowed. >>>> */ >>>> __u64 port; >>>> }; >>>> @@ -251,7 +254,7 @@ struct landlock_net_port_attr { >>>> * DOC: net_access >>>> * >>>> * Network flags >>>> - * ~~~~~~~~~~~~~~~~ >>>> + * ~~~~~~~~~~~~~ >>>> * >>>> * These flags enable to restrict a sandboxed process to a set of network >>>> * actions. This is supported since the Landlock ABI version 4. >>>> @@ -261,9 +264,13 @@ struct landlock_net_port_attr { >>>> * - %LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. >>>> * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to >>>> * a remote port. >>>> + * - %LANDLOCK_ACCESS_NET_LISTEN_TCP: Listen for TCP socket connections on >>>> + * a local port. This access right is available since the sixth version >>>> + * of the Landlock ABI. >>>> */ >>>> /* clang-format off */ >>>> #define LANDLOCK_ACCESS_NET_BIND_TCP (1ULL << 0) >>>> #define LANDLOCK_ACCESS_NET_CONNECT_TCP (1ULL << 1) >>>> +#define LANDLOCK_ACCESS_NET_LISTEN_TCP (1ULL << 2) >>>> /* clang-format on */ >>>> #endif /* _UAPI_LINUX_LANDLOCK_H */ >>>> diff --git a/security/landlock/limits.h b/security/landlock/limits.h >>>> index 4eb643077a2a..2ef147389474 100644 >>>> --- a/security/landlock/limits.h >>>> +++ b/security/landlock/limits.h >>>> @@ -22,7 +22,7 @@ >>>> #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) >>>> #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) >>>> -#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_CONNECT_TCP >>>> +#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_LISTEN_TCP >>>> #define LANDLOCK_MASK_ACCESS_NET ((LANDLOCK_LAST_ACCESS_NET << 1) - 1) >>>> #define LANDLOCK_NUM_ACCESS_NET __const_hweight64(LANDLOCK_MASK_ACCESS_NET) >>>> diff --git a/security/landlock/net.c b/security/landlock/net.c >>>> index 669ba260342f..a29cb27c3f14 100644 >>>> --- a/security/landlock/net.c >>>> +++ b/security/landlock/net.c >>>> @@ -6,10 +6,12 @@ >>>> * Copyright © 2022-2023 Microsoft Corporation >>>> */ >>>> +#include "net/sock.h" >>> >>> These should not be quotes. >> >> will be fixed, thanks >> >>> >>>> #include <linux/in.h> >>>> #include <linux/net.h> >>>> #include <linux/socket.h> >>>> #include <net/ipv6.h> >>>> +#include <net/tcp.h> >>>> #include "common.h" >>>> #include "cred.h" >>>> @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, >>>> LANDLOCK_ACCESS_NET_CONNECT_TCP); >>>> } >>>> +/* >>>> + * Checks that socket state and attributes are correct for listen. >>>> + * It is required to not wrongfully return -EACCES instead of -EINVAL. >>>> + * >>>> + * This checker requires sock->sk to be locked. >>>> + */ >>>> +static int check_tcp_socket_can_listen(struct socket *const sock) >>> >>> Is this function still useful with the listen LSM hook? >> >> Yeap, we need to validate socket structure before checking the access >> right. You can see [1] and [2] where the behavior of this function is >> tested. >> >> [1] https://lore.kernel.org/all/20240728002602.3198398-6-ivanov.mikhail1@huawei-partners.com/ >> [2] https://lore.kernel.org/all/20240728002602.3198398-8-ivanov.mikhail1@huawei-partners.com/ > > OK, that's good. > >> >>> >>>> +{ >>>> + struct sock *sk = sock->sk; >>>> + unsigned char cur_sk_state = sk->sk_state; >>>> + const struct tcp_ulp_ops *icsk_ulp_ops; >>>> + >>>> + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ >>>> + if (sock->state != SS_UNCONNECTED) >>>> + return -EINVAL; >>>> + >>>> + /* >>>> + * Checks sock state. This is needed to ensure consistency with inet stack >>>> + * error handling (cf. __inet_listen_sk). >>>> + */ >>>> + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) >>>> + return -EINVAL; >>>> + >>>> + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; >>>> + >>>> + /* >>>> + * ULP (Upper Layer Protocol) stands for protocols which are higher than >>>> + * transport protocol in OSI model. Linux has an infrastructure that >>>> + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). >>>> + * >>>> + * Sockets can listen only if ULP control hook has clone method. >>>> + */ >>>> + if (icsk_ulp_ops && !icsk_ulp_ops->clone) >>>> + return -EINVAL; >>>> + return 0; >>>> +} >>>> + >>>> +static int hook_socket_listen(struct socket *const sock, const int backlog) >>>> +{ >>> >>> Why can't we just call current_check_access_socket()? >> >> I've mentioned in the message of the previous commit that this method >> has address checks for bind(2) and connect(2). In the case of listen(2) >> port is extracted from the socket structure, so calling >> current_check_access_socket() would be pointless. > > Yep, I missed the check_access_socket() refactoring. > >> >>> >>>> + int err = 0; >>>> + int family; >>>> + __be16 port; >>>> + struct sock *sk; >>>> + const struct landlock_ruleset *const dom = get_current_net_domain(); >>>> + >>>> + if (!dom) >>>> + return 0; >>>> + if (WARN_ON_ONCE(dom->num_layers < 1)) >>>> + return -EACCES; >>>> + >>>> + /* Checks if it's a (potential) TCP socket. */ >>>> + if (sock->type != SOCK_STREAM) >>>> + return 0; >>>> + >>>> + sk = sock->sk; >>>> + family = sk->__sk_common.skc_family; >>>> + /* >>>> + * Socket cannot be assigned AF_UNSPEC because this type is used only >>>> + * in the context of addresses. >>>> + * >>>> + * Doesn't restrict listening for non-TCP sockets. >>>> + */ >>>> + if (family != AF_INET && family != AF_INET6) >>>> + return 0; >>>> + >>>> + lock_sock(sk); >>>> + /* >>>> + * Calling listen(2) for a listening socket does nothing with its state and >>>> + * only changes backlog value (cf. __inet_listen_sk). Checking of listen >>>> + * access right is not required. >>>> + */ >>>> + if (sk->sk_state == TCP_LISTEN) >>>> + goto release_nocheck; >>>> + >>>> + err = check_tcp_socket_can_listen(sock); >>>> + if (unlikely(err)) >>>> + goto release_nocheck; >>>> + >>>> + port = htons(inet_sk(sk)->inet_num); >>>> + release_sock(sk); >>>> + return check_access_socket(dom, port, LANDLOCK_ACCESS_NET_LISTEN_TCP); >>>> + >>>> +release_nocheck: >>>> + release_sock(sk); >>>> + return err; >>>> +} >>>> + >>>> static struct security_hook_list landlock_hooks[] __ro_after_init = { >>>> LSM_HOOK_INIT(socket_bind, hook_socket_bind), >>>> LSM_HOOK_INIT(socket_connect, hook_socket_connect), >>>> + LSM_HOOK_INIT(socket_listen, hook_socket_listen), >>>> }; >>>> __init void landlock_add_net_hooks(void) >>>> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c >>>> index 03b470f5a85a..3752bcc033d4 100644 >>>> --- a/security/landlock/syscalls.c >>>> +++ b/security/landlock/syscalls.c >>>> @@ -149,7 +149,7 @@ static const struct file_operations ruleset_fops = { >>>> .write = fop_dummy_write, >>>> }; >>>> -#define LANDLOCK_ABI_VERSION 5 >>>> +#define LANDLOCK_ABI_VERSION 6 >>>> /** >>>> * sys_landlock_create_ruleset - Create a new ruleset >>>> diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c >>>> index 3c1e9f35b531..52b00472a487 100644 >>>> --- a/tools/testing/selftests/landlock/base_test.c >>>> +++ b/tools/testing/selftests/landlock/base_test.c >>>> @@ -75,7 +75,7 @@ TEST(abi_version) >>>> const struct landlock_ruleset_attr ruleset_attr = { >>>> .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE, >>>> }; >>>> - ASSERT_EQ(5, landlock_create_ruleset(NULL, 0, >>>> + ASSERT_EQ(6, landlock_create_ruleset(NULL, 0, >>>> LANDLOCK_CREATE_RULESET_VERSION)); >>>> ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0, >>>> -- >>>> 2.34.1 >>>> >>>> >>
On Thu, Aug 01, 2024 at 06:34:41PM +0300, Mikhail Ivanov wrote: > 8/1/2024 5:45 PM, Mickaël Salaün wrote: > > On Thu, Aug 01, 2024 at 10:52:25AM +0300, Mikhail Ivanov wrote: > > > 7/31/2024 9:30 PM, Mickaël Salaün wrote: > > > > On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: > > > > > LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" > > > > > ports to forbid a malicious sandboxed process to impersonate a legitimate > > > > > server process. However, bind(2) might be used by (TCP) clients to set the > > > > > source port to a (legitimate) value. Controlling the ports that can be > > > > > used for listening would allow (TCP) clients to explicitly bind to ports > > > > > that are forbidden for listening. > > > > > > > > > > Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP > > > > > access right that restricts listening on undesired ports with listen(2). > > > > > > > > > > It's worth noticing that this access right doesn't affect changing > > > > > backlog value using listen(2) on already listening socket. > > > > > > > > > > * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. > > > > > * Add hook to socket_listen(), which checks whether the socket is allowed > > > > > to listen on a binded local port. > > > > > * Add check_tcp_socket_can_listen() helper, which validates socket > > > > > attributes before the actual access right check. > > > > > * Update `struct landlock_net_port_attr` documentation with control of > > > > > binding to ephemeral port with listen(2) description. > > > > > * Change ABI version to 6. > > > > > > > > > > Closes: https://github.com/landlock-lsm/linux/issues/15 > > > > > Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> > > > > > > > > Thanks for this series! > > > > > > > > I cannot apply this patch series though, could you please provide the > > > > base commit? BTW, this can be automatically put in the cover letter > > > > with the git format-patch's --base argument. > > > > > > base-commit: 591561c2b47b7e7225e229e844f5de75ce0c09ec > > > > Thanks, the following commit makes this series to not apply. > > Sorry, you mean that the series are succesfully applied, right? Yes, it works with the commit you provided. I was talking about a next (logical) commit f4b89d8ce5a8 ("landlock: Various documentation improvements") which makes your series not apply, but that's OK now.
8/1/2024 5:45 PM, Mickaël Salaün wrote: > On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: >> LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" >> ports to forbid a malicious sandboxed process to impersonate a legitimate >> server process. However, bind(2) might be used by (TCP) clients to set the >> source port to a (legitimate) value. Controlling the ports that can be >> used for listening would allow (TCP) clients to explicitly bind to ports >> that are forbidden for listening. >> >> Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP >> access right that restricts listening on undesired ports with listen(2). >> >> It's worth noticing that this access right doesn't affect changing >> backlog value using listen(2) on already listening socket. >> >> * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. >> * Add hook to socket_listen(), which checks whether the socket is allowed >> to listen on a binded local port. >> * Add check_tcp_socket_can_listen() helper, which validates socket >> attributes before the actual access right check. >> * Update `struct landlock_net_port_attr` documentation with control of >> binding to ephemeral port with listen(2) description. >> * Change ABI version to 6. >> >> Closes: https://github.com/landlock-lsm/linux/issues/15 >> Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> >> --- >> include/uapi/linux/landlock.h | 23 +++-- >> security/landlock/limits.h | 2 +- >> security/landlock/net.c | 90 ++++++++++++++++++++ >> security/landlock/syscalls.c | 2 +- >> tools/testing/selftests/landlock/base_test.c | 2 +- >> 5 files changed, 108 insertions(+), 11 deletions(-) > >> diff --git a/security/landlock/net.c b/security/landlock/net.c >> index 669ba260342f..a29cb27c3f14 100644 >> --- a/security/landlock/net.c >> +++ b/security/landlock/net.c >> @@ -6,10 +6,12 @@ >> * Copyright © 2022-2023 Microsoft Corporation >> */ >> >> +#include "net/sock.h" >> #include <linux/in.h> >> #include <linux/net.h> >> #include <linux/socket.h> >> #include <net/ipv6.h> >> +#include <net/tcp.h> >> >> #include "common.h" >> #include "cred.h" >> @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, >> LANDLOCK_ACCESS_NET_CONNECT_TCP); >> } >> >> +/* >> + * Checks that socket state and attributes are correct for listen. >> + * It is required to not wrongfully return -EACCES instead of -EINVAL. >> + * >> + * This checker requires sock->sk to be locked. >> + */ >> +static int check_tcp_socket_can_listen(struct socket *const sock) >> +{ >> + struct sock *sk = sock->sk; >> + unsigned char cur_sk_state = sk->sk_state; >> + const struct tcp_ulp_ops *icsk_ulp_ops; >> + > > I think we can add this assert: > lockdep_assert_held(&sk->sk_lock.slock); Ok, let's add it. I just haven't seen this being a common practice in the network stack. > >> + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ >> + if (sock->state != SS_UNCONNECTED) >> + return -EINVAL; >> + >> + /* >> + * Checks sock state. This is needed to ensure consistency with inet stack >> + * error handling (cf. __inet_listen_sk). >> + */ >> + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) >> + return -EINVAL; >> + >> + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; >> + >> + /* >> + * ULP (Upper Layer Protocol) stands for protocols which are higher than >> + * transport protocol in OSI model. Linux has an infrastructure that >> + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). >> + * >> + * Sockets can listen only if ULP control hook has clone method. >> + */ >> + if (icsk_ulp_ops && !icsk_ulp_ops->clone) >> + return -EINVAL; >> + return 0; >> +}
8/1/2024 7:01 PM, Mickaël Salaün wrote: > On Thu, Aug 01, 2024 at 06:34:41PM +0300, Mikhail Ivanov wrote: >> 8/1/2024 5:45 PM, Mickaël Salaün wrote: >>> On Thu, Aug 01, 2024 at 10:52:25AM +0300, Mikhail Ivanov wrote: >>>> 7/31/2024 9:30 PM, Mickaël Salaün wrote: >>>>> On Sun, Jul 28, 2024 at 08:25:55AM +0800, Mikhail Ivanov wrote: >>>>>> LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" >>>>>> ports to forbid a malicious sandboxed process to impersonate a legitimate >>>>>> server process. However, bind(2) might be used by (TCP) clients to set the >>>>>> source port to a (legitimate) value. Controlling the ports that can be >>>>>> used for listening would allow (TCP) clients to explicitly bind to ports >>>>>> that are forbidden for listening. >>>>>> >>>>>> Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP >>>>>> access right that restricts listening on undesired ports with listen(2). >>>>>> >>>>>> It's worth noticing that this access right doesn't affect changing >>>>>> backlog value using listen(2) on already listening socket. >>>>>> >>>>>> * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. >>>>>> * Add hook to socket_listen(), which checks whether the socket is allowed >>>>>> to listen on a binded local port. >>>>>> * Add check_tcp_socket_can_listen() helper, which validates socket >>>>>> attributes before the actual access right check. >>>>>> * Update `struct landlock_net_port_attr` documentation with control of >>>>>> binding to ephemeral port with listen(2) description. >>>>>> * Change ABI version to 6. >>>>>> >>>>>> Closes: https://github.com/landlock-lsm/linux/issues/15 >>>>>> Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> >>>>> >>>>> Thanks for this series! >>>>> >>>>> I cannot apply this patch series though, could you please provide the >>>>> base commit? BTW, this can be automatically put in the cover letter >>>>> with the git format-patch's --base argument. >>>> >>>> base-commit: 591561c2b47b7e7225e229e844f5de75ce0c09ec >>> >>> Thanks, the following commit makes this series to not apply. >> >> Sorry, you mean that the series are succesfully applied, right? > > Yes, it works with the commit you provided. I was talking about a next > (logical) commit f4b89d8ce5a8 ("landlock: Various documentation > improvements") which makes your series not apply, but that's OK now. Nice
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h index 68625e728f43..6b8df3293eee 100644 --- a/include/uapi/linux/landlock.h +++ b/include/uapi/linux/landlock.h @@ -104,13 +104,16 @@ struct landlock_net_port_attr { /** * @port: Network port in host endianness. * - * It should be noted that port 0 passed to :manpage:`bind(2)` will - * bind to an available port from a specific port range. This can be - * configured thanks to the ``/proc/sys/net/ipv4/ip_local_port_range`` - * sysctl (also used for IPv6). A Landlock rule with port 0 and the - * ``LANDLOCK_ACCESS_NET_BIND_TCP`` right means that requesting to bind - * on port 0 is allowed and it will automatically translate to binding - * on the related port range. + * It should be noted that some operations cause binding socket to a random + * available port from a specific port range. This can be configured thanks + * to the ``/proc/sys/net/ipv4/ip_local_port_range`` sysctl (also used for + * IPv6). Following operation requests are automatically translate to + * binding on the related port range: + * + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_BIND_TCP`` + * right means that binding on port 0 is allowed. + * - A Landlock rule with port 0 and the ``LANDLOCK_ACCESS_NET_LISTEN_TCP`` + * right means listening without an explicit binding is allowed. */ __u64 port; }; @@ -251,7 +254,7 @@ struct landlock_net_port_attr { * DOC: net_access * * Network flags - * ~~~~~~~~~~~~~~~~ + * ~~~~~~~~~~~~~ * * These flags enable to restrict a sandboxed process to a set of network * actions. This is supported since the Landlock ABI version 4. @@ -261,9 +264,13 @@ struct landlock_net_port_attr { * - %LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. * - %LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to * a remote port. + * - %LANDLOCK_ACCESS_NET_LISTEN_TCP: Listen for TCP socket connections on + * a local port. This access right is available since the sixth version + * of the Landlock ABI. */ /* clang-format off */ #define LANDLOCK_ACCESS_NET_BIND_TCP (1ULL << 0) #define LANDLOCK_ACCESS_NET_CONNECT_TCP (1ULL << 1) +#define LANDLOCK_ACCESS_NET_LISTEN_TCP (1ULL << 2) /* clang-format on */ #endif /* _UAPI_LINUX_LANDLOCK_H */ diff --git a/security/landlock/limits.h b/security/landlock/limits.h index 4eb643077a2a..2ef147389474 100644 --- a/security/landlock/limits.h +++ b/security/landlock/limits.h @@ -22,7 +22,7 @@ #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) -#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_CONNECT_TCP +#define LANDLOCK_LAST_ACCESS_NET LANDLOCK_ACCESS_NET_LISTEN_TCP #define LANDLOCK_MASK_ACCESS_NET ((LANDLOCK_LAST_ACCESS_NET << 1) - 1) #define LANDLOCK_NUM_ACCESS_NET __const_hweight64(LANDLOCK_MASK_ACCESS_NET) diff --git a/security/landlock/net.c b/security/landlock/net.c index 669ba260342f..a29cb27c3f14 100644 --- a/security/landlock/net.c +++ b/security/landlock/net.c @@ -6,10 +6,12 @@ * Copyright © 2022-2023 Microsoft Corporation */ +#include "net/sock.h" #include <linux/in.h> #include <linux/net.h> #include <linux/socket.h> #include <net/ipv6.h> +#include <net/tcp.h> #include "common.h" #include "cred.h" @@ -194,9 +196,97 @@ static int hook_socket_connect(struct socket *const sock, LANDLOCK_ACCESS_NET_CONNECT_TCP); } +/* + * Checks that socket state and attributes are correct for listen. + * It is required to not wrongfully return -EACCES instead of -EINVAL. + * + * This checker requires sock->sk to be locked. + */ +static int check_tcp_socket_can_listen(struct socket *const sock) +{ + struct sock *sk = sock->sk; + unsigned char cur_sk_state = sk->sk_state; + const struct tcp_ulp_ops *icsk_ulp_ops; + + /* Allows only unconnected TCP socket to listen (cf. inet_listen). */ + if (sock->state != SS_UNCONNECTED) + return -EINVAL; + + /* + * Checks sock state. This is needed to ensure consistency with inet stack + * error handling (cf. __inet_listen_sk). + */ + if (WARN_ON_ONCE(!((1 << cur_sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))) + return -EINVAL; + + icsk_ulp_ops = inet_csk(sk)->icsk_ulp_ops; + + /* + * ULP (Upper Layer Protocol) stands for protocols which are higher than + * transport protocol in OSI model. Linux has an infrastructure that + * allows TCP sockets to support logic of some ULP (e.g. TLS ULP). + * + * Sockets can listen only if ULP control hook has clone method. + */ + if (icsk_ulp_ops && !icsk_ulp_ops->clone) + return -EINVAL; + return 0; +} + +static int hook_socket_listen(struct socket *const sock, const int backlog) +{ + int err = 0; + int family; + __be16 port; + struct sock *sk; + const struct landlock_ruleset *const dom = get_current_net_domain(); + + if (!dom) + return 0; + if (WARN_ON_ONCE(dom->num_layers < 1)) + return -EACCES; + + /* Checks if it's a (potential) TCP socket. */ + if (sock->type != SOCK_STREAM) + return 0; + + sk = sock->sk; + family = sk->__sk_common.skc_family; + /* + * Socket cannot be assigned AF_UNSPEC because this type is used only + * in the context of addresses. + * + * Doesn't restrict listening for non-TCP sockets. + */ + if (family != AF_INET && family != AF_INET6) + return 0; + + lock_sock(sk); + /* + * Calling listen(2) for a listening socket does nothing with its state and + * only changes backlog value (cf. __inet_listen_sk). Checking of listen + * access right is not required. + */ + if (sk->sk_state == TCP_LISTEN) + goto release_nocheck; + + err = check_tcp_socket_can_listen(sock); + if (unlikely(err)) + goto release_nocheck; + + port = htons(inet_sk(sk)->inet_num); + release_sock(sk); + return check_access_socket(dom, port, LANDLOCK_ACCESS_NET_LISTEN_TCP); + +release_nocheck: + release_sock(sk); + return err; +} + static struct security_hook_list landlock_hooks[] __ro_after_init = { LSM_HOOK_INIT(socket_bind, hook_socket_bind), LSM_HOOK_INIT(socket_connect, hook_socket_connect), + LSM_HOOK_INIT(socket_listen, hook_socket_listen), }; __init void landlock_add_net_hooks(void) diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c index 03b470f5a85a..3752bcc033d4 100644 --- a/security/landlock/syscalls.c +++ b/security/landlock/syscalls.c @@ -149,7 +149,7 @@ static const struct file_operations ruleset_fops = { .write = fop_dummy_write, }; -#define LANDLOCK_ABI_VERSION 5 +#define LANDLOCK_ABI_VERSION 6 /** * sys_landlock_create_ruleset - Create a new ruleset diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c index 3c1e9f35b531..52b00472a487 100644 --- a/tools/testing/selftests/landlock/base_test.c +++ b/tools/testing/selftests/landlock/base_test.c @@ -75,7 +75,7 @@ TEST(abi_version) const struct landlock_ruleset_attr ruleset_attr = { .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE, }; - ASSERT_EQ(5, landlock_create_ruleset(NULL, 0, + ASSERT_EQ(6, landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION)); ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
LANDLOCK_ACCESS_NET_BIND_TCP is useful to limit the scope of "bindable" ports to forbid a malicious sandboxed process to impersonate a legitimate server process. However, bind(2) might be used by (TCP) clients to set the source port to a (legitimate) value. Controlling the ports that can be used for listening would allow (TCP) clients to explicitly bind to ports that are forbidden for listening. Such control is implemented with a new LANDLOCK_ACCESS_NET_LISTEN_TCP access right that restricts listening on undesired ports with listen(2). It's worth noticing that this access right doesn't affect changing backlog value using listen(2) on already listening socket. * Create new LANDLOCK_ACCESS_NET_LISTEN_TCP flag. * Add hook to socket_listen(), which checks whether the socket is allowed to listen on a binded local port. * Add check_tcp_socket_can_listen() helper, which validates socket attributes before the actual access right check. * Update `struct landlock_net_port_attr` documentation with control of binding to ephemeral port with listen(2) description. * Change ABI version to 6. Closes: https://github.com/landlock-lsm/linux/issues/15 Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> --- include/uapi/linux/landlock.h | 23 +++-- security/landlock/limits.h | 2 +- security/landlock/net.c | 90 ++++++++++++++++++++ security/landlock/syscalls.c | 2 +- tools/testing/selftests/landlock/base_test.c | 2 +- 5 files changed, 108 insertions(+), 11 deletions(-)