mbox series

[v9,00/12] Network support for Landlock

Message ID 20230116085818.165539-1-konstantin.meskhidze@huawei.com (mailing list archive)
Headers show
Series Network support for Landlock | expand

Message

Konstantin Meskhidze (A) Jan. 16, 2023, 8:58 a.m. UTC
Hi,
This is a new V9 patch related to Landlock LSM network confinement.
It is based on the landlock's -next branch on top of v6.2-rc3 kernel version:
https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next

It brings refactoring of previous patch version V8.
Mostly there are fixes of logic and typos, adding new tests.

All test were run in QEMU evironment and compiled with
 -static flag.
 1. network_test: 32/32 tests passed.
 2. base_test: 7/7 tests passed.
 3. fs_test: 78/78 tests passed.
 4. ptrace_test: 8/8 tests passed.

Previous versions:
v8: https://lore.kernel.org/linux-security-module/20221021152644.155136-1-konstantin.meskhidze@huawei.com/
v7: https://lore.kernel.org/linux-security-module/20220829170401.834298-1-konstantin.meskhidze@huawei.com/
v6: https://lore.kernel.org/linux-security-module/20220621082313.3330667-1-konstantin.meskhidze@huawei.com/
v5: https://lore.kernel.org/linux-security-module/20220516152038.39594-1-konstantin.meskhidze@huawei.com
v4: https://lore.kernel.org/linux-security-module/20220309134459.6448-1-konstantin.meskhidze@huawei.com/
v3: https://lore.kernel.org/linux-security-module/20220124080215.265538-1-konstantin.meskhidze@huawei.com/
v2: https://lore.kernel.org/linux-security-module/20211228115212.703084-1-konstantin.meskhidze@huawei.com/
v1: https://lore.kernel.org/linux-security-module/20211210072123.386713-1-konstantin.meskhidze@huawei.com/

Konstantin Meskhidze (11):
  landlock: Make ruleset's access masks more generic
  landlock: Refactor landlock_find_rule/insert_rule
  landlock: Refactor merge/inherit_ruleset functions
  landlock: Move and rename umask_layers() and init_layer_masks()
  landlock: Refactor _unmask_layers() and _init_layer_masks()
  landlock: Refactor landlock_add_rule() syscall
  landlock: Add network rules and TCP hooks support
  selftests/landlock: Share enforce_ruleset()
  selftests/landlock: Add 10 new test suites dedicated to network
  samples/landlock: Add network demo
  landlock: Document Landlock's network support

Mickaël Salaün (1):
  landlock: Allow filesystem layout changes for domains without such
    rule type

 Documentation/userspace-api/landlock.rst     |   72 +-
 include/uapi/linux/landlock.h                |   49 +
 samples/landlock/sandboxer.c                 |  131 +-
 security/landlock/Kconfig                    |    1 +
 security/landlock/Makefile                   |    2 +
 security/landlock/fs.c                       |  255 ++--
 security/landlock/limits.h                   |    7 +-
 security/landlock/net.c                      |  200 +++
 security/landlock/net.h                      |   26 +
 security/landlock/ruleset.c                  |  409 +++++--
 security/landlock/ruleset.h                  |  185 ++-
 security/landlock/setup.c                    |    2 +
 security/landlock/syscalls.c                 |  165 ++-
 tools/testing/selftests/landlock/base_test.c |    2 +-
 tools/testing/selftests/landlock/common.h    |   10 +
 tools/testing/selftests/landlock/config      |    4 +
 tools/testing/selftests/landlock/fs_test.c   |   75 +-
 tools/testing/selftests/landlock/net_test.c  | 1157 ++++++++++++++++++
 18 files changed, 2398 insertions(+), 354 deletions(-)
 create mode 100644 security/landlock/net.c
 create mode 100644 security/landlock/net.h
 create mode 100644 tools/testing/selftests/landlock/net_test.c

Comments

Günther Noack Feb. 23, 2023, 10:17 p.m. UTC | #1
Hello Konstantin!

Sorry for asking such fundamental questions again so late in the review.

After playing with patch V9 with the Go-Landlock library, I'm still
having trouble understanding these questions -- they probably have
good answers, but I also did not see them explained in the
documentation. Maybe it would help to clarify it there?

* What is the use case for permitting processes to connect to a given
  TCP port, but leaving unspecified what the IP address is?

  Example: If a Landlock ruleset permits connecting to TCP port 53,
  that makes it possible to talk to any IP address on the internet (at
  least if the process runs on a normal Linux desktop machine), and we
  can't really control whether that is the system's proper (TCP-)DNS
  server or whether that is an attacker-controlled service for
  accepting leaked secrets from the process...?

  Is the plan that IP address support should be added in a follow-up
  patch?  Will it become part of the landlock_net_service_attr struct?

* Given the list of obscure network protocols listed in the socket(2)
  man page, I find it slightly weird to have rules for the use of TCP,
  but to leave less prominent protocols unrestricted.

  For example, a process with an enabled Landlock network ruleset may
  connect only to certain TCP ports, but at the same time it can
  happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?

  I'm mentioning these more obscure protocols, because I doubt that
  Landlock will grow more sophisticated support for them anytime soon,
  so maybe the best option would be to just make it possible to
  disable these?  Is that also part of the plan?

  (I think there would be a lot of value in restricting network
  access, even when it's done very broadly.  There are many programs
  that don't need network at all, and among those that do need
  network, most only require IP networking.

  Btw, the argument for more broad disabling of network access was
  already made at https://cr.yp.to/unix/disablenetwork.html in the
  past.)

* This one is more of an implementation question: I don't understand
  why we are storing the networking rules in the same RB tree as the
  file system rules. - It looks a bit like "YAGNI" to me...?

  Would it be more efficient to keep the file system rules in the
  existing RB tree, and store the networking rules *separately* next
  to it in a different RB tree, or even in a more optimized data
  structure? In pseudocode:

    struct fast_lookup_int_set bind_tcp_ports;
    struct fast_lookup_int_set connect_tcp_ports;
    struct landlock_rb_tree fs_rules;

  It seems that there should be a data structure that supports this
  well and which uses the fact that we only need to store small
  integers?

Thanks,
–Günther

P.S.: Apologies if some of it was discussed previously. I did my best
to catch up on previous threads, but it's long, and it's possible that
I missed parts of the discussion.

On Mon, Jan 16, 2023 at 04:58:06PM +0800, Konstantin Meskhidze wrote:
> Hi,
> This is a new V9 patch related to Landlock LSM network confinement.
> It is based on the landlock's -next branch on top of v6.2-rc3 kernel version:
> https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
> 
> It brings refactoring of previous patch version V8.
> Mostly there are fixes of logic and typos, adding new tests.
> 
> All test were run in QEMU evironment and compiled with
>  -static flag.
>  1. network_test: 32/32 tests passed.
>  2. base_test: 7/7 tests passed.
>  3. fs_test: 78/78 tests passed.
>  4. ptrace_test: 8/8 tests passed.
> 
> Previous versions:
> v8: https://lore.kernel.org/linux-security-module/20221021152644.155136-1-konstantin.meskhidze@huawei.com/
> v7: https://lore.kernel.org/linux-security-module/20220829170401.834298-1-konstantin.meskhidze@huawei.com/
> v6: https://lore.kernel.org/linux-security-module/20220621082313.3330667-1-konstantin.meskhidze@huawei.com/
> v5: https://lore.kernel.org/linux-security-module/20220516152038.39594-1-konstantin.meskhidze@huawei.com
> v4: https://lore.kernel.org/linux-security-module/20220309134459.6448-1-konstantin.meskhidze@huawei.com/
> v3: https://lore.kernel.org/linux-security-module/20220124080215.265538-1-konstantin.meskhidze@huawei.com/
> v2: https://lore.kernel.org/linux-security-module/20211228115212.703084-1-konstantin.meskhidze@huawei.com/
> v1: https://lore.kernel.org/linux-security-module/20211210072123.386713-1-konstantin.meskhidze@huawei.com/
> 
> Konstantin Meskhidze (11):
>   landlock: Make ruleset's access masks more generic
>   landlock: Refactor landlock_find_rule/insert_rule
>   landlock: Refactor merge/inherit_ruleset functions
>   landlock: Move and rename umask_layers() and init_layer_masks()
>   landlock: Refactor _unmask_layers() and _init_layer_masks()
>   landlock: Refactor landlock_add_rule() syscall
>   landlock: Add network rules and TCP hooks support
>   selftests/landlock: Share enforce_ruleset()
>   selftests/landlock: Add 10 new test suites dedicated to network
>   samples/landlock: Add network demo
>   landlock: Document Landlock's network support
> 
> Mickaël Salaün (1):
>   landlock: Allow filesystem layout changes for domains without such
>     rule type
> 
>  Documentation/userspace-api/landlock.rst     |   72 +-
>  include/uapi/linux/landlock.h                |   49 +
>  samples/landlock/sandboxer.c                 |  131 +-
>  security/landlock/Kconfig                    |    1 +
>  security/landlock/Makefile                   |    2 +
>  security/landlock/fs.c                       |  255 ++--
>  security/landlock/limits.h                   |    7 +-
>  security/landlock/net.c                      |  200 +++
>  security/landlock/net.h                      |   26 +
>  security/landlock/ruleset.c                  |  409 +++++--
>  security/landlock/ruleset.h                  |  185 ++-
>  security/landlock/setup.c                    |    2 +
>  security/landlock/syscalls.c                 |  165 ++-
>  tools/testing/selftests/landlock/base_test.c |    2 +-
>  tools/testing/selftests/landlock/common.h    |   10 +
>  tools/testing/selftests/landlock/config      |    4 +
>  tools/testing/selftests/landlock/fs_test.c   |   75 +-
>  tools/testing/selftests/landlock/net_test.c  | 1157 ++++++++++++++++++
>  18 files changed, 2398 insertions(+), 354 deletions(-)
>  create mode 100644 security/landlock/net.c
>  create mode 100644 security/landlock/net.h
>  create mode 100644 tools/testing/selftests/landlock/net_test.c
> 
> -- 
> 2.25.1
>
Konstantin Meskhidze (A) March 6, 2023, 7:45 a.m. UTC | #2
2/24/2023 1:17 AM, Günther Noack пишет:
> Hello Konstantin!

   Hi, Günther.
   Sorry for late reply. I was in short vacation.
   Thanks for the questions. I will try to give you answers ASAP.
> 
> Sorry for asking such fundamental questions again so late in the review.
> 
> After playing with patch V9 with the Go-Landlock library, I'm still
> having trouble understanding these questions -- they probably have
> good answers, but I also did not see them explained in the
> documentation. Maybe it would help to clarify it there?
> 
> * What is the use case for permitting processes to connect to a given
>    TCP port, but leaving unspecified what the IP address is?
> 
>    Example: If a Landlock ruleset permits connecting to TCP port 53,
>    that makes it possible to talk to any IP address on the internet (at
>    least if the process runs on a normal Linux desktop machine), and we
>    can't really control whether that is the system's proper (TCP-)DNS
>    server or whether that is an attacker-controlled service for
>    accepting leaked secrets from the process...?
> 
>    Is the plan that IP address support should be added in a follow-up
>    patch?  Will it become part of the landlock_net_service_attr struct?
> 
> * Given the list of obscure network protocols listed in the socket(2)
>    man page, I find it slightly weird to have rules for the use of TCP,
>    but to leave less prominent protocols unrestricted.
> 
>    For example, a process with an enabled Landlock network ruleset may
>    connect only to certain TCP ports, but at the same time it can
>    happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?
> 
>    I'm mentioning these more obscure protocols, because I doubt that
>    Landlock will grow more sophisticated support for them anytime soon,
>    so maybe the best option would be to just make it possible to
>    disable these?  Is that also part of the plan?
> 
>    (I think there would be a lot of value in restricting network
>    access, even when it's done very broadly.  There are many programs
>    that don't need network at all, and among those that do need
>    network, most only require IP networking.
> 
>    Btw, the argument for more broad disabling of network access was
>    already made at https://cr.yp.to/unix/disablenetwork.html in the
>    past.)
> 
> * This one is more of an implementation question: I don't understand
>    why we are storing the networking rules in the same RB tree as the
>    file system rules. - It looks a bit like "YAGNI" to me...?
> 
>    Would it be more efficient to keep the file system rules in the
>    existing RB tree, and store the networking rules *separately* next
>    to it in a different RB tree, or even in a more optimized data
>    structure? In pseudocode:
> 
>      struct fast_lookup_int_set bind_tcp_ports;
>      struct fast_lookup_int_set connect_tcp_ports;
>      struct landlock_rb_tree fs_rules;
> 
>    It seems that there should be a data structure that supports this
>    well and which uses the fact that we only need to store small
>    integers?
> 
> Thanks,
> –Günther
> 
> P.S.: Apologies if some of it was discussed previously. I did my best
> to catch up on previous threads, but it's long, and it's possible that
> I missed parts of the discussion.
> 
> On Mon, Jan 16, 2023 at 04:58:06PM +0800, Konstantin Meskhidze wrote:
>> Hi,
>> This is a new V9 patch related to Landlock LSM network confinement.
>> It is based on the landlock's -next branch on top of v6.2-rc3 kernel version:
>> https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
>> 
>> It brings refactoring of previous patch version V8.
>> Mostly there are fixes of logic and typos, adding new tests.
>> 
>> All test were run in QEMU evironment and compiled with
>>  -static flag.
>>  1. network_test: 32/32 tests passed.
>>  2. base_test: 7/7 tests passed.
>>  3. fs_test: 78/78 tests passed.
>>  4. ptrace_test: 8/8 tests passed.
>> 
>> Previous versions:
>> v8: https://lore.kernel.org/linux-security-module/20221021152644.155136-1-konstantin.meskhidze@huawei.com/
>> v7: https://lore.kernel.org/linux-security-module/20220829170401.834298-1-konstantin.meskhidze@huawei.com/
>> v6: https://lore.kernel.org/linux-security-module/20220621082313.3330667-1-konstantin.meskhidze@huawei.com/
>> v5: https://lore.kernel.org/linux-security-module/20220516152038.39594-1-konstantin.meskhidze@huawei.com
>> v4: https://lore.kernel.org/linux-security-module/20220309134459.6448-1-konstantin.meskhidze@huawei.com/
>> v3: https://lore.kernel.org/linux-security-module/20220124080215.265538-1-konstantin.meskhidze@huawei.com/
>> v2: https://lore.kernel.org/linux-security-module/20211228115212.703084-1-konstantin.meskhidze@huawei.com/
>> v1: https://lore.kernel.org/linux-security-module/20211210072123.386713-1-konstantin.meskhidze@huawei.com/
>> 
>> Konstantin Meskhidze (11):
>>   landlock: Make ruleset's access masks more generic
>>   landlock: Refactor landlock_find_rule/insert_rule
>>   landlock: Refactor merge/inherit_ruleset functions
>>   landlock: Move and rename umask_layers() and init_layer_masks()
>>   landlock: Refactor _unmask_layers() and _init_layer_masks()
>>   landlock: Refactor landlock_add_rule() syscall
>>   landlock: Add network rules and TCP hooks support
>>   selftests/landlock: Share enforce_ruleset()
>>   selftests/landlock: Add 10 new test suites dedicated to network
>>   samples/landlock: Add network demo
>>   landlock: Document Landlock's network support
>> 
>> Mickaël Salaün (1):
>>   landlock: Allow filesystem layout changes for domains without such
>>     rule type
>> 
>>  Documentation/userspace-api/landlock.rst     |   72 +-
>>  include/uapi/linux/landlock.h                |   49 +
>>  samples/landlock/sandboxer.c                 |  131 +-
>>  security/landlock/Kconfig                    |    1 +
>>  security/landlock/Makefile                   |    2 +
>>  security/landlock/fs.c                       |  255 ++--
>>  security/landlock/limits.h                   |    7 +-
>>  security/landlock/net.c                      |  200 +++
>>  security/landlock/net.h                      |   26 +
>>  security/landlock/ruleset.c                  |  409 +++++--
>>  security/landlock/ruleset.h                  |  185 ++-
>>  security/landlock/setup.c                    |    2 +
>>  security/landlock/syscalls.c                 |  165 ++-
>>  tools/testing/selftests/landlock/base_test.c |    2 +-
>>  tools/testing/selftests/landlock/common.h    |   10 +
>>  tools/testing/selftests/landlock/config      |    4 +
>>  tools/testing/selftests/landlock/fs_test.c   |   75 +-
>>  tools/testing/selftests/landlock/net_test.c  | 1157 ++++++++++++++++++
>>  18 files changed, 2398 insertions(+), 354 deletions(-)
>>  create mode 100644 security/landlock/net.c
>>  create mode 100644 security/landlock/net.h
>>  create mode 100644 tools/testing/selftests/landlock/net_test.c
>> 
>> -- 
>> 2.25.1
>> 
> .
Konstantin Meskhidze (A) March 13, 2023, 5:16 p.m. UTC | #3
2/24/2023 1:17 AM, Günther Noack пишет:
> Hello Konstantin!
> 
> Sorry for asking such fundamental questions again so late in the review.
> 
> After playing with patch V9 with the Go-Landlock library, I'm still
> having trouble understanding these questions -- they probably have
> good answers, but I also did not see them explained in the
> documentation. Maybe it would help to clarify it there?
> 
> * What is the use case for permitting processes to connect to a given
>    TCP port, but leaving unspecified what the IP address is?
> 
>    Example: If a Landlock ruleset permits connecting to TCP port 53,
>    that makes it possible to talk to any IP address on the internet (at
>    least if the process runs on a normal Linux desktop machine), and we
>    can't really control whether that is the system's proper (TCP-)DNS
>    server or whether that is an attacker-controlled service for
>    accepting leaked secrets from the process...?
> 
>    Is the plan that IP address support should be added in a follow-up
>    patch?  Will it become part of the landlock_net_service_attr struct?

      In the beginning I introduced the idea with IP address to
Mickaël but he suggested to use port-based granularity. So with ports 
it's worth using Landlock in containerized applications working within 
one IP address. Anyway it's possible to use netfilter to control 
incoming traffic. It's a good question - we should discuss it carefuly.
> 
> * Given the list of obscure network protocols listed in the socket(2)
>    man page, I find it slightly weird to have rules for the use of TCP,
>    but to leave less prominent protocols unrestricted.
> 
>    For example, a process with an enabled Landlock network ruleset may
>    connect only to certain TCP ports, but at the same time it can
>    happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?

      We also have started a discussion about UDP protocol, but it's 
more complicated since UDP sockets does not establish connections 
between each other. There is a performance problem on the first place here.

I'm not familiar with Bluetooth/CAN bus/DECnet/IPX but let's discuss it.
Any ideas here?

> 
>    I'm mentioning these more obscure protocols, because I doubt that
>    Landlock will grow more sophisticated support for them anytime soon,
>    so maybe the best option would be to just make it possible to
>    disable these?  Is that also part of the plan?
> 
>    (I think there would be a lot of value in restricting network
>    access, even when it's done very broadly.  There are many programs
>    that don't need network at all, and among those that do need
>    network, most only require IP networking.
> 
>    Btw, the argument for more broad disabling of network access was
>    already made at https://cr.yp.to/unix/disablenetwork.html in the
>    past.)
      Thanks for the link. I will read it.
> 
> * This one is more of an implementation question: I don't understand
>    why we are storing the networking rules in the same RB tree as the
>    file system rules. - It looks a bit like "YAGNI" to me...?

      Actually network rules are stored in a different RB tree.
      You can check it in struct landlock_ruleset (ruleset.h):
      - struct rb_root root_inodeis for fs rules

      - struct rb_root root_net_port is for network rules;
> 
>    Would it be more efficient to keep the file system rules in the
>    existing RB tree, and store the networking rules *separately* next
>    to it in a different RB tree, or even in a more optimized data
>    structure? In pseudocode:
> 
>      struct fast_lookup_int_set bind_tcp_ports;
>      struct fast_lookup_int_set connect_tcp_ports;
>      struct landlock_rb_tree fs_rules;
> 
>    It seems that there should be a data structure that supports this
>    well and which uses the fact that we only need to store small
>    integers?

      Thnaks for the question. From my point of view it depends on a 
real scenario - how many ports we want to allow by Landlock for a 
proccess - thousands, hundreds or less. If it's just 10 ports - do we 
really need some optimized data structure? Do we get some performance 
gain here?
What do you think?
> 
> Thanks,
> –Günther
> 
> P.S.: Apologies if some of it was discussed previously. I did my best
> to catch up on previous threads, but it's long, and it's possible that
> I missed parts of the discussion.
> 
> On Mon, Jan 16, 2023 at 04:58:06PM +0800, Konstantin Meskhidze wrote:
>> Hi,
>> This is a new V9 patch related to Landlock LSM network confinement.
>> It is based on the landlock's -next branch on top of v6.2-rc3 kernel version:
>> https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
>> 
>> It brings refactoring of previous patch version V8.
>> Mostly there are fixes of logic and typos, adding new tests.
>> 
>> All test were run in QEMU evironment and compiled with
>>  -static flag.
>>  1. network_test: 32/32 tests passed.
>>  2. base_test: 7/7 tests passed.
>>  3. fs_test: 78/78 tests passed.
>>  4. ptrace_test: 8/8 tests passed.
>> 
>> Previous versions:
>> v8: https://lore.kernel.org/linux-security-module/20221021152644.155136-1-konstantin.meskhidze@huawei.com/
>> v7: https://lore.kernel.org/linux-security-module/20220829170401.834298-1-konstantin.meskhidze@huawei.com/
>> v6: https://lore.kernel.org/linux-security-module/20220621082313.3330667-1-konstantin.meskhidze@huawei.com/
>> v5: https://lore.kernel.org/linux-security-module/20220516152038.39594-1-konstantin.meskhidze@huawei.com
>> v4: https://lore.kernel.org/linux-security-module/20220309134459.6448-1-konstantin.meskhidze@huawei.com/
>> v3: https://lore.kernel.org/linux-security-module/20220124080215.265538-1-konstantin.meskhidze@huawei.com/
>> v2: https://lore.kernel.org/linux-security-module/20211228115212.703084-1-konstantin.meskhidze@huawei.com/
>> v1: https://lore.kernel.org/linux-security-module/20211210072123.386713-1-konstantin.meskhidze@huawei.com/
>> 
>> Konstantin Meskhidze (11):
>>   landlock: Make ruleset's access masks more generic
>>   landlock: Refactor landlock_find_rule/insert_rule
>>   landlock: Refactor merge/inherit_ruleset functions
>>   landlock: Move and rename umask_layers() and init_layer_masks()
>>   landlock: Refactor _unmask_layers() and _init_layer_masks()
>>   landlock: Refactor landlock_add_rule() syscall
>>   landlock: Add network rules and TCP hooks support
>>   selftests/landlock: Share enforce_ruleset()
>>   selftests/landlock: Add 10 new test suites dedicated to network
>>   samples/landlock: Add network demo
>>   landlock: Document Landlock's network support
>> 
>> Mickaël Salaün (1):
>>   landlock: Allow filesystem layout changes for domains without such
>>     rule type
>> 
>>  Documentation/userspace-api/landlock.rst     |   72 +-
>>  include/uapi/linux/landlock.h                |   49 +
>>  samples/landlock/sandboxer.c                 |  131 +-
>>  security/landlock/Kconfig                    |    1 +
>>  security/landlock/Makefile                   |    2 +
>>  security/landlock/fs.c                       |  255 ++--
>>  security/landlock/limits.h                   |    7 +-
>>  security/landlock/net.c                      |  200 +++
>>  security/landlock/net.h                      |   26 +
>>  security/landlock/ruleset.c                  |  409 +++++--
>>  security/landlock/ruleset.h                  |  185 ++-
>>  security/landlock/setup.c                    |    2 +
>>  security/landlock/syscalls.c                 |  165 ++-
>>  tools/testing/selftests/landlock/base_test.c |    2 +-
>>  tools/testing/selftests/landlock/common.h    |   10 +
>>  tools/testing/selftests/landlock/config      |    4 +
>>  tools/testing/selftests/landlock/fs_test.c   |   75 +-
>>  tools/testing/selftests/landlock/net_test.c  | 1157 ++++++++++++++++++
>>  18 files changed, 2398 insertions(+), 354 deletions(-)
>>  create mode 100644 security/landlock/net.c
>>  create mode 100644 security/landlock/net.h
>>  create mode 100644 tools/testing/selftests/landlock/net_test.c
>> 
>> -- 
>> 2.25.1
>> 
> .
Mickaël Salaün March 14, 2023, 1:28 p.m. UTC | #4
On 13/03/2023 18:16, Konstantin Meskhidze (A) wrote:
> 
> 
> 2/24/2023 1:17 AM, Günther Noack пишет:
>> Hello Konstantin!
>>
>> Sorry for asking such fundamental questions again so late in the review.
>>
>> After playing with patch V9 with the Go-Landlock library, I'm still
>> having trouble understanding these questions -- they probably have
>> good answers, but I also did not see them explained in the
>> documentation. Maybe it would help to clarify it there?
>>
>> * What is the use case for permitting processes to connect to a given
>>     TCP port, but leaving unspecified what the IP address is?
>>
>>     Example: If a Landlock ruleset permits connecting to TCP port 53,
>>     that makes it possible to talk to any IP address on the internet (at
>>     least if the process runs on a normal Linux desktop machine), and we
>>     can't really control whether that is the system's proper (TCP-)DNS
>>     server or whether that is an attacker-controlled service for
>>     accepting leaked secrets from the process...?
>>
>>     Is the plan that IP address support should be added in a follow-up
>>     patch?  Will it become part of the landlock_net_service_attr struct?
> 
>        In the beginning I introduced the idea with IP address to
> Mickaël but he suggested to use port-based granularity. So with ports
> it's worth using Landlock in containerized applications working within
> one IP address. Anyway it's possible to use netfilter to control
> incoming traffic. It's a good question - we should discuss it carefuly.

Limiting access rule definition to TCP ports is because of two reasons:
- practical one: start with something small and improve it. All new 
features should be covered by tests, and it takes time to write and 
review them, especially to cover IPv4, IPv6 and any other type of 
address to identify a server. Being able to control TCP connect and bind 
is useful and brings the scaffolding for other non-kernel-object 
restrictions.
- semantic one: ports are tied to well-known (or configured) services, 
whatever the network where a process is (e.g. Internet, LAN, container's 
network namespace, VM). However, IP addresses are not well-known but 
(most of the time) tied to names/DNS, which is not handled by the kernel 
but user space. Moreover, I think it makes sense for app/service 
developers to think about reachable services, but much less about 
servers, which depend on the local network and a system-wide configuration.

For some use cases, there is definitely a need to restrict access to a 
set of servers though. I think a new dedicated attr struct would be 
easier to handle and it would make more sense to compose them (ANDing 
all network rule types to make a final decision). This new struct could 
define different kind of subnets (IPv4, IPv6, ethernet, bluetooth…). One 
of this type could be the local link, and especially if the server is 
local to the system or not (i.e. loopback interface), and if the server 
is in a specified network namespace (e.g. specific container/pod).

Anyway, this should indeed be documented.


>>
>> * Given the list of obscure network protocols listed in the socket(2)
>>     man page, I find it slightly weird to have rules for the use of TCP,
>>     but to leave less prominent protocols unrestricted.
>>
>>     For example, a process with an enabled Landlock network ruleset may
>>     connect only to certain TCP ports, but at the same time it can
>>     happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?
> 
>        We also have started a discussion about UDP protocol, but it's
> more complicated since UDP sockets does not establish connections
> between each other. There is a performance problem on the first place here.
> 
> I'm not familiar with Bluetooth/CAN bus/DECnet/IPX but let's discuss it.
> Any ideas here?

All these protocols should be handled one way or another someday. ;)


> 
>>
>>     I'm mentioning these more obscure protocols, because I doubt that
>>     Landlock will grow more sophisticated support for them anytime soon,
>>     so maybe the best option would be to just make it possible to
>>     disable these?  Is that also part of the plan?
>>
>>     (I think there would be a lot of value in restricting network
>>     access, even when it's done very broadly.  There are many programs
>>     that don't need network at all, and among those that do need
>>     network, most only require IP networking.

Indeed, protocols that nobody care to make Landlock supports them will 
probably not have fine-grained control. We could extend the ruleset 
attributes to disable the use (i.e. not only the creation of new related 
sockets/resources) of network protocol families, in a way that would 
make sandboxes simulate a kernel without such protocol support. In this 
case, this should be an allowed list of protocols, and everything not in 
that list should be denied. This approach could be used for other kernel 
features (unrelated to network).


>>
>>     Btw, the argument for more broad disabling of network access was
>>     already made at https://cr.yp.to/unix/disablenetwork.html in the
>>     past.)

This is interesting but scoped to a single use case. As specified at the 
beginning of this linked page, there must be exceptions, not only with 
AF_UNIX but also for (the newer) AF_VSOCK, and probably future ones. 
This is why I don't think a binary approach is a good one for Linux. 
Users should be able to specify what they need, and block the rest.


>        Thanks for the link. I will read it.
>>
>> * This one is more of an implementation question: I don't understand
>>     why we are storing the networking rules in the same RB tree as the
>>     file system rules. - It looks a bit like "YAGNI" to me...?
> 
>        Actually network rules are stored in a different RB tree.
>        You can check it in struct landlock_ruleset (ruleset.h):
>        - struct rb_root root_inodeis for fs rules
> 
>        - struct rb_root root_net_port is for network rules;
>>
>>     Would it be more efficient to keep the file system rules in the
>>     existing RB tree, and store the networking rules *separately* next
>>     to it in a different RB tree, or even in a more optimized data
>>     structure? In pseudocode:
>>
>>       struct fast_lookup_int_set bind_tcp_ports;
>>       struct fast_lookup_int_set connect_tcp_ports;
>>       struct landlock_rb_tree fs_rules;
>>
>>     It seems that there should be a data structure that supports this
>>     well and which uses the fact that we only need to store small
>>     integers?
> 
>        Thnaks for the question. From my point of view it depends on a
> real scenario - how many ports we want to allow by Landlock for a
> proccess - thousands, hundreds or less. If it's just 10 ports - do we
> really need some optimized data structure? Do we get some performance
> gain here?
> What do you think?

As Konstantin explained, there are two different red-black trees. This 
data structure may not be optimal but it is much easier to start with that.

Using one tree per right would increase the size, especially for each 
new access right, but it is worth thinking about a new data structure 
dealing with sets (and ranges) of numbers.

Talking about performance optimization, the first step would be to use a 
hash table for domain's inode identification.


>>
>> Thanks,
>> –Günther
>>
>> P.S.: Apologies if some of it was discussed previously. I did my best
>> to catch up on previous threads, but it's long, and it's possible that
>> I missed parts of the discussion.
>>
>> On Mon, Jan 16, 2023 at 04:58:06PM +0800, Konstantin Meskhidze wrote:
>>> Hi,
>>> This is a new V9 patch related to Landlock LSM network confinement.
>>> It is based on the landlock's -next branch on top of v6.2-rc3 kernel version:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
>>>
>>> It brings refactoring of previous patch version V8.
>>> Mostly there are fixes of logic and typos, adding new tests.
>>>
>>> All test were run in QEMU evironment and compiled with
>>>   -static flag.
>>>   1. network_test: 32/32 tests passed.
>>>   2. base_test: 7/7 tests passed.
>>>   3. fs_test: 78/78 tests passed.
>>>   4. ptrace_test: 8/8 tests passed.
>>>
>>> Previous versions:
>>> v8: https://lore.kernel.org/linux-security-module/20221021152644.155136-1-konstantin.meskhidze@huawei.com/
>>> v7: https://lore.kernel.org/linux-security-module/20220829170401.834298-1-konstantin.meskhidze@huawei.com/
>>> v6: https://lore.kernel.org/linux-security-module/20220621082313.3330667-1-konstantin.meskhidze@huawei.com/
>>> v5: https://lore.kernel.org/linux-security-module/20220516152038.39594-1-konstantin.meskhidze@huawei.com
>>> v4: https://lore.kernel.org/linux-security-module/20220309134459.6448-1-konstantin.meskhidze@huawei.com/
>>> v3: https://lore.kernel.org/linux-security-module/20220124080215.265538-1-konstantin.meskhidze@huawei.com/
>>> v2: https://lore.kernel.org/linux-security-module/20211228115212.703084-1-konstantin.meskhidze@huawei.com/
>>> v1: https://lore.kernel.org/linux-security-module/20211210072123.386713-1-konstantin.meskhidze@huawei.com/
>>>
>>> Konstantin Meskhidze (11):
>>>    landlock: Make ruleset's access masks more generic
>>>    landlock: Refactor landlock_find_rule/insert_rule
>>>    landlock: Refactor merge/inherit_ruleset functions
>>>    landlock: Move and rename umask_layers() and init_layer_masks()
>>>    landlock: Refactor _unmask_layers() and _init_layer_masks()
>>>    landlock: Refactor landlock_add_rule() syscall
>>>    landlock: Add network rules and TCP hooks support
>>>    selftests/landlock: Share enforce_ruleset()
>>>    selftests/landlock: Add 10 new test suites dedicated to network
>>>    samples/landlock: Add network demo
>>>    landlock: Document Landlock's network support
>>>
>>> Mickaël Salaün (1):
>>>    landlock: Allow filesystem layout changes for domains without such
>>>      rule type
>>>
>>>   Documentation/userspace-api/landlock.rst     |   72 +-
>>>   include/uapi/linux/landlock.h                |   49 +
>>>   samples/landlock/sandboxer.c                 |  131 +-
>>>   security/landlock/Kconfig                    |    1 +
>>>   security/landlock/Makefile                   |    2 +
>>>   security/landlock/fs.c                       |  255 ++--
>>>   security/landlock/limits.h                   |    7 +-
>>>   security/landlock/net.c                      |  200 +++
>>>   security/landlock/net.h                      |   26 +
>>>   security/landlock/ruleset.c                  |  409 +++++--
>>>   security/landlock/ruleset.h                  |  185 ++-
>>>   security/landlock/setup.c                    |    2 +
>>>   security/landlock/syscalls.c                 |  165 ++-
>>>   tools/testing/selftests/landlock/base_test.c |    2 +-
>>>   tools/testing/selftests/landlock/common.h    |   10 +
>>>   tools/testing/selftests/landlock/config      |    4 +
>>>   tools/testing/selftests/landlock/fs_test.c   |   75 +-
>>>   tools/testing/selftests/landlock/net_test.c  | 1157 ++++++++++++++++++
>>>   18 files changed, 2398 insertions(+), 354 deletions(-)
>>>   create mode 100644 security/landlock/net.c
>>>   create mode 100644 security/landlock/net.h
>>>   create mode 100644 tools/testing/selftests/landlock/net_test.c
>>>
>>> -- 
>>> 2.25.1
>>>
>> .
Mickaël Salaün June 26, 2023, 3:29 p.m. UTC | #5
Reviving Günther's suggestion to deny a set of network protocols:

On 14/03/2023 14:28, Mickaël Salaün wrote:
> 
> On 13/03/2023 18:16, Konstantin Meskhidze (A) wrote:
>>
>>
>> 2/24/2023 1:17 AM, Günther Noack пишет:

[...]

>>>
>>> * Given the list of obscure network protocols listed in the socket(2)
>>>      man page, I find it slightly weird to have rules for the use of TCP,
>>>      but to leave less prominent protocols unrestricted.
>>>
>>>      For example, a process with an enabled Landlock network ruleset may
>>>      connect only to certain TCP ports, but at the same time it can
>>>      happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?
>>
>>         We also have started a discussion about UDP protocol, but it's
>> more complicated since UDP sockets does not establish connections
>> between each other. There is a performance problem on the first place here.
>>
>> I'm not familiar with Bluetooth/CAN bus/DECnet/IPX but let's discuss it.
>> Any ideas here?
> 
> All these protocols should be handled one way or another someday. ;)
> 
> 
>>
>>>
>>>      I'm mentioning these more obscure protocols, because I doubt that
>>>      Landlock will grow more sophisticated support for them anytime soon,
>>>      so maybe the best option would be to just make it possible to
>>>      disable these?  Is that also part of the plan?
>>>
>>>      (I think there would be a lot of value in restricting network
>>>      access, even when it's done very broadly.  There are many programs
>>>      that don't need network at all, and among those that do need
>>>      network, most only require IP networking.
> 
> Indeed, protocols that nobody care to make Landlock supports them will
> probably not have fine-grained control. We could extend the ruleset
> attributes to disable the use (i.e. not only the creation of new related
> sockets/resources) of network protocol families, in a way that would
> make sandboxes simulate a kernel without such protocol support. In this
> case, this should be an allowed list of protocols, and everything not in
> that list should be denied. This approach could be used for other kernel
> features (unrelated to network).
> 
> 
>>>
>>>      Btw, the argument for more broad disabling of network access was
>>>      already made at https://cr.yp.to/unix/disablenetwork.html in the
>>>      past.)
> 
> This is interesting but scoped to a single use case. As specified at the
> beginning of this linked page, there must be exceptions, not only with
> AF_UNIX but also for (the newer) AF_VSOCK, and probably future ones.
> This is why I don't think a binary approach is a good one for Linux.
> Users should be able to specify what they need, and block the rest.

Here is a design to be able to only allow a set of network protocols and 
deny everything else. This would be complementary to Konstantin's patch 
series which addresses fine-grained access control.

First, I want to remind that Landlock follows an allowed list approach 
with a set of (growing) supported actions (for compatibility reasons), 
which is kind of an allow-list-on-a-deny-list. But with this proposal, 
we want to be able to deny everything, which means: supported, not 
supported, known and unknown protocols.

We could add a new "handled_access_socket" field to the landlock_ruleset 
struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.

If this field is set, users could add a new type of rules:
struct landlock_socket_attr {
     __u64 allowed_access;
     int domain; // see socket(2)
     int type; // see socket(2)
}

The allowed_access field would only contain 
LANDLOCK_ACCESS_SOCKET_CREATE at first, but it could grow with other 
actions (which cannot be handled with seccomp):
- use: walk through all opened FDs and mark them as allowed or denied
- receive: hook on received FDs
- send: hook on sent FDs

We might also use the same approach for non-socket objects that can be 
identified with some meaningful properties.

What do you think?
Jeff Xu June 28, 2023, 2:33 a.m. UTC | #6
On Mon, Jun 26, 2023 at 8:29 AM Mickaël Salaün <mic@digikod.net> wrote:
>
> Reviving Günther's suggestion to deny a set of network protocols:
>
> On 14/03/2023 14:28, Mickaël Salaün wrote:
> >
> > On 13/03/2023 18:16, Konstantin Meskhidze (A) wrote:
> >>
> >>
> >> 2/24/2023 1:17 AM, Günther Noack пишет:
>
> [...]
>
> >>>
> >>> * Given the list of obscure network protocols listed in the socket(2)
> >>>      man page, I find it slightly weird to have rules for the use of TCP,
> >>>      but to leave less prominent protocols unrestricted.
> >>>
> >>>      For example, a process with an enabled Landlock network ruleset may
> >>>      connect only to certain TCP ports, but at the same time it can
> >>>      happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?
> >>
> >>         We also have started a discussion about UDP protocol, but it's
> >> more complicated since UDP sockets does not establish connections
> >> between each other. There is a performance problem on the first place here.
> >>
> >> I'm not familiar with Bluetooth/CAN bus/DECnet/IPX but let's discuss it.
> >> Any ideas here?
> >
> > All these protocols should be handled one way or another someday. ;)
> >
> >
> >>
> >>>
> >>>      I'm mentioning these more obscure protocols, because I doubt that
> >>>      Landlock will grow more sophisticated support for them anytime soon,
> >>>      so maybe the best option would be to just make it possible to
> >>>      disable these?  Is that also part of the plan?
> >>>
> >>>      (I think there would be a lot of value in restricting network
> >>>      access, even when it's done very broadly.  There are many programs
> >>>      that don't need network at all, and among those that do need
> >>>      network, most only require IP networking.
> >
> > Indeed, protocols that nobody care to make Landlock supports them will
> > probably not have fine-grained control. We could extend the ruleset
> > attributes to disable the use (i.e. not only the creation of new related
> > sockets/resources) of network protocol families, in a way that would
> > make sandboxes simulate a kernel without such protocol support. In this
> > case, this should be an allowed list of protocols, and everything not in
> > that list should be denied. This approach could be used for other kernel
> > features (unrelated to network).
> >
> >
> >>>
> >>>      Btw, the argument for more broad disabling of network access was
> >>>      already made at https://cr.yp.to/unix/disablenetwork.html in the
> >>>      past.)
> >
> > This is interesting but scoped to a single use case. As specified at the
> > beginning of this linked page, there must be exceptions, not only with
> > AF_UNIX but also for (the newer) AF_VSOCK, and probably future ones.
> > This is why I don't think a binary approach is a good one for Linux.
> > Users should be able to specify what they need, and block the rest.
>
> Here is a design to be able to only allow a set of network protocols and
> deny everything else. This would be complementary to Konstantin's patch
> series which addresses fine-grained access control.
>
> First, I want to remind that Landlock follows an allowed list approach
> with a set of (growing) supported actions (for compatibility reasons),
> which is kind of an allow-list-on-a-deny-list. But with this proposal,
> we want to be able to deny everything, which means: supported, not
> supported, known and unknown protocols.
>
I think this makes sense.  ChomeOS can use it at the process level:
disable network, allow VSOCK only, allow TCP only, etc.

> We could add a new "handled_access_socket" field to the landlock_ruleset
> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
>
> If this field is set, users could add a new type of rules:
> struct landlock_socket_attr {
>      __u64 allowed_access;
>      int domain; // see socket(2)
>      int type; // see socket(2)
> }
>
Do you want to add "int protocol" ? which is the third parameter of socket(2)
According to protocols(5), the protocols are defined in:
https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml

It is part of IPv4/IPV6 header:
https://www.rfc-editor.org/rfc/rfc791.html#section-3.1
https://www.rfc-editor.org/rfc/rfc8200.html#section-3

> The allowed_access field would only contain
> LANDLOCK_ACCESS_SOCKET_CREATE at first, but it could grow with other
> actions (which cannot be handled with seccomp):
> - use: walk through all opened FDs and mark them as allowed or denied
> - receive: hook on received FDs
> - send: hook on sent FDs
>
also bind, connect, accept.

> We might also use the same approach for non-socket objects that can be
> identified with some meaningful properties.
>
> What do you think?

-Jeff
Günther Noack June 28, 2023, 8:44 a.m. UTC | #7
Hello!

On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
> Here is a design to be able to only allow a set of network protocols and
> deny everything else. This would be complementary to Konstantin's patch
> series which addresses fine-grained access control.
> 
> First, I want to remind that Landlock follows an allowed list approach with
> a set of (growing) supported actions (for compatibility reasons), which is
> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
> able to deny everything, which means: supported, not supported, known and
> unknown protocols.
> 
> We could add a new "handled_access_socket" field to the landlock_ruleset
> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
> 
> If this field is set, users could add a new type of rules:
> struct landlock_socket_attr {
>     __u64 allowed_access;
>     int domain; // see socket(2)
>     int type; // see socket(2)
> }
> 
> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
> first, but it could grow with other actions (which cannot be handled with
> seccomp):
> - use: walk through all opened FDs and mark them as allowed or denied
> - receive: hook on received FDs
> - send: hook on sent FDs
> 
> We might also use the same approach for non-socket objects that can be
> identified with some meaningful properties.
> 
> What do you think?

This sounds like a good plan to me - it would make it possible to restrict new
socket creation using protocols that were not intended to be used, and I also
think it would fit the Landlock model nicely.

Small remark on the side: The security_socket_create() hook does not only get
invoked as a result of socket(2), but also as a part of accept(2) - so this
approach might already prevent new connections very effectively.

Spelling out some scenarios, so that we are sure that we are on the same page:

A)

A program that does not need networking could specify a ruleset where
LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.

B)

A program that runs a TCP server could specify a ruleset where
LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:

  /* From Konstantin's patch set */
  struct landlock_net_service_attr bind_attr = {
    .allowed_access = LANDLOCK_NET_BIND_TCP,
    .port = 8080,
  };

  /* From Mickaël's proposal */
  struct landlock_socket_attr sock_inet_attr = {
    .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
    .domain = AF_INET,
    .type = SOCK_STREAM,
  }

  struct landlock_socket_attr sock_inet6_attr = {
    .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
    .domain = AF_INET6,
     .type = SOCK_STREAM,
  }

That should then be enough to bind and listen on ports, whereas outgoing
connections with TCP and anything using other network protocols would not be
permitted.

(Alternatively, it could bind() the socket early, *then enable Landlock* and
leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
IPv6, so that listen() and accept() work on the already-bound socket.)

Overall, this sounds like an excellent approach to me. 
Jeff Xu June 28, 2023, 5:03 p.m. UTC | #8
Hello,

Thanks for writing up the example for an incoming TCP connection ! It
helps with the context.

Since I'm late to this thread, one thing I want to ask:  all the APIs
proposed so far are at the process level, we don't have any API that
applies restriction to socket fd itself, right ? this is what I
thought, but I would like to get confirmation.

On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
>
> Hello!
>
> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
> > Here is a design to be able to only allow a set of network protocols and
> > deny everything else. This would be complementary to Konstantin's patch
> > series which addresses fine-grained access control.
> >
> > First, I want to remind that Landlock follows an allowed list approach with
> > a set of (growing) supported actions (for compatibility reasons), which is
> > kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
> > able to deny everything, which means: supported, not supported, known and
> > unknown protocols.
> >
> > We could add a new "handled_access_socket" field to the landlock_ruleset
> > struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
> >
> > If this field is set, users could add a new type of rules:
> > struct landlock_socket_attr {
> >     __u64 allowed_access;
> >     int domain; // see socket(2)
> >     int type; // see socket(2)
> > }
> >
> > The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
> > first, but it could grow with other actions (which cannot be handled with
> > seccomp):
> > - use: walk through all opened FDs and mark them as allowed or denied
> > - receive: hook on received FDs
> > - send: hook on sent FDs
> >
> > We might also use the same approach for non-socket objects that can be
> > identified with some meaningful properties.
> >
> > What do you think?
>
> This sounds like a good plan to me - it would make it possible to restrict new
> socket creation using protocols that were not intended to be used, and I also
> think it would fit the Landlock model nicely.
>
> Small remark on the side: The security_socket_create() hook does not only get
> invoked as a result of socket(2), but also as a part of accept(2) - so this
> approach might already prevent new connections very effectively.
>
That is an interesting aspect that might be worth discussing more.
seccomp is per syscall, landlock doesn't necessarily follow the same,
another design is to add more logic in Landlock, e.g.
LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
calls (socket/bind/listen/accept/connect). App dev might feel it is
easier to use.

> Spelling out some scenarios, so that we are sure that we are on the same page:
>
> A)
>
> A program that does not need networking could specify a ruleset where
> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
>
> B)
>
> A program that runs a TCP server could specify a ruleset where
> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
>
>   /* From Konstantin's patch set */
>   struct landlock_net_service_attr bind_attr = {
>     .allowed_access = LANDLOCK_NET_BIND_TCP,
>     .port = 8080,
>   };
>
>   /* From Mickaël's proposal */
>   struct landlock_socket_attr sock_inet_attr = {
>     .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>     .domain = AF_INET,
>     .type = SOCK_STREAM,
>   }
>
>   struct landlock_socket_attr sock_inet6_attr = {
>     .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>     .domain = AF_INET6,
>      .type = SOCK_STREAM,
>   }
>
> That should then be enough to bind and listen on ports, whereas outgoing
> connections with TCP and anything using other network protocols would not be
> permitted.
>
TCP server is an interesting case. From a security perspective, a
process cares if it is acting as a server or client in TCP, a server
might only want to accept an incoming TCP connection, never initiate
an outgoing TCP connection, and a client is the opposite.

Processes can restrict outgoing/incoming TCP connection by seccomp for
accept(2) or connect(2),  though I feel Landlock can do this more
naturally for app dev, and at per-protocol level.  seccomp doesn't
provide per-protocol granularity.

For bind(2), iirc, it can be used for a server to assign dst port of
incoming TCP connection, also by a client to assign a src port of an
outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
cases, right ? this might not be a problem, just something to keep
note.

> (Alternatively, it could bind() the socket early, *then enable Landlock* and
> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
> IPv6, so that listen() and accept() work on the already-bound socket.)
>
For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
so dev is fully aware it is not just applied to socket create.

> Overall, this sounds like an excellent approach to me. 
Mickaël Salaün June 28, 2023, 7:03 p.m. UTC | #9
On 28/06/2023 04:33, Jeff Xu wrote:
> On Mon, Jun 26, 2023 at 8:29 AM Mickaël Salaün <mic@digikod.net> wrote:
>>
>> Reviving Günther's suggestion to deny a set of network protocols:
>>
>> On 14/03/2023 14:28, Mickaël Salaün wrote:
>>>
>>> On 13/03/2023 18:16, Konstantin Meskhidze (A) wrote:
>>>>
>>>>
>>>> 2/24/2023 1:17 AM, Günther Noack пишет:
>>
>> [...]
>>
>>>>>
>>>>> * Given the list of obscure network protocols listed in the socket(2)
>>>>>       man page, I find it slightly weird to have rules for the use of TCP,
>>>>>       but to leave less prominent protocols unrestricted.
>>>>>
>>>>>       For example, a process with an enabled Landlock network ruleset may
>>>>>       connect only to certain TCP ports, but at the same time it can
>>>>>       happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?
>>>>
>>>>          We also have started a discussion about UDP protocol, but it's
>>>> more complicated since UDP sockets does not establish connections
>>>> between each other. There is a performance problem on the first place here.
>>>>
>>>> I'm not familiar with Bluetooth/CAN bus/DECnet/IPX but let's discuss it.
>>>> Any ideas here?
>>>
>>> All these protocols should be handled one way or another someday. ;)
>>>
>>>
>>>>
>>>>>
>>>>>       I'm mentioning these more obscure protocols, because I doubt that
>>>>>       Landlock will grow more sophisticated support for them anytime soon,
>>>>>       so maybe the best option would be to just make it possible to
>>>>>       disable these?  Is that also part of the plan?
>>>>>
>>>>>       (I think there would be a lot of value in restricting network
>>>>>       access, even when it's done very broadly.  There are many programs
>>>>>       that don't need network at all, and among those that do need
>>>>>       network, most only require IP networking.
>>>
>>> Indeed, protocols that nobody care to make Landlock supports them will
>>> probably not have fine-grained control. We could extend the ruleset
>>> attributes to disable the use (i.e. not only the creation of new related
>>> sockets/resources) of network protocol families, in a way that would
>>> make sandboxes simulate a kernel without such protocol support. In this
>>> case, this should be an allowed list of protocols, and everything not in
>>> that list should be denied. This approach could be used for other kernel
>>> features (unrelated to network).
>>>
>>>
>>>>>
>>>>>       Btw, the argument for more broad disabling of network access was
>>>>>       already made at https://cr.yp.to/unix/disablenetwork.html in the
>>>>>       past.)
>>>
>>> This is interesting but scoped to a single use case. As specified at the
>>> beginning of this linked page, there must be exceptions, not only with
>>> AF_UNIX but also for (the newer) AF_VSOCK, and probably future ones.
>>> This is why I don't think a binary approach is a good one for Linux.
>>> Users should be able to specify what they need, and block the rest.
>>
>> Here is a design to be able to only allow a set of network protocols and
>> deny everything else. This would be complementary to Konstantin's patch
>> series which addresses fine-grained access control.
>>
>> First, I want to remind that Landlock follows an allowed list approach
>> with a set of (growing) supported actions (for compatibility reasons),
>> which is kind of an allow-list-on-a-deny-list. But with this proposal,
>> we want to be able to deny everything, which means: supported, not
>> supported, known and unknown protocols.
>>
> I think this makes sense.  ChomeOS can use it at the process level:
> disable network, allow VSOCK only, allow TCP only, etc.
> 
>> We could add a new "handled_access_socket" field to the landlock_ruleset
>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
>>
>> If this field is set, users could add a new type of rules:
>> struct landlock_socket_attr {
>>       __u64 allowed_access;
>>       int domain; // see socket(2)

I guess "family" would also make sense. It's the name used in the 
kernel, the "AF" prefixes, and address_families(7). I'm not sure why 
"domain" was chosen for socket(2).


>>       int type; // see socket(2)
>> }
>>
> Do you want to add "int protocol" ? which is the third parameter of socket(2)
> According to protocols(5), the protocols are defined in:
> https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
> 
> It is part of IPv4/IPV6 header:
> https://www.rfc-editor.org/rfc/rfc791.html#section-3.1
> https://www.rfc-editor.org/rfc/rfc8200.html#section-3

I understand the rationale but I'm not sure if this would be useful. Do 
you have use cases?


> 
>> The allowed_access field would only contain
>> LANDLOCK_ACCESS_SOCKET_CREATE at first, but it could grow with other
>> actions (which cannot be handled with seccomp):
>> - use: walk through all opened FDs and mark them as allowed or denied
>> - receive: hook on received FDs
>> - send: hook on sent FDs
>>
> also bind, connect, accept.

I don't think "accept" would be useful, and I'm not sure if "bind" and 
"connect" would not be redundant with LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP
Bind and connect for a datagram socket is optional, so this might lead 
to a false sense of security. If we want to support protocols other than 
TCP to restrict bind/connect, then they deserve to be controlled 
according to a port (or similar).

> 
>> We might also use the same approach for non-socket objects that can be
>> identified with some meaningful properties.
>>
>> What do you think?
> 
> -Jeff
Mickaël Salaün June 28, 2023, 7:07 p.m. UTC | #10
On 28/06/2023 10:44, Günther Noack wrote:
> Hello!
> 
> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
>> Here is a design to be able to only allow a set of network protocols and
>> deny everything else. This would be complementary to Konstantin's patch
>> series which addresses fine-grained access control.
>>
>> First, I want to remind that Landlock follows an allowed list approach with
>> a set of (growing) supported actions (for compatibility reasons), which is
>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
>> able to deny everything, which means: supported, not supported, known and
>> unknown protocols.
>>
>> We could add a new "handled_access_socket" field to the landlock_ruleset
>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
>>
>> If this field is set, users could add a new type of rules:
>> struct landlock_socket_attr {
>>      __u64 allowed_access;
>>      int domain; // see socket(2)
>>      int type; // see socket(2)
>> }
>>
>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
>> first, but it could grow with other actions (which cannot be handled with
>> seccomp):
>> - use: walk through all opened FDs and mark them as allowed or denied
>> - receive: hook on received FDs
>> - send: hook on sent FDs
>>
>> We might also use the same approach for non-socket objects that can be
>> identified with some meaningful properties.
>>
>> What do you think?
> 
> This sounds like a good plan to me - it would make it possible to restrict new
> socket creation using protocols that were not intended to be used, and I also
> think it would fit the Landlock model nicely.
> 
> Small remark on the side: The security_socket_create() hook does not only get
> invoked as a result of socket(2), but also as a part of accept(2) - so this
> approach might already prevent new connections very effectively.

Indeed. We could also differentiate socket(2) from accept(2) with a 
dedicated LANDLOCK_ACCESS_SOCKET_ACCEPT right. This would enable to 
create a bind socket, sandbox the process and deny new socket(2) calls, 
but still allows to call accept(2) and receive new connections.

BTW, unix socket path opening should be considered too.

> 
> Spelling out some scenarios, so that we are sure that we are on the same page:
> 
> A)
> 
> A program that does not need networking could specify a ruleset where
> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.

This is correct, except if the process receive a socket FD or open a 
unix socket path.


> 
> B)
> 
> A program that runs a TCP server could specify a ruleset where
> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and

s/LANDLOCK_NET_CONNECT_TCP/LANDLOCK_ACCESS_NET_CONNECT_TCP/

> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
> 
>    /* From Konstantin's patch set */
>    struct landlock_net_service_attr bind_attr = {
>      .allowed_access = LANDLOCK_NET_BIND_TCP,
>      .port = 8080,
>    };
> 
>    /* From Mickaël's proposal */
>    struct landlock_socket_attr sock_inet_attr = {
>      .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>      .domain = AF_INET,
>      .type = SOCK_STREAM,
>    }
> 
>    struct landlock_socket_attr sock_inet6_attr = {
>      .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>      .domain = AF_INET6,
>       .type = SOCK_STREAM,
>    }
> 
> That should then be enough to bind and listen on ports, whereas outgoing
> connections with TCP and anything using other network protocols would not be
> permitted.
> 
> (Alternatively, it could bind() the socket early, *then enable Landlock* and
> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
> IPv6, so that listen() and accept() work on the already-bound socket.)

correct

> 
> Overall, this sounds like an excellent approach to me. 
Mickaël Salaün June 28, 2023, 7:29 p.m. UTC | #11
On 28/06/2023 19:03, Jeff Xu wrote:
> Hello,
> 
> Thanks for writing up the example for an incoming TCP connection ! It
> helps with the context.
> 
> Since I'm late to this thread, one thing I want to ask:  all the APIs
> proposed so far are at the process level, we don't have any API that
> applies restriction to socket fd itself, right ? this is what I
> thought, but I would like to get confirmation.

Restriction are applied to actions, not to already existing/opened FDs. 
We could add a way to restrict opened FDs, but I don't think this is the 
right approach because sandboxing is a deliberate action from a process, 
and it should already take care of its FDs.


> 
> On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
>>
>> Hello!
>>
>> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
>>> Here is a design to be able to only allow a set of network protocols and
>>> deny everything else. This would be complementary to Konstantin's patch
>>> series which addresses fine-grained access control.
>>>
>>> First, I want to remind that Landlock follows an allowed list approach with
>>> a set of (growing) supported actions (for compatibility reasons), which is
>>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
>>> able to deny everything, which means: supported, not supported, known and
>>> unknown protocols.
>>>
>>> We could add a new "handled_access_socket" field to the landlock_ruleset
>>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
>>>
>>> If this field is set, users could add a new type of rules:
>>> struct landlock_socket_attr {
>>>      __u64 allowed_access;
>>>      int domain; // see socket(2)
>>>      int type; // see socket(2)
>>> }
>>>
>>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
>>> first, but it could grow with other actions (which cannot be handled with
>>> seccomp):
>>> - use: walk through all opened FDs and mark them as allowed or denied
>>> - receive: hook on received FDs
>>> - send: hook on sent FDs
>>>
>>> We might also use the same approach for non-socket objects that can be
>>> identified with some meaningful properties.
>>>
>>> What do you think?
>>
>> This sounds like a good plan to me - it would make it possible to restrict new
>> socket creation using protocols that were not intended to be used, and I also
>> think it would fit the Landlock model nicely.
>>
>> Small remark on the side: The security_socket_create() hook does not only get
>> invoked as a result of socket(2), but also as a part of accept(2) - so this
>> approach might already prevent new connections very effectively.
>>
> That is an interesting aspect that might be worth discussing more.
> seccomp is per syscall, landlock doesn't necessarily follow the same,
> another design is to add more logic in Landlock, e.g.
> LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
> calls (socket/bind/listen/accept/connect). App dev might feel it is
> easier to use.

seccomp restricts the use of the syscall interface, whereas Landlock 
restricts the use of kernel objects (i.e. the semantic).

We need to find a good tradeoff between a lot of access rights and a few 
grouping different actions. This should make sense from a developer 
point of view according to its knowledge and use of the kernel 
interfaces (potential wrapped with high level libraries), but also to 
the semantic of the sandbox and the security guarantees we want to provide.

We should also keep in mind that high level Landlock libraries can take 
care of potential coarse-grained use of restrictions.


> 
>> Spelling out some scenarios, so that we are sure that we are on the same page:
>>
>> A)
>>
>> A program that does not need networking could specify a ruleset where
>> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
>>
>> B)
>>
>> A program that runs a TCP server could specify a ruleset where
>> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
>> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
>>
>>    /* From Konstantin's patch set */
>>    struct landlock_net_service_attr bind_attr = {
>>      .allowed_access = LANDLOCK_NET_BIND_TCP,
>>      .port = 8080,
>>    };
>>
>>    /* From Mickaël's proposal */
>>    struct landlock_socket_attr sock_inet_attr = {
>>      .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>      .domain = AF_INET,
>>      .type = SOCK_STREAM,
>>    }
>>
>>    struct landlock_socket_attr sock_inet6_attr = {
>>      .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>      .domain = AF_INET6,
>>       .type = SOCK_STREAM,
>>    }
>>
>> That should then be enough to bind and listen on ports, whereas outgoing
>> connections with TCP and anything using other network protocols would not be
>> permitted.
>>
> TCP server is an interesting case. From a security perspective, a
> process cares if it is acting as a server or client in TCP, a server
> might only want to accept an incoming TCP connection, never initiate
> an outgoing TCP connection, and a client is the opposite.
> 
> Processes can restrict outgoing/incoming TCP connection by seccomp for
> accept(2) or connect(2),  though I feel Landlock can do this more
> naturally for app dev, and at per-protocol level.  seccomp doesn't
> provide per-protocol granularity.

Right, seccomp cannot filter TCP ports.

> 
> For bind(2), iirc, it can be used for a server to assign dst port of
> incoming TCP connection, also by a client to assign a src port of an
> outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
> cases, right ? this might not be a problem, just something to keep
> note.

Good point. I think it is in line with the rule definition: to allow to 
bind on a specific port. However, if clients want to set the source port 
to a (legitimate) value, then that would be an issue because we cannot 
allow a whole range of ports (e.g., >= 1024). I'm not sure if this 
practice would be deemed "legitimate" though. Do you know client 
applications using bind?

Konstantin, we should have a test for this case anyway.


> 
>> (Alternatively, it could bind() the socket early, *then enable Landlock* and
>> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
>> IPv6, so that listen() and accept() work on the already-bound socket.)
>>
> For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
> so dev is fully aware it is not just applied to socket create.

I don't get the semantic of LANDLOCK_ACCESS_SOCKET_PROTOCOL. What does 
PROTOCOL mean?

> 
>> Overall, this sounds like an excellent approach to me. 
Jeff Xu June 28, 2023, 9:56 p.m. UTC | #12
On Wed, Jun 28, 2023 at 12:03 PM Mickaël Salaün <mic@digikod.net> wrote:
>
>
> On 28/06/2023 04:33, Jeff Xu wrote:
> > On Mon, Jun 26, 2023 at 8:29 AM Mickaël Salaün <mic@digikod.net> wrote:
> >>
> >> Reviving Günther's suggestion to deny a set of network protocols:
> >>
> >> On 14/03/2023 14:28, Mickaël Salaün wrote:
> >>>
> >>> On 13/03/2023 18:16, Konstantin Meskhidze (A) wrote:
> >>>>
> >>>>
> >>>> 2/24/2023 1:17 AM, Günther Noack пишет:
> >>
> >> [...]
> >>
> >>>>>
> >>>>> * Given the list of obscure network protocols listed in the socket(2)
> >>>>>       man page, I find it slightly weird to have rules for the use of TCP,
> >>>>>       but to leave less prominent protocols unrestricted.
> >>>>>
> >>>>>       For example, a process with an enabled Landlock network ruleset may
> >>>>>       connect only to certain TCP ports, but at the same time it can
> >>>>>       happily use Bluetooth/CAN bus/DECnet/IPX or other protocols?
> >>>>
> >>>>          We also have started a discussion about UDP protocol, but it's
> >>>> more complicated since UDP sockets does not establish connections
> >>>> between each other. There is a performance problem on the first place here.
> >>>>
> >>>> I'm not familiar with Bluetooth/CAN bus/DECnet/IPX but let's discuss it.
> >>>> Any ideas here?
> >>>
> >>> All these protocols should be handled one way or another someday. ;)
> >>>
> >>>
> >>>>
> >>>>>
> >>>>>       I'm mentioning these more obscure protocols, because I doubt that
> >>>>>       Landlock will grow more sophisticated support for them anytime soon,
> >>>>>       so maybe the best option would be to just make it possible to
> >>>>>       disable these?  Is that also part of the plan?
> >>>>>
> >>>>>       (I think there would be a lot of value in restricting network
> >>>>>       access, even when it's done very broadly.  There are many programs
> >>>>>       that don't need network at all, and among those that do need
> >>>>>       network, most only require IP networking.
> >>>
> >>> Indeed, protocols that nobody care to make Landlock supports them will
> >>> probably not have fine-grained control. We could extend the ruleset
> >>> attributes to disable the use (i.e. not only the creation of new related
> >>> sockets/resources) of network protocol families, in a way that would
> >>> make sandboxes simulate a kernel without such protocol support. In this
> >>> case, this should be an allowed list of protocols, and everything not in
> >>> that list should be denied. This approach could be used for other kernel
> >>> features (unrelated to network).
> >>>
> >>>
> >>>>>
> >>>>>       Btw, the argument for more broad disabling of network access was
> >>>>>       already made at https://cr.yp.to/unix/disablenetwork.html in the
> >>>>>       past.)
> >>>
> >>> This is interesting but scoped to a single use case. As specified at the
> >>> beginning of this linked page, there must be exceptions, not only with
> >>> AF_UNIX but also for (the newer) AF_VSOCK, and probably future ones.
> >>> This is why I don't think a binary approach is a good one for Linux.
> >>> Users should be able to specify what they need, and block the rest.
> >>
> >> Here is a design to be able to only allow a set of network protocols and
> >> deny everything else. This would be complementary to Konstantin's patch
> >> series which addresses fine-grained access control.
> >>
> >> First, I want to remind that Landlock follows an allowed list approach
> >> with a set of (growing) supported actions (for compatibility reasons),
> >> which is kind of an allow-list-on-a-deny-list. But with this proposal,
> >> we want to be able to deny everything, which means: supported, not
> >> supported, known and unknown protocols.
> >>
> > I think this makes sense.  ChomeOS can use it at the process level:
> > disable network, allow VSOCK only, allow TCP only, etc.
> >
> >> We could add a new "handled_access_socket" field to the landlock_ruleset
> >> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
> >>
> >> If this field is set, users could add a new type of rules:
> >> struct landlock_socket_attr {
> >>       __u64 allowed_access;
> >>       int domain; // see socket(2)
>
> I guess "family" would also make sense. It's the name used in the
> kernel, the "AF" prefixes, and address_families(7). I'm not sure why
> "domain" was chosen for socket(2).
>
Agree also.

>
> >>       int type; // see socket(2)
> >> }
> >>
> > Do you want to add "int protocol" ? which is the third parameter of socket(2)
> > According to protocols(5), the protocols are defined in:
> > https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
> >
> > It is part of IPv4/IPV6 header:
> > https://www.rfc-editor.org/rfc/rfc791.html#section-3.1
> > https://www.rfc-editor.org/rfc/rfc8200.html#section-3
>
> I understand the rationale but I'm not sure if this would be useful. Do
> you have use cases?
>
I agree this field is not commonly used, so might not be that useful.
In most cases, the protocol field will just be 0.

One case I thought of previously is building an icmp or DHCP packet
with raw socket,  but now I'm not sure what kind of support/enforce
the kernel has for the protocol field with raw socket.

We can drop this for now, if there is a clearer requirement in future,
it is easy to add a new rule.

>
> >
> >> The allowed_access field would only contain
> >> LANDLOCK_ACCESS_SOCKET_CREATE at first, but it could grow with other
> >> actions (which cannot be handled with seccomp):
> >> - use: walk through all opened FDs and mark them as allowed or denied
> >> - receive: hook on received FDs
> >> - send: hook on sent FDs
> >>
> > also bind, connect, accept.
>
> I don't think "accept" would be useful, and I'm not sure if "bind" and
> "connect" would not be redundant with LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP
> Bind and connect for a datagram socket is optional, so this might lead
> to a false sense of security. If we want to support protocols other than
> TCP to restrict bind/connect, then they deserve to be controlled
> according to a port (or similar).
>
> >
> >> We might also use the same approach for non-socket objects that can be
> >> identified with some meaningful properties.
> >>
> >> What do you think?
> >
> > -Jeff
Jeff Xu June 29, 2023, 3:18 a.m. UTC | #13
resend.

On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
>
>
> On 28/06/2023 19:03, Jeff Xu wrote:
> > Hello,
> >
> > Thanks for writing up the example for an incoming TCP connection ! It
> > helps with the context.
> >
> > Since I'm late to this thread, one thing I want to ask:  all the APIs
> > proposed so far are at the process level, we don't have any API that
> > applies restriction to socket fd itself, right ? this is what I
> > thought, but I would like to get confirmation.
>
> Restriction are applied to actions, not to already existing/opened FDs.
> We could add a way to restrict opened FDs, but I don't think this is the
> right approach because sandboxing is a deliberate action from a process,
> and it should already take care of its FDs.
>
>
> >
> > On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
> >>
> >> Hello!
> >>
> >> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
> >>> Here is a design to be able to only allow a set of network protocols and
> >>> deny everything else. This would be complementary to Konstantin's patch
> >>> series which addresses fine-grained access control.
> >>>
> >>> First, I want to remind that Landlock follows an allowed list approach with
> >>> a set of (growing) supported actions (for compatibility reasons), which is
> >>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
> >>> able to deny everything, which means: supported, not supported, known and
> >>> unknown protocols.
> >>>
> >>> We could add a new "handled_access_socket" field to the landlock_ruleset
> >>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
> >>>
> >>> If this field is set, users could add a new type of rules:
> >>> struct landlock_socket_attr {
> >>>      __u64 allowed_access;
> >>>      int domain; // see socket(2)
> >>>      int type; // see socket(2)
> >>> }
> >>>
> >>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
> >>> first, but it could grow with other actions (which cannot be handled with
> >>> seccomp):
> >>> - use: walk through all opened FDs and mark them as allowed or denied
> >>> - receive: hook on received FDs
> >>> - send: hook on sent FDs
> >>>
> >>> We might also use the same approach for non-socket objects that can be
> >>> identified with some meaningful properties.
> >>>
> >>> What do you think?
> >>
> >> This sounds like a good plan to me - it would make it possible to restrict new
> >> socket creation using protocols that were not intended to be used, and I also
> >> think it would fit the Landlock model nicely.
> >>
> >> Small remark on the side: The security_socket_create() hook does not only get
> >> invoked as a result of socket(2), but also as a part of accept(2) - so this
> >> approach might already prevent new connections very effectively.
> >>
> > That is an interesting aspect that might be worth discussing more.
> > seccomp is per syscall, landlock doesn't necessarily follow the same,
> > another design is to add more logic in Landlock, e.g.
> > LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
> > calls (socket/bind/listen/accept/connect). App dev might feel it is
> > easier to use.
>
> seccomp restricts the use of the syscall interface, whereas Landlock
> restricts the use of kernel objects (i.e. the semantic).
>
> We need to find a good tradeoff between a lot of access rights and a few
> grouping different actions. This should make sense from a developer
> point of view according to its knowledge and use of the kernel
> interfaces (potential wrapped with high level libraries), but also to
> the semantic of the sandbox and the security guarantees we want to provide.
>
> We should also keep in mind that high level Landlock libraries can take
> care of potential coarse-grained use of restrictions.
>
>
> >
> >> Spelling out some scenarios, so that we are sure that we are on the same page:
> >>
> >> A)
> >>
> >> A program that does not need networking could specify a ruleset where
> >> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
> >>
> >> B)
> >>
> >> A program that runs a TCP server could specify a ruleset where
> >> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
> >> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
> >>
> >>    /* From Konstantin's patch set */
> >>    struct landlock_net_service_attr bind_attr = {
> >>      .allowed_access = LANDLOCK_NET_BIND_TCP,
> >>      .port = 8080,
> >>    };
> >>
> >>    /* From Mickaël's proposal */
> >>    struct landlock_socket_attr sock_inet_attr = {
> >>      .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
> >>      .domain = AF_INET,
> >>      .type = SOCK_STREAM,
> >>    }
> >>
> >>    struct landlock_socket_attr sock_inet6_attr = {
> >>      .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
> >>      .domain = AF_INET6,
> >>       .type = SOCK_STREAM,
> >>    }
> >>
> >> That should then be enough to bind and listen on ports, whereas outgoing
> >> connections with TCP and anything using other network protocols would not be
> >> permitted.
> >>
> > TCP server is an interesting case. From a security perspective, a
> > process cares if it is acting as a server or client in TCP, a server
> > might only want to accept an incoming TCP connection, never initiate
> > an outgoing TCP connection, and a client is the opposite.
> >
> > Processes can restrict outgoing/incoming TCP connection by seccomp for
> > accept(2) or connect(2),  though I feel Landlock can do this more
> > naturally for app dev, and at per-protocol level.  seccomp doesn't
> > provide per-protocol granularity.
>
> Right, seccomp cannot filter TCP ports.
>
> >
> > For bind(2), iirc, it can be used for a server to assign dst port of
> > incoming TCP connection, also by a client to assign a src port of an
> > outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
> > cases, right ? this might not be a problem, just something to keep
> > note.
>
> Good point. I think it is in line with the rule definition: to allow to
> bind on a specific port. However, if clients want to set the source port
> to a (legitimate) value, then that would be an issue because we cannot
> allow a whole range of ports (e.g., >= 1024). I'm not sure if this
> practice would be deemed "legitimate" though. Do you know client
> applications using bind?
>
> Konstantin, we should have a test for this case anyway.
>
>
> >
> >> (Alternatively, it could bind() the socket early, *then enable Landlock* and
> >> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
> >> IPv6, so that listen() and accept() work on the already-bound socket.)
> >>
> > For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
> > so dev is fully aware it is not just applied to socket create.
>
> I don't get the semantic of LANDLOCK_ACCESS_SOCKET_PROTOCOL. What does
> PROTOCOL mean?
>
I meant checking family + type of socket, and apply to all of
socket(2),bind(2),accept(2),connect(2),listen(2), maybe
send(2)/recv(2) too.

s/LANDLOCK_ACCESS_SOCKET_CREATE/LANDLOCK_ACCESS_SOCKET_TYPE.

This implies the kernel will check on socket fd's property (family +
type) at those calls, this applies to
a - the socket fd is created within the process, after landlock is applied.
b - created in process prior to landlock is applied.
c - created out of process then passed into this process,

> >
> >> Overall, this sounds like an excellent approach to me. 
Mickaël Salaün June 29, 2023, 11:07 a.m. UTC | #14
On 29/06/2023 05:18, Jeff Xu wrote:
> resend.
> 
> On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
>>
>>
>> On 28/06/2023 19:03, Jeff Xu wrote:
>>> Hello,
>>>
>>> Thanks for writing up the example for an incoming TCP connection ! It
>>> helps with the context.
>>>
>>> Since I'm late to this thread, one thing I want to ask:  all the APIs
>>> proposed so far are at the process level, we don't have any API that
>>> applies restriction to socket fd itself, right ? this is what I
>>> thought, but I would like to get confirmation.
>>
>> Restriction are applied to actions, not to already existing/opened FDs.
>> We could add a way to restrict opened FDs, but I don't think this is the
>> right approach because sandboxing is a deliberate action from a process,
>> and it should already take care of its FDs.
>>
>>
>>>
>>> On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
>>>>
>>>> Hello!
>>>>
>>>> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
>>>>> Here is a design to be able to only allow a set of network protocols and
>>>>> deny everything else. This would be complementary to Konstantin's patch
>>>>> series which addresses fine-grained access control.
>>>>>
>>>>> First, I want to remind that Landlock follows an allowed list approach with
>>>>> a set of (growing) supported actions (for compatibility reasons), which is
>>>>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
>>>>> able to deny everything, which means: supported, not supported, known and
>>>>> unknown protocols.
>>>>>
>>>>> We could add a new "handled_access_socket" field to the landlock_ruleset
>>>>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
>>>>>
>>>>> If this field is set, users could add a new type of rules:
>>>>> struct landlock_socket_attr {
>>>>>       __u64 allowed_access;
>>>>>       int domain; // see socket(2)
>>>>>       int type; // see socket(2)
>>>>> }
>>>>>
>>>>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
>>>>> first, but it could grow with other actions (which cannot be handled with
>>>>> seccomp):
>>>>> - use: walk through all opened FDs and mark them as allowed or denied
>>>>> - receive: hook on received FDs
>>>>> - send: hook on sent FDs
>>>>>
>>>>> We might also use the same approach for non-socket objects that can be
>>>>> identified with some meaningful properties.
>>>>>
>>>>> What do you think?
>>>>
>>>> This sounds like a good plan to me - it would make it possible to restrict new
>>>> socket creation using protocols that were not intended to be used, and I also
>>>> think it would fit the Landlock model nicely.
>>>>
>>>> Small remark on the side: The security_socket_create() hook does not only get
>>>> invoked as a result of socket(2), but also as a part of accept(2) - so this
>>>> approach might already prevent new connections very effectively.
>>>>
>>> That is an interesting aspect that might be worth discussing more.
>>> seccomp is per syscall, landlock doesn't necessarily follow the same,
>>> another design is to add more logic in Landlock, e.g.
>>> LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
>>> calls (socket/bind/listen/accept/connect). App dev might feel it is
>>> easier to use.
>>
>> seccomp restricts the use of the syscall interface, whereas Landlock
>> restricts the use of kernel objects (i.e. the semantic).
>>
>> We need to find a good tradeoff between a lot of access rights and a few
>> grouping different actions. This should make sense from a developer
>> point of view according to its knowledge and use of the kernel
>> interfaces (potential wrapped with high level libraries), but also to
>> the semantic of the sandbox and the security guarantees we want to provide.
>>
>> We should also keep in mind that high level Landlock libraries can take
>> care of potential coarse-grained use of restrictions.
>>
>>
>>>
>>>> Spelling out some scenarios, so that we are sure that we are on the same page:
>>>>
>>>> A)
>>>>
>>>> A program that does not need networking could specify a ruleset where
>>>> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
>>>>
>>>> B)
>>>>
>>>> A program that runs a TCP server could specify a ruleset where
>>>> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
>>>> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
>>>>
>>>>     /* From Konstantin's patch set */
>>>>     struct landlock_net_service_attr bind_attr = {
>>>>       .allowed_access = LANDLOCK_NET_BIND_TCP,
>>>>       .port = 8080,
>>>>     };
>>>>
>>>>     /* From Mickaël's proposal */
>>>>     struct landlock_socket_attr sock_inet_attr = {
>>>>       .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>>>       .domain = AF_INET,
>>>>       .type = SOCK_STREAM,
>>>>     }
>>>>
>>>>     struct landlock_socket_attr sock_inet6_attr = {
>>>>       .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>>>       .domain = AF_INET6,
>>>>        .type = SOCK_STREAM,
>>>>     }
>>>>
>>>> That should then be enough to bind and listen on ports, whereas outgoing
>>>> connections with TCP and anything using other network protocols would not be
>>>> permitted.
>>>>
>>> TCP server is an interesting case. From a security perspective, a
>>> process cares if it is acting as a server or client in TCP, a server
>>> might only want to accept an incoming TCP connection, never initiate
>>> an outgoing TCP connection, and a client is the opposite.
>>>
>>> Processes can restrict outgoing/incoming TCP connection by seccomp for
>>> accept(2) or connect(2),  though I feel Landlock can do this more
>>> naturally for app dev, and at per-protocol level.  seccomp doesn't
>>> provide per-protocol granularity.
>>
>> Right, seccomp cannot filter TCP ports.
>>
>>>
>>> For bind(2), iirc, it can be used for a server to assign dst port of
>>> incoming TCP connection, also by a client to assign a src port of an
>>> outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
>>> cases, right ? this might not be a problem, just something to keep
>>> note.
>>
>> Good point. I think it is in line with the rule definition: to allow to
>> bind on a specific port. However, if clients want to set the source port
>> to a (legitimate) value, then that would be an issue because we cannot
>> allow a whole range of ports (e.g., >= 1024). I'm not sure if this
>> practice would be deemed "legitimate" though. Do you know client
>> applications using bind?
>>
>> Konstantin, we should have a test for this case anyway.

Thinking more about TCP clients binding sockets, a 
LANDLOCK_ACCESS_NET_LISTEN_TCP would be more useful than 
LANDLOCK_ACCESS_NET_BIND_TCP, but being able to limit the scope of 
"bindable" ports is also valuable to forbid a malicious sandboxed 
process to impersonate a legitimate server process. This also means that 
it might be interesting to be able to handle port ranges.

We already have a LANDLOCK_ACCESS_NET_BIND_TCP implementation and 
related tests, so I think we should proceed with that. The next 
network-related patch series should implement this 
LANDLOCK_ACCESS_NET_LISTEN_TCP access right though, which should not be 
difficult thanks to the framework implemented with current patch series.

Konstantin, would you like to develop the TCP listening access control 
once this patch series land?


>>>> (Alternatively, it could bind() the socket early, *then enable Landlock* and
>>>> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
>>>> IPv6, so that listen() and accept() work on the already-bound socket.)
>>>>
>>> For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
>>> so dev is fully aware it is not just applied to socket create.
>>
>> I don't get the semantic of LANDLOCK_ACCESS_SOCKET_PROTOCOL. What does
>> PROTOCOL mean?
>>
> I meant checking family + type of socket, and apply to all of
> socket(2),bind(2),accept(2),connect(2),listen(2), maybe
> send(2)/recv(2) too.

OK, that would be kind of similar to the LANDLOCK_ACCESS_SOCKET_USE 
description. However, I think this kind of global approach has several 
issues:
- This covers a lot of different aspects and would increase the cost of 
development/testing/review.
- Whereas it wraps different actions, it will not let user space have a 
fine-grained access control on these, which could be useful for some use 
cases.
- I don't see the point of restricting accept(2) if we can already 
restrict bind(2) and listen(2). accept(2) could be useful to identify 
the remote peer but I'm not convinced this would make sense, and if it 
would, then this can be postponed until we have a way to identify peers.
- For performance reasons, we should avoid restricting 
send/recv/read/write but instead only restrict the control plane: object 
creation and configuration.

I'm not convinced that being able to control all kind of socket bind, 
listen and connect actions might be worth implementing instead of a 
fine-grained access control for the main protocols (TCP, UDP, unix and 
vsock maybe), with the related tests and guarantees.

However, this landlock_socket_attr struct could have an allowed_access 
field that could contain LANDLOCK_ACCESS_NET_{CONNECT,LISTEN,BIND}_TCP 
rights (which would just not be constrained by any port, except if a 
landlock_net_port_attr rule matches). It would then make sense to rename 
LANDLOCK_ACCESS_SOCKET_CREATE to LANDLOCK_ACCESS_NET_CREATE_SOCKET. This 
right would not be accepted in a landlock_net_port_attr.allowed_access 
though.

> 
> s/LANDLOCK_ACCESS_SOCKET_CREATE/LANDLOCK_ACCESS_SOCKET_TYPE.
> 
> This implies the kernel will check on socket fd's property (family +
> type) at those calls, this applies to
> a - the socket fd is created within the process, after landlock is applied.
> b - created in process prior to landlock is applied.
> c - created out of process then passed into this process,

OK, these are the same rules as for LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.
Jeff Xu June 30, 2023, 4:18 a.m. UTC | #15
On Thu, Jun 29, 2023 at 4:07 AM Mickaël Salaün <mic@digikod.net> wrote:
>
>
> On 29/06/2023 05:18, Jeff Xu wrote:
> > resend.
> >
> > On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
> >>
> >>
> >> On 28/06/2023 19:03, Jeff Xu wrote:
> >>> Hello,
> >>>
> >>> Thanks for writing up the example for an incoming TCP connection ! It
> >>> helps with the context.
> >>>
> >>> Since I'm late to this thread, one thing I want to ask:  all the APIs
> >>> proposed so far are at the process level, we don't have any API that
> >>> applies restriction to socket fd itself, right ? this is what I
> >>> thought, but I would like to get confirmation.
> >>
> >> Restriction are applied to actions, not to already existing/opened FDs.
> >> We could add a way to restrict opened FDs, but I don't think this is the
> >> right approach because sandboxing is a deliberate action from a process,
> >> and it should already take care of its FDs.
> >>
> >>
> >>>
> >>> On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
> >>>>
> >>>> Hello!
> >>>>
> >>>> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
> >>>>> Here is a design to be able to only allow a set of network protocols and
> >>>>> deny everything else. This would be complementary to Konstantin's patch
> >>>>> series which addresses fine-grained access control.
> >>>>>
> >>>>> First, I want to remind that Landlock follows an allowed list approach with
> >>>>> a set of (growing) supported actions (for compatibility reasons), which is
> >>>>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
> >>>>> able to deny everything, which means: supported, not supported, known and
> >>>>> unknown protocols.
> >>>>>
> >>>>> We could add a new "handled_access_socket" field to the landlock_ruleset
> >>>>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
> >>>>>
> >>>>> If this field is set, users could add a new type of rules:
> >>>>> struct landlock_socket_attr {
> >>>>>       __u64 allowed_access;
> >>>>>       int domain; // see socket(2)
> >>>>>       int type; // see socket(2)
> >>>>> }
> >>>>>
> >>>>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
> >>>>> first, but it could grow with other actions (which cannot be handled with
> >>>>> seccomp):
> >>>>> - use: walk through all opened FDs and mark them as allowed or denied
> >>>>> - receive: hook on received FDs
> >>>>> - send: hook on sent FDs
> >>>>>
> >>>>> We might also use the same approach for non-socket objects that can be
> >>>>> identified with some meaningful properties.
> >>>>>
> >>>>> What do you think?
> >>>>
> >>>> This sounds like a good plan to me - it would make it possible to restrict new
> >>>> socket creation using protocols that were not intended to be used, and I also
> >>>> think it would fit the Landlock model nicely.
> >>>>
> >>>> Small remark on the side: The security_socket_create() hook does not only get
> >>>> invoked as a result of socket(2), but also as a part of accept(2) - so this
> >>>> approach might already prevent new connections very effectively.
> >>>>
> >>> That is an interesting aspect that might be worth discussing more.
> >>> seccomp is per syscall, landlock doesn't necessarily follow the same,
> >>> another design is to add more logic in Landlock, e.g.
> >>> LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
> >>> calls (socket/bind/listen/accept/connect). App dev might feel it is
> >>> easier to use.
> >>
> >> seccomp restricts the use of the syscall interface, whereas Landlock
> >> restricts the use of kernel objects (i.e. the semantic).
> >>
> >> We need to find a good tradeoff between a lot of access rights and a few
> >> grouping different actions. This should make sense from a developer
> >> point of view according to its knowledge and use of the kernel
> >> interfaces (potential wrapped with high level libraries), but also to
> >> the semantic of the sandbox and the security guarantees we want to provide.
> >>
> >> We should also keep in mind that high level Landlock libraries can take
> >> care of potential coarse-grained use of restrictions.
> >>
> >>
> >>>
> >>>> Spelling out some scenarios, so that we are sure that we are on the same page:
> >>>>
> >>>> A)
> >>>>
> >>>> A program that does not need networking could specify a ruleset where
> >>>> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
> >>>>
> >>>> B)
> >>>>
> >>>> A program that runs a TCP server could specify a ruleset where
> >>>> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
> >>>> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
> >>>>
> >>>>     /* From Konstantin's patch set */
> >>>>     struct landlock_net_service_attr bind_attr = {
> >>>>       .allowed_access = LANDLOCK_NET_BIND_TCP,
> >>>>       .port = 8080,
> >>>>     };
> >>>>
> >>>>     /* From Mickaël's proposal */
> >>>>     struct landlock_socket_attr sock_inet_attr = {
> >>>>       .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
> >>>>       .domain = AF_INET,
> >>>>       .type = SOCK_STREAM,
> >>>>     }
> >>>>
> >>>>     struct landlock_socket_attr sock_inet6_attr = {
> >>>>       .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
> >>>>       .domain = AF_INET6,
> >>>>        .type = SOCK_STREAM,
> >>>>     }
> >>>>
> >>>> That should then be enough to bind and listen on ports, whereas outgoing
> >>>> connections with TCP and anything using other network protocols would not be
> >>>> permitted.
> >>>>
> >>> TCP server is an interesting case. From a security perspective, a
> >>> process cares if it is acting as a server or client in TCP, a server
> >>> might only want to accept an incoming TCP connection, never initiate
> >>> an outgoing TCP connection, and a client is the opposite.
> >>>
> >>> Processes can restrict outgoing/incoming TCP connection by seccomp for
> >>> accept(2) or connect(2),  though I feel Landlock can do this more
> >>> naturally for app dev, and at per-protocol level.  seccomp doesn't
> >>> provide per-protocol granularity.
> >>
> >> Right, seccomp cannot filter TCP ports.
> >>
> >>>
> >>> For bind(2), iirc, it can be used for a server to assign dst port of
> >>> incoming TCP connection, also by a client to assign a src port of an
> >>> outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
> >>> cases, right ? this might not be a problem, just something to keep
> >>> note.
> >>
> >> Good point. I think it is in line with the rule definition: to allow to
> >> bind on a specific port. However, if clients want to set the source port
> >> to a (legitimate) value, then that would be an issue because we cannot
> >> allow a whole range of ports (e.g., >= 1024). I'm not sure if this
> >> practice would be deemed "legitimate" though. Do you know client
> >> applications using bind?
> >>
> >> Konstantin, we should have a test for this case anyway.
>
> Thinking more about TCP clients binding sockets, a
> LANDLOCK_ACCESS_NET_LISTEN_TCP would be more useful than
> LANDLOCK_ACCESS_NET_BIND_TCP, but being able to limit the scope of
> "bindable" ports is also valuable to forbid a malicious sandboxed
> process to impersonate a legitimate server process. This also means that
> it might be interesting to be able to handle port ranges.
>
> We already have a LANDLOCK_ACCESS_NET_BIND_TCP implementation and
> related tests, so I think we should proceed with that. The next
> network-related patch series should implement this
> LANDLOCK_ACCESS_NET_LISTEN_TCP access right though, which should not be
> difficult thanks to the framework implemented with current patch series.
>
> Konstantin, would you like to develop the TCP listening access control
> once this patch series land?
>
>
> >>>> (Alternatively, it could bind() the socket early, *then enable Landlock* and
> >>>> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
> >>>> IPv6, so that listen() and accept() work on the already-bound socket.)
> >>>>
> >>> For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
> >>> so dev is fully aware it is not just applied to socket create.
> >>
> >> I don't get the semantic of LANDLOCK_ACCESS_SOCKET_PROTOCOL. What does
> >> PROTOCOL mean?
> >>
> > I meant checking family + type of socket, and apply to all of
> > socket(2),bind(2),accept(2),connect(2),listen(2), maybe
> > send(2)/recv(2) too.
>
> OK, that would be kind of similar to the LANDLOCK_ACCESS_SOCKET_USE
> description. However, I think this kind of global approach has several
> issues:
> - This covers a lot of different aspects and would increase the cost of
> development/testing/review.
True.

> - Whereas it wraps different actions, it will not let user space have a
> fine-grained access control on these, which could be useful for some use
> cases.
Make sense.

> - I don't see the point of restricting accept(2) if we can already
> restrict bind(2) and listen(2). accept(2) could be useful to identify
> the remote peer but I'm not convinced this would make sense, and if it
> would, then this can be postponed until we have a way to identify peers.

I was thinking about a case that the socket was created/bind/listen in
another process, then passed into the current process,

For example:
Process A has :
LANDLOCK_ACCESS_SOCKET_CREATE (family = f1, type = t1)
socket s1 is created in process A with family = f1, type = t1, and
bind/listen to port p1.

socket s1 is passed to process B
Process B has:
LANDLOCK_ACCESS_SOCKET_CREATE (family =f1, type = t2) (note the type
is different than A)
LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP (port = p2)

However, those rules in B don't restrict process B from using
accept(s1), s1 is another type.

In accept(2), struct sockaddr contains sa_family_t (AF_xx)  but no
type, which is strange to me, the API should either include both, or
none (accept whatever it is already in socket fd, which is set during
creation time).

looking into accept(2) implementation: it calls: sock->ops->accept
iiuc, sock->ops is set during socket(2), allowing each protocol to
have its own implementation.

When we consider a> our intention to restrict family + type of socket,
with b> socket can be passed between processes,
there can be a need to harden the check (family + type) for all of
bind/listen/accept/connect. Otherwise, there is still a possibility
that the process to accept a socket of different type unintentionally.

This means:
LANDLOCK_ACCESS_SOCKET_ATTR_CREATE (family =f1, type = t2)
LANDLOCK_ACCESS_SOCKET_ATTR_BIND (family =f1, type = t2)
LANDLOCK_ACCESS_SOCKET_ATTR_ACCEPT (family =f1, type = t2)
LANDLOCK_ACCESS_SOCKET_ATTR_ LISTEN (family =f1, type = t2)
LANDLOCK_ACCESS_SOCKET_ATTR_CONNECT (family =f1, type = t2)
Note: this checks family+type only, not port.
The check is applied to all protocols, so not specific to TCP/UDP

> - For performance reasons, we should avoid restricting
> send/recv/read/write but instead only restrict the control plane: object
> creation and configuration.
>
Performance is a valid concern.

As example of server, usually the main process listens/accepts incoming
connections, and forked processes do send/recv, the main process can
be viewed as a control plane, and send/recv can be viewed as a data
plane. It makes sense that we start with the control plane.

We might like to keep a note that by not restricting send/recv, a
socket can be created OOP, then passed into current process and call
send/recv, so the network is not fully disabled by landlock alone
(still need seccomp)

Things might get more complicated, say: a forked process is intended
to send/recv UDP, but was confused and got a TCP socket from
OOP, etc. This is not different than accept(2) case above. There might
be an opportunity for Landlock to harden this when we design for
data-plane.

> I'm not convinced that being able to control all kind of socket bind,
> listen and connect actions might be worth implementing instead of a
> fine-grained access control for the main protocols (TCP, UDP, unix and
> vsock maybe), with the related tests and guarantees.
>
> However, this landlock_socket_attr struct could have an allowed_access
> field that could contain LANDLOCK_ACCESS_NET_{CONNECT,LISTEN,BIND}_TCP
> rights (which would just not be constrained by any port, except if a
> landlock_net_port_attr rule matches). It would then make sense to rename
> LANDLOCK_ACCESS_SOCKET_CREATE to LANDLOCK_ACCESS_NET_CREATE_SOCKET. This
> right would not be accepted in a landlock_net_port_attr.allowed_access
> though.
>
I'm not sure if my view is fully explained. I don't mean to control
all kinds of socket bind/listen/connect actions.
My view is:
1> have a rule to check family + type, to make sure the process is
using the socket type they intend to use, such as
LANDLOCK_ACCESS_SOCKET_ATTR_{CREATE|CONNECT|BIND|ACCEPT|LISTEN}, as
discussed in accept(2) case.
2> have protocol specific rules, such as LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.
So bind(2) will be checked by both 1 and 2.

As example of TCP server, the process will use:
LANDLOCK_ACCESS_SOCKET_ATTR_{CREATE|BIND|ACCEPT|LISTEN}
LANDLOCK_ACCESS_NET_{BIND}_TCP

> >
> > s/LANDLOCK_ACCESS_SOCKET_CREATE/LANDLOCK_ACCESS_SOCKET_TYPE.
> >
> > This implies the kernel will check on socket fd's property (family +
> > type) at those calls, this applies to
> > a - the socket fd is created within the process, after landlock is applied.
> > b - created in process prior to landlock is applied.
> > c - created out of process then passed into this process,
>
> OK, these are the same rules as for LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.

I don't mean this to be _TCP specific, this is still the family + type
discussion above.
Mickaël Salaün June 30, 2023, 6:23 p.m. UTC | #16
On 30/06/2023 06:18, Jeff Xu wrote:
> On Thu, Jun 29, 2023 at 4:07 AM Mickaël Salaün <mic@digikod.net> wrote:
>>
>>
>> On 29/06/2023 05:18, Jeff Xu wrote:
>>> resend.
>>>
>>> On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
>>>>
>>>>
>>>> On 28/06/2023 19:03, Jeff Xu wrote:
>>>>> Hello,
>>>>>
>>>>> Thanks for writing up the example for an incoming TCP connection ! It
>>>>> helps with the context.
>>>>>
>>>>> Since I'm late to this thread, one thing I want to ask:  all the APIs
>>>>> proposed so far are at the process level, we don't have any API that
>>>>> applies restriction to socket fd itself, right ? this is what I
>>>>> thought, but I would like to get confirmation.
>>>>
>>>> Restriction are applied to actions, not to already existing/opened FDs.
>>>> We could add a way to restrict opened FDs, but I don't think this is the
>>>> right approach because sandboxing is a deliberate action from a process,
>>>> and it should already take care of its FDs.
>>>>
>>>>
>>>>>
>>>>> On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
>>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
>>>>>>> Here is a design to be able to only allow a set of network protocols and
>>>>>>> deny everything else. This would be complementary to Konstantin's patch
>>>>>>> series which addresses fine-grained access control.
>>>>>>>
>>>>>>> First, I want to remind that Landlock follows an allowed list approach with
>>>>>>> a set of (growing) supported actions (for compatibility reasons), which is
>>>>>>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
>>>>>>> able to deny everything, which means: supported, not supported, known and
>>>>>>> unknown protocols.
>>>>>>>
>>>>>>> We could add a new "handled_access_socket" field to the landlock_ruleset
>>>>>>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
>>>>>>>
>>>>>>> If this field is set, users could add a new type of rules:
>>>>>>> struct landlock_socket_attr {
>>>>>>>        __u64 allowed_access;
>>>>>>>        int domain; // see socket(2)
>>>>>>>        int type; // see socket(2)
>>>>>>> }
>>>>>>>
>>>>>>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
>>>>>>> first, but it could grow with other actions (which cannot be handled with
>>>>>>> seccomp):
>>>>>>> - use: walk through all opened FDs and mark them as allowed or denied
>>>>>>> - receive: hook on received FDs
>>>>>>> - send: hook on sent FDs
>>>>>>>
>>>>>>> We might also use the same approach for non-socket objects that can be
>>>>>>> identified with some meaningful properties.
>>>>>>>
>>>>>>> What do you think?
>>>>>>
>>>>>> This sounds like a good plan to me - it would make it possible to restrict new
>>>>>> socket creation using protocols that were not intended to be used, and I also
>>>>>> think it would fit the Landlock model nicely.
>>>>>>
>>>>>> Small remark on the side: The security_socket_create() hook does not only get
>>>>>> invoked as a result of socket(2), but also as a part of accept(2) - so this
>>>>>> approach might already prevent new connections very effectively.
>>>>>>
>>>>> That is an interesting aspect that might be worth discussing more.
>>>>> seccomp is per syscall, landlock doesn't necessarily follow the same,
>>>>> another design is to add more logic in Landlock, e.g.
>>>>> LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
>>>>> calls (socket/bind/listen/accept/connect). App dev might feel it is
>>>>> easier to use.
>>>>
>>>> seccomp restricts the use of the syscall interface, whereas Landlock
>>>> restricts the use of kernel objects (i.e. the semantic).
>>>>
>>>> We need to find a good tradeoff between a lot of access rights and a few
>>>> grouping different actions. This should make sense from a developer
>>>> point of view according to its knowledge and use of the kernel
>>>> interfaces (potential wrapped with high level libraries), but also to
>>>> the semantic of the sandbox and the security guarantees we want to provide.
>>>>
>>>> We should also keep in mind that high level Landlock libraries can take
>>>> care of potential coarse-grained use of restrictions.
>>>>
>>>>
>>>>>
>>>>>> Spelling out some scenarios, so that we are sure that we are on the same page:
>>>>>>
>>>>>> A)
>>>>>>
>>>>>> A program that does not need networking could specify a ruleset where
>>>>>> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
>>>>>>
>>>>>> B)
>>>>>>
>>>>>> A program that runs a TCP server could specify a ruleset where
>>>>>> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
>>>>>> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
>>>>>>
>>>>>>      /* From Konstantin's patch set */
>>>>>>      struct landlock_net_service_attr bind_attr = {
>>>>>>        .allowed_access = LANDLOCK_NET_BIND_TCP,
>>>>>>        .port = 8080,
>>>>>>      };
>>>>>>
>>>>>>      /* From Mickaël's proposal */
>>>>>>      struct landlock_socket_attr sock_inet_attr = {
>>>>>>        .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>>>>>        .domain = AF_INET,
>>>>>>        .type = SOCK_STREAM,
>>>>>>      }
>>>>>>
>>>>>>      struct landlock_socket_attr sock_inet6_attr = {
>>>>>>        .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>>>>>        .domain = AF_INET6,
>>>>>>         .type = SOCK_STREAM,
>>>>>>      }
>>>>>>
>>>>>> That should then be enough to bind and listen on ports, whereas outgoing
>>>>>> connections with TCP and anything using other network protocols would not be
>>>>>> permitted.
>>>>>>
>>>>> TCP server is an interesting case. From a security perspective, a
>>>>> process cares if it is acting as a server or client in TCP, a server
>>>>> might only want to accept an incoming TCP connection, never initiate
>>>>> an outgoing TCP connection, and a client is the opposite.
>>>>>
>>>>> Processes can restrict outgoing/incoming TCP connection by seccomp for
>>>>> accept(2) or connect(2),  though I feel Landlock can do this more
>>>>> naturally for app dev, and at per-protocol level.  seccomp doesn't
>>>>> provide per-protocol granularity.
>>>>
>>>> Right, seccomp cannot filter TCP ports.
>>>>
>>>>>
>>>>> For bind(2), iirc, it can be used for a server to assign dst port of
>>>>> incoming TCP connection, also by a client to assign a src port of an
>>>>> outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
>>>>> cases, right ? this might not be a problem, just something to keep
>>>>> note.
>>>>
>>>> Good point. I think it is in line with the rule definition: to allow to
>>>> bind on a specific port. However, if clients want to set the source port
>>>> to a (legitimate) value, then that would be an issue because we cannot
>>>> allow a whole range of ports (e.g., >= 1024). I'm not sure if this
>>>> practice would be deemed "legitimate" though. Do you know client
>>>> applications using bind?
>>>>
>>>> Konstantin, we should have a test for this case anyway.
>>
>> Thinking more about TCP clients binding sockets, a
>> LANDLOCK_ACCESS_NET_LISTEN_TCP would be more useful than
>> LANDLOCK_ACCESS_NET_BIND_TCP, but being able to limit the scope of
>> "bindable" ports is also valuable to forbid a malicious sandboxed
>> process to impersonate a legitimate server process. This also means that
>> it might be interesting to be able to handle port ranges.
>>
>> We already have a LANDLOCK_ACCESS_NET_BIND_TCP implementation and
>> related tests, so I think we should proceed with that. The next
>> network-related patch series should implement this
>> LANDLOCK_ACCESS_NET_LISTEN_TCP access right though, which should not be
>> difficult thanks to the framework implemented with current patch series.
>>
>> Konstantin, would you like to develop the TCP listening access control
>> once this patch series land?
>>
>>
>>>>>> (Alternatively, it could bind() the socket early, *then enable Landlock* and
>>>>>> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
>>>>>> IPv6, so that listen() and accept() work on the already-bound socket.)
>>>>>>
>>>>> For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
>>>>> so dev is fully aware it is not just applied to socket create.
>>>>
>>>> I don't get the semantic of LANDLOCK_ACCESS_SOCKET_PROTOCOL. What does
>>>> PROTOCOL mean?
>>>>
>>> I meant checking family + type of socket, and apply to all of
>>> socket(2),bind(2),accept(2),connect(2),listen(2), maybe
>>> send(2)/recv(2) too.
>>
>> OK, that would be kind of similar to the LANDLOCK_ACCESS_SOCKET_USE
>> description. However, I think this kind of global approach has several
>> issues:
>> - This covers a lot of different aspects and would increase the cost of
>> development/testing/review.
> True.
> 
>> - Whereas it wraps different actions, it will not let user space have a
>> fine-grained access control on these, which could be useful for some use
>> cases.
> Make sense.
> 
>> - I don't see the point of restricting accept(2) if we can already
>> restrict bind(2) and listen(2). accept(2) could be useful to identify
>> the remote peer but I'm not convinced this would make sense, and if it
>> would, then this can be postponed until we have a way to identify peers.
> 
> I was thinking about a case that the socket was created/bind/listen in
> another process, then passed into the current process,
> 
> For example:
> Process A has :
> LANDLOCK_ACCESS_SOCKET_CREATE (family = f1, type = t1)
> socket s1 is created in process A with family = f1, type = t1, and
> bind/listen to port p1.
> 
> socket s1 is passed to process B
> Process B has:
> LANDLOCK_ACCESS_SOCKET_CREATE (family =f1, type = t2) (note the type
> is different than A)
> LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP (port = p2)
> 
> However, those rules in B don't restrict process B from using
> accept(s1), s1 is another type.

Indeed, but why process A would pass this FD to B? Do you have real use 
cases in mind?

In case of confuse deputy attack, I'm convinced there is much more 
chance for B to just ask A to do nasty thing, no need to receive an FD, 
just to write data to the socket/IPC.

> 
> In accept(2), struct sockaddr contains sa_family_t (AF_xx)  but no
> type, which is strange to me, the API should either include both, or
> none (accept whatever it is already in socket fd, which is set during
> creation time).

I think sockaddr defines the minimal requirement to deal with 
accept/bind/connect. The sin_family is require to define the type of 
address and port according, but the type is not.

> 
> looking into accept(2) implementation: it calls: sock->ops->accept
> iiuc, sock->ops is set during socket(2), allowing each protocol to
> have its own implementation.
> 
> When we consider a> our intention to restrict family + type of socket,
> with b> socket can be passed between processes,
> there can be a need to harden the check (family + type) for all of
> bind/listen/accept/connect. Otherwise, there is still a possibility
> that the process to accept a socket of different type unintentionally.
> 
> This means:
> LANDLOCK_ACCESS_SOCKET_ATTR_CREATE (family =f1, type = t2)
> LANDLOCK_ACCESS_SOCKET_ATTR_BIND (family =f1, type = t2)
> LANDLOCK_ACCESS_SOCKET_ATTR_ACCEPT (family =f1, type = t2)
> LANDLOCK_ACCESS_SOCKET_ATTR_ LISTEN (family =f1, type = t2)
> LANDLOCK_ACCESS_SOCKET_ATTR_CONNECT (family =f1, type = t2)
> Note: this checks family+type only, not port.
> The check is applied to all protocols, so not specific to TCP/UDP

The sandboxing/Landlock threat model is to restrict a process when it is 
sandboxed, but this sandboxing is a request from the same process (or 
one of its parent) that happen when it is more trustworthy (or at least 
has more privileges) than after it sandbox itself.

The process sandboxing itself can use several kernel features, and one 
of it is Landlock. In any case, it should take care of closing file 
descriptors that should not be passed to the sandboxed process.

The limits of sandboxing are the communication channels from and to 
outside the sandbox. The peers talking with sandboxed processes should 
then not be subject to confused deputy attacks, which means they must 
not enable to bypass the user-defined security policy (from which the 
Landlock policy is only a part). Receiving file descriptors should then 
not be more important than controlling the communication channels. If a 
not-sandboxed process is willing to give more right to a sandboxed 
process, by passing FDs or just receiving commands, then this 
not-sandboxed process need to be fixed.

This is the rationale to not care about received nor sent file 
descriptors. The communication channels and the remote peers must be 
trusted to not give more privileges to the sandboxed processes.

If a peer is malicious, it doesn't need to pass a file descriptor to the 
sandboxed process, it can just read (data) commands and apply them to 
its file descriptors. I think the ability to pass file descriptors 
should be seen as a way to improve performance by avoiding a user space 
process to act as a proxy receiving read/write commands and managing 
file descriptors itself. On the other hand, file descriptors could be 
used as real capabilities/tokens to manage access, but senders still 
need to be careful to only pass the required ones.

All this to say that being able to restrict actions on file descriptors 
would be useful for senders/services to send a subset of the file 
descriptor capabilities (cf. Capsicum), but not the other way around.


> 
>> - For performance reasons, we should avoid restricting
>> send/recv/read/write but instead only restrict the control plane: object
>> creation and configuration.
>>
> Performance is a valid concern.
> 
> As example of server, usually the main process listens/accepts incoming
> connections, and forked processes do send/recv, the main process can
> be viewed as a control plane, and send/recv can be viewed as a data
> plane. It makes sense that we start with the control plane.
> 
> We might like to keep a note that by not restricting send/recv, a
> socket can be created OOP, then passed into current process and call
> send/recv, so the network is not fully disabled by landlock alone
> (still need seccomp)

Right, the kernel (and then Landlock) is not enough to sandbox a 
complete environment, user space needs to be aware and be configured for 
that too.

I understand the desire to restrict as much as possible, but this 
require to add more code and then it increase the risk of bugs, whereas 
it might not be a big deal for attackers. I don't think the cost is 
worth it and I don't want to give a false sense of security that could 
let users think their application cannot communicate with the network if 
it can communicate with local processes connected to the network.


> 
> Things might get more complicated, say: a forked process is intended
> to send/recv UDP, but was confused and got a TCP socket from
> OOP, etc. This is not different than accept(2) case above. There might
> be an opportunity for Landlock to harden this when we design for
> data-plane.
> 
>> I'm not convinced that being able to control all kind of socket bind,
>> listen and connect actions might be worth implementing instead of a
>> fine-grained access control for the main protocols (TCP, UDP, unix and
>> vsock maybe), with the related tests and guarantees.
>>
>> However, this landlock_socket_attr struct could have an allowed_access
>> field that could contain LANDLOCK_ACCESS_NET_{CONNECT,LISTEN,BIND}_TCP
>> rights (which would just not be constrained by any port, except if a
>> landlock_net_port_attr rule matches). It would then make sense to rename
>> LANDLOCK_ACCESS_SOCKET_CREATE to LANDLOCK_ACCESS_NET_CREATE_SOCKET. This
>> right would not be accepted in a landlock_net_port_attr.allowed_access
>> though.
>>
> I'm not sure if my view is fully explained. I don't mean to control
> all kinds of socket bind/listen/connect actions.
> My view is:
> 1> have a rule to check family + type, to make sure the process is
> using the socket type they intend to use, such as
> LANDLOCK_ACCESS_SOCKET_ATTR_{CREATE|CONNECT|BIND|ACCEPT|LISTEN}, as
> discussed in accept(2) case.
> 2> have protocol specific rules, such as LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.
> So bind(2) will be checked by both 1 and 2.

Right, I understand your point.

> 
> As example of TCP server, the process will use:
> LANDLOCK_ACCESS_SOCKET_ATTR_{CREATE|BIND|ACCEPT|LISTEN}
> LANDLOCK_ACCESS_NET_{BIND}_TCP
> 
>>>
>>> s/LANDLOCK_ACCESS_SOCKET_CREATE/LANDLOCK_ACCESS_SOCKET_TYPE.
>>>
>>> This implies the kernel will check on socket fd's property (family +
>>> type) at those calls, this applies to
>>> a - the socket fd is created within the process, after landlock is applied.
>>> b - created in process prior to landlock is applied.
>>> c - created out of process then passed into this process,
>>
>> OK, these are the same rules as for LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.
> 
> I don't mean this to be _TCP specific, this is still the family + type
> discussion above.

Yes, I meant that your a/b/c rules would apply for the current 
LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP types as well.
Jeff Xu July 5, 2023, 3 p.m. UTC | #17
On Fri, Jun 30, 2023 at 11:23 AM Mickaël Salaün <mic@digikod.net> wrote:
>
>
> On 30/06/2023 06:18, Jeff Xu wrote:
> > On Thu, Jun 29, 2023 at 4:07 AM Mickaël Salaün <mic@digikod.net> wrote:
> >>
> >>
> >> On 29/06/2023 05:18, Jeff Xu wrote:
> >>> resend.
> >>>
> >>> On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
> >>>>
> >>>>
> >>>> On 28/06/2023 19:03, Jeff Xu wrote:
> >>>>> Hello,
> >>>>>
> >>>>> Thanks for writing up the example for an incoming TCP connection ! It
> >>>>> helps with the context.
> >>>>>
> >>>>> Since I'm late to this thread, one thing I want to ask:  all the APIs
> >>>>> proposed so far are at the process level, we don't have any API that
> >>>>> applies restriction to socket fd itself, right ? this is what I
> >>>>> thought, but I would like to get confirmation.
> >>>>
> >>>> Restriction are applied to actions, not to already existing/opened FDs.
> >>>> We could add a way to restrict opened FDs, but I don't think this is the
> >>>> right approach because sandboxing is a deliberate action from a process,
> >>>> and it should already take care of its FDs.
> >>>>
> >>>>
> >>>>>
> >>>>> On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
> >>>>>>
> >>>>>> Hello!
> >>>>>>
> >>>>>> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
> >>>>>>> Here is a design to be able to only allow a set of network protocols and
> >>>>>>> deny everything else. This would be complementary to Konstantin's patch
> >>>>>>> series which addresses fine-grained access control.
> >>>>>>>
> >>>>>>> First, I want to remind that Landlock follows an allowed list approach with
> >>>>>>> a set of (growing) supported actions (for compatibility reasons), which is
> >>>>>>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
> >>>>>>> able to deny everything, which means: supported, not supported, known and
> >>>>>>> unknown protocols.
> >>>>>>>
> >>>>>>> We could add a new "handled_access_socket" field to the landlock_ruleset
> >>>>>>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
> >>>>>>>
> >>>>>>> If this field is set, users could add a new type of rules:
> >>>>>>> struct landlock_socket_attr {
> >>>>>>>        __u64 allowed_access;
> >>>>>>>        int domain; // see socket(2)
> >>>>>>>        int type; // see socket(2)
> >>>>>>> }
> >>>>>>>
> >>>>>>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
> >>>>>>> first, but it could grow with other actions (which cannot be handled with
> >>>>>>> seccomp):
> >>>>>>> - use: walk through all opened FDs and mark them as allowed or denied
> >>>>>>> - receive: hook on received FDs
> >>>>>>> - send: hook on sent FDs
> >>>>>>>
> >>>>>>> We might also use the same approach for non-socket objects that can be
> >>>>>>> identified with some meaningful properties.
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>
> >>>>>> This sounds like a good plan to me - it would make it possible to restrict new
> >>>>>> socket creation using protocols that were not intended to be used, and I also
> >>>>>> think it would fit the Landlock model nicely.
> >>>>>>
> >>>>>> Small remark on the side: The security_socket_create() hook does not only get
> >>>>>> invoked as a result of socket(2), but also as a part of accept(2) - so this
> >>>>>> approach might already prevent new connections very effectively.
> >>>>>>
> >>>>> That is an interesting aspect that might be worth discussing more.
> >>>>> seccomp is per syscall, landlock doesn't necessarily follow the same,
> >>>>> another design is to add more logic in Landlock, e.g.
> >>>>> LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
> >>>>> calls (socket/bind/listen/accept/connect). App dev might feel it is
> >>>>> easier to use.
> >>>>
> >>>> seccomp restricts the use of the syscall interface, whereas Landlock
> >>>> restricts the use of kernel objects (i.e. the semantic).
> >>>>
> >>>> We need to find a good tradeoff between a lot of access rights and a few
> >>>> grouping different actions. This should make sense from a developer
> >>>> point of view according to its knowledge and use of the kernel
> >>>> interfaces (potential wrapped with high level libraries), but also to
> >>>> the semantic of the sandbox and the security guarantees we want to provide.
> >>>>
> >>>> We should also keep in mind that high level Landlock libraries can take
> >>>> care of potential coarse-grained use of restrictions.
> >>>>
> >>>>
> >>>>>
> >>>>>> Spelling out some scenarios, so that we are sure that we are on the same page:
> >>>>>>
> >>>>>> A)
> >>>>>>
> >>>>>> A program that does not need networking could specify a ruleset where
> >>>>>> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
> >>>>>>
> >>>>>> B)
> >>>>>>
> >>>>>> A program that runs a TCP server could specify a ruleset where
> >>>>>> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
> >>>>>> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
> >>>>>>
> >>>>>>      /* From Konstantin's patch set */
> >>>>>>      struct landlock_net_service_attr bind_attr = {
> >>>>>>        .allowed_access = LANDLOCK_NET_BIND_TCP,
> >>>>>>        .port = 8080,
> >>>>>>      };
> >>>>>>
> >>>>>>      /* From Mickaël's proposal */
> >>>>>>      struct landlock_socket_attr sock_inet_attr = {
> >>>>>>        .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
> >>>>>>        .domain = AF_INET,
> >>>>>>        .type = SOCK_STREAM,
> >>>>>>      }
> >>>>>>
> >>>>>>      struct landlock_socket_attr sock_inet6_attr = {
> >>>>>>        .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
> >>>>>>        .domain = AF_INET6,
> >>>>>>         .type = SOCK_STREAM,
> >>>>>>      }
> >>>>>>
> >>>>>> That should then be enough to bind and listen on ports, whereas outgoing
> >>>>>> connections with TCP and anything using other network protocols would not be
> >>>>>> permitted.
> >>>>>>
> >>>>> TCP server is an interesting case. From a security perspective, a
> >>>>> process cares if it is acting as a server or client in TCP, a server
> >>>>> might only want to accept an incoming TCP connection, never initiate
> >>>>> an outgoing TCP connection, and a client is the opposite.
> >>>>>
> >>>>> Processes can restrict outgoing/incoming TCP connection by seccomp for
> >>>>> accept(2) or connect(2),  though I feel Landlock can do this more
> >>>>> naturally for app dev, and at per-protocol level.  seccomp doesn't
> >>>>> provide per-protocol granularity.
> >>>>
> >>>> Right, seccomp cannot filter TCP ports.
> >>>>
> >>>>>
> >>>>> For bind(2), iirc, it can be used for a server to assign dst port of
> >>>>> incoming TCP connection, also by a client to assign a src port of an
> >>>>> outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
> >>>>> cases, right ? this might not be a problem, just something to keep
> >>>>> note.
> >>>>
> >>>> Good point. I think it is in line with the rule definition: to allow to
> >>>> bind on a specific port. However, if clients want to set the source port
> >>>> to a (legitimate) value, then that would be an issue because we cannot
> >>>> allow a whole range of ports (e.g., >= 1024). I'm not sure if this
> >>>> practice would be deemed "legitimate" though. Do you know client
> >>>> applications using bind?

My understanding is that the higher protocol might negotiate and
assign both ports for a new connection (I think SIP does this for
RTP, but that is UDP. I don't know any case for TCP).

> >>>>
> >>>> Konstantin, we should have a test for this case anyway.
> >>
> >> Thinking more about TCP clients binding sockets, a
> >> LANDLOCK_ACCESS_NET_LISTEN_TCP would be more useful than
> >> LANDLOCK_ACCESS_NET_BIND_TCP, but being able to limit the scope of
> >> "bindable" ports is also valuable to forbid a malicious sandboxed
> >> process to impersonate a legitimate server process. This also means that
> >> it might be interesting to be able to handle port ranges.
> >>
> >> We already have a LANDLOCK_ACCESS_NET_BIND_TCP implementation and
> >> related tests, so I think we should proceed with that. The next
> >> network-related patch series should implement this
> >> LANDLOCK_ACCESS_NET_LISTEN_TCP access right though, which should not be
> >> difficult thanks to the framework implemented with current patch series.
> >>
> >> Konstantin, would you like to develop the TCP listening access control
> >> once this patch series land?
> >>
> >>
> >>>>>> (Alternatively, it could bind() the socket early, *then enable Landlock* and
> >>>>>> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
> >>>>>> IPv6, so that listen() and accept() work on the already-bound socket.)
> >>>>>>
> >>>>> For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
> >>>>> so dev is fully aware it is not just applied to socket create.
> >>>>
> >>>> I don't get the semantic of LANDLOCK_ACCESS_SOCKET_PROTOCOL. What does
> >>>> PROTOCOL mean?
> >>>>
> >>> I meant checking family + type of socket, and apply to all of
> >>> socket(2),bind(2),accept(2),connect(2),listen(2), maybe
> >>> send(2)/recv(2) too.
> >>
> >> OK, that would be kind of similar to the LANDLOCK_ACCESS_SOCKET_USE
> >> description. However, I think this kind of global approach has several
> >> issues:
> >> - This covers a lot of different aspects and would increase the cost of
> >> development/testing/review.
> > True.
> >
> >> - Whereas it wraps different actions, it will not let user space have a
> >> fine-grained access control on these, which could be useful for some use
> >> cases.
> > Make sense.
> >
> >> - I don't see the point of restricting accept(2) if we can already
> >> restrict bind(2) and listen(2). accept(2) could be useful to identify
> >> the remote peer but I'm not convinced this would make sense, and if it
> >> would, then this can be postponed until we have a way to identify peers.
> >
> > I was thinking about a case that the socket was created/bind/listen in
> > another process, then passed into the current process,
> >
> > For example:
> > Process A has :
> > LANDLOCK_ACCESS_SOCKET_CREATE (family = f1, type = t1)
> > socket s1 is created in process A with family = f1, type = t1, and
> > bind/listen to port p1.
> >
> > socket s1 is passed to process B
> > Process B has:
> > LANDLOCK_ACCESS_SOCKET_CREATE (family =f1, type = t2) (note the type
> > is different than A)
> > LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP (port = p2)
> >
> > However, those rules in B don't restrict process B from using
> > accept(s1), s1 is another type.
>
> Indeed, but why process A would pass this FD to B? Do you have real use
> cases in mind?
>
> In case of confuse deputy attack, I'm convinced there is much more
> chance for B to just ask A to do nasty thing, no need to receive an FD,
> just to write data to the socket/IPC.
>
Yes. It is the confuse deputy attack I was referring to.
Though, it is more of a design options type of questions than a real
world attack case.

> >
> > In accept(2), struct sockaddr contains sa_family_t (AF_xx)  but no
> > type, which is strange to me, the API should either include both, or
> > none (accept whatever it is already in socket fd, which is set during
> > creation time).
>
> I think sockaddr defines the minimal requirement to deal with
> accept/bind/connect. The sin_family is require to define the type of
> address and port according, but the type is not.
>
> >
> > looking into accept(2) implementation: it calls: sock->ops->accept
> > iiuc, sock->ops is set during socket(2), allowing each protocol to
> > have its own implementation.
> >
> > When we consider a> our intention to restrict family + type of socket,
> > with b> socket can be passed between processes,
> > there can be a need to harden the check (family + type) for all of
> > bind/listen/accept/connect. Otherwise, there is still a possibility
> > that the process to accept a socket of different type unintentionally.
> >
> > This means:
> > LANDLOCK_ACCESS_SOCKET_ATTR_CREATE (family =f1, type = t2)
> > LANDLOCK_ACCESS_SOCKET_ATTR_BIND (family =f1, type = t2)
> > LANDLOCK_ACCESS_SOCKET_ATTR_ACCEPT (family =f1, type = t2)
> > LANDLOCK_ACCESS_SOCKET_ATTR_ LISTEN (family =f1, type = t2)
> > LANDLOCK_ACCESS_SOCKET_ATTR_CONNECT (family =f1, type = t2)
> > Note: this checks family+type only, not port.
> > The check is applied to all protocols, so not specific to TCP/UDP
>
> The sandboxing/Landlock threat model is to restrict a process when it is
> sandboxed, but this sandboxing is a request from the same process (or
> one of its parent) that happen when it is more trustworthy (or at least
> has more privileges) than after it sandbox itself.
>
> The process sandboxing itself can use several kernel features, and one
> of it is Landlock. In any case, it should take care of closing file
> descriptors that should not be passed to the sandboxed process.
>
Agree.

> The limits of sandboxing are the communication channels from and to
> outside the sandbox. The peers talking with sandboxed processes should
> then not be subject to confused deputy attacks, which means they must
> not enable to bypass the user-defined security policy (from which the
> Landlock policy is only a part). Receiving file descriptors should then
> not be more important than controlling the communication channels. If a
> not-sandboxed process is willing to give more right to a sandboxed
> process, by passing FDs or just receiving commands, then this
> not-sandboxed process need to be fixed.
>
> This is the rationale to not care about received nor sent file
> descriptors. The communication channels and the remote peers must be
> trusted to not give more privileges to the sandboxed processes.
>
> If a peer is malicious, it doesn't need to pass a file descriptor to the
> sandboxed process, it can just read (data) commands and apply them to
> its file descriptors.

I see the reasoning. i.e. sandboxing the process is not more
important than securing communication channels, or securing the peer.

So in a system that let a peer process to pass a socket into a
higher privileged process, when the communication channel or the peer
process is compromised,  e.g. swapping the fd/socket into a different
one that the attacker controls, confuse deputy attack can happen. The
recommendation here is to secure peer and communication.
I agree with this approach in general.  I need to think about how it
applies to specific cases.

> I think the ability to pass file descriptors
> should be seen as a way to improve performance by avoiding a user space
> process to act as a proxy receiving read/write commands and managing
> file descriptors itself. On the other hand, file descriptors could be
> used as real capabilities/tokens to manage access, but senders still
> need to be careful to only pass the required ones.
>
> All this to say that being able to restrict actions on file descriptors
> would be useful for senders/services to send a subset of the file
> descriptor capabilities (cf. Capsicum), but not the other way around.
>
In the Landlock kernel doc:
Similarly to file access modes (e.g. O_RDWR), Landlock access rights
attached to file descriptors are retained even if they are passed
between processes (e.g. through a Unix domain socket). Such access
rights will then be enforced even if the receiving process is not
sandboxed by Landlock. Indeed, this is required to keep a consistent
access control over the whole system, and this avoids unattended
bypasses through file descriptor passing (i.e. confused deputy
attack).

iiuc, the design for file and socket in landlock is different. For
socket, the access rules are applied only to the current process (more
like seccomp), while for file restriction, the rules can be passed
into another un-landlocked process.

>
> >
> >> - For performance reasons, we should avoid restricting
> >> send/recv/read/write but instead only restrict the control plane: object
> >> creation and configuration.
> >>
> > Performance is a valid concern.
> >
> > As example of server, usually the main process listens/accepts incoming
> > connections, and forked processes do send/recv, the main process can
> > be viewed as a control plane, and send/recv can be viewed as a data
> > plane. It makes sense that we start with the control plane.
> >
> > We might like to keep a note that by not restricting send/recv, a
> > socket can be created OOP, then passed into current process and call
> > send/recv, so the network is not fully disabled by landlock alone
> > (still need seccomp)
>
> Right, the kernel (and then Landlock) is not enough to sandbox a
> complete environment, user space needs to be aware and be configured for
> that too.
>
> I understand the desire to restrict as much as possible, but this
> require to add more code and then it increase the risk of bugs, whereas
> it might not be a big deal for attackers. I don't think the cost is
> worth it and I don't want to give a false sense of security that could
> let users think their application cannot communicate with the network if
> it can communicate with local processes connected to the network.
>
Agree with the cost/benefit concern.

I think the current design is already good enough for the
decompression program case  in
https://cr.yp.to/unix/disablenetwork.html, as mentioned in Günther's
response.

> >
> > Things might get more complicated, say: a forked process is intended
> > to send/recv UDP, but was confused and got a TCP socket from
> > OOP, etc. This is not different than accept(2) case above. There might
> > be an opportunity for Landlock to harden this when we design for
> > data-plane.
> >
> >> I'm not convinced that being able to control all kind of socket bind,
> >> listen and connect actions might be worth implementing instead of a
> >> fine-grained access control for the main protocols (TCP, UDP, unix and
> >> vsock maybe), with the related tests and guarantees.
> >>
> >> However, this landlock_socket_attr struct could have an allowed_access
> >> field that could contain LANDLOCK_ACCESS_NET_{CONNECT,LISTEN,BIND}_TCP
> >> rights (which would just not be constrained by any port, except if a
> >> landlock_net_port_attr rule matches). It would then make sense to rename
> >> LANDLOCK_ACCESS_SOCKET_CREATE to LANDLOCK_ACCESS_NET_CREATE_SOCKET. This
> >> right would not be accepted in a landlock_net_port_attr.allowed_access
> >> though.
> >>
> > I'm not sure if my view is fully explained. I don't mean to control
> > all kinds of socket bind/listen/connect actions.
> > My view is:
> > 1> have a rule to check family + type, to make sure the process is
> > using the socket type they intend to use, such as
> > LANDLOCK_ACCESS_SOCKET_ATTR_{CREATE|CONNECT|BIND|ACCEPT|LISTEN}, as
> > discussed in accept(2) case.
> > 2> have protocol specific rules, such as LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.
> > So bind(2) will be checked by both 1 and 2.
>
> Right, I understand your point.
>
Great ! Thanks a lot for the discussion !
















> >
> > As example of TCP server, the process will use:
> > LANDLOCK_ACCESS_SOCKET_ATTR_{CREATE|BIND|ACCEPT|LISTEN}
> > LANDLOCK_ACCESS_NET_{BIND}_TCP
> >
> >>>
> >>> s/LANDLOCK_ACCESS_SOCKET_CREATE/LANDLOCK_ACCESS_SOCKET_TYPE.
> >>>
> >>> This implies the kernel will check on socket fd's property (family +
> >>> type) at those calls, this applies to
> >>> a - the socket fd is created within the process, after landlock is applied.
> >>> b - created in process prior to landlock is applied.
> >>> c - created out of process then passed into this process,
> >>
> >> OK, these are the same rules as for LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.
> >
> > I don't mean this to be _TCP specific, this is still the family + type
> > discussion above.
>
> Yes, I meant that your a/b/c rules would apply for the current
> LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP types as well.
Mickaël Salaün July 12, 2023, 11:30 a.m. UTC | #18
On 05/07/2023 17:00, Jeff Xu wrote:
> On Fri, Jun 30, 2023 at 11:23 AM Mickaël Salaün <mic@digikod.net> wrote:
>>
>>
>> On 30/06/2023 06:18, Jeff Xu wrote:
>>> On Thu, Jun 29, 2023 at 4:07 AM Mickaël Salaün <mic@digikod.net> wrote:
>>>>
>>>>
>>>> On 29/06/2023 05:18, Jeff Xu wrote:
>>>>> resend.
>>>>>
>>>>> On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
>>>>>>
>>>>>>
>>>>>> On 28/06/2023 19:03, Jeff Xu wrote:

[...]

>> The sandboxing/Landlock threat model is to restrict a process when it is
>> sandboxed, but this sandboxing is a request from the same process (or
>> one of its parent) that happen when it is more trustworthy (or at least
>> has more privileges) than after it sandbox itself.
>>
>> The process sandboxing itself can use several kernel features, and one
>> of it is Landlock. In any case, it should take care of closing file
>> descriptors that should not be passed to the sandboxed process.
>>
> Agree.
> 
>> The limits of sandboxing are the communication channels from and to
>> outside the sandbox. The peers talking with sandboxed processes should
>> then not be subject to confused deputy attacks, which means they must
>> not enable to bypass the user-defined security policy (from which the
>> Landlock policy is only a part). Receiving file descriptors should then
>> not be more important than controlling the communication channels. If a
>> not-sandboxed process is willing to give more right to a sandboxed
>> process, by passing FDs or just receiving commands, then this
>> not-sandboxed process need to be fixed.
>>
>> This is the rationale to not care about received nor sent file
>> descriptors. The communication channels and the remote peers must be
>> trusted to not give more privileges to the sandboxed processes.
>>
>> If a peer is malicious, it doesn't need to pass a file descriptor to the
>> sandboxed process, it can just read (data) commands and apply them to
>> its file descriptors.
> 
> I see the reasoning. i.e. sandboxing the process is not more
> important than securing communication channels, or securing the peer.
> 
> So in a system that let a peer process to pass a socket into a
> higher privileged process, when the communication channel or the peer
> process is compromised,  e.g. swapping the fd/socket into a different
> one that the attacker controls, confuse deputy attack can happen. The
> recommendation here is to secure peer and communication.
> I agree with this approach in general.  I need to think about how it
> applies to specific cases.
> 
>> I think the ability to pass file descriptors
>> should be seen as a way to improve performance by avoiding a user space
>> process to act as a proxy receiving read/write commands and managing
>> file descriptors itself. On the other hand, file descriptors could be
>> used as real capabilities/tokens to manage access, but senders still
>> need to be careful to only pass the required ones.
>>
>> All this to say that being able to restrict actions on file descriptors
>> would be useful for senders/services to send a subset of the file
>> descriptor capabilities (cf. Capsicum), but not the other way around.
>>
> In the Landlock kernel doc:
> Similarly to file access modes (e.g. O_RDWR), Landlock access rights
> attached to file descriptors are retained even if they are passed
> between processes (e.g. through a Unix domain socket). Such access
> rights will then be enforced even if the receiving process is not
> sandboxed by Landlock. Indeed, this is required to keep a consistent
> access control over the whole system, and this avoids unattended
> bypasses through file descriptor passing (i.e. confused deputy
> attack).
> 
> iiuc, the design for file and socket in landlock is different. For
> socket, the access rules are applied only to the current process (more
> like seccomp), while for file restriction, the rules can be passed
> into another un-landlocked process.

The O_RDWR restrictions are enforced by the basic kernel access control, 
not Landlock. However, for file truncation, Landlock complements the 
basic kernel access rights and behave the same.

There is indeed slight differences between file system and socket 
restrictions. For the file system, a file descriptor is a direct access 
to a file/data. For the network, we cannot identify for which data/peer 
a newly created socket will give access to, we need to wait for a 
connect or bind request to identify the use case for this socket. We 
could tie the access rights (related to ports) to an opened socket, but 
this would not align with the way Landlock access control works for the 
file system. Indeed, a directory file descriptor may enable to open 
another file (i.e. a new data item), but this opening is restricted by 
Landlock. A newly created socket gives access to the network (or a 
subset of it), but binding or connecting to a peer (i.e. accessing new 
data) is restricted by Landlock. Accesses tied to FDs are those that 
enable to get access to the underlying data (e.g. read, write, 
truncate). A newly created socket is harmless until it is connected to a 
peer, similarly to a memfd file descriptor. A directory opened by a 
sandboxed process can be passed to a process outside this sandbox and it 
might be allowed to open a relative path/file, which might not be the 
case for the sandboxed process.

I think it might be summarize by the difference between underlying FD 
data in the case of a regular file (i.e. tied access rights), and 
relative new data in the case of a directory or a socket (i.e. 
sandboxing policy scope).
Konstantin Meskhidze (A) July 13, 2023, 11:44 a.m. UTC | #19
6/29/2023 2:07 PM, Mickaël Salaün пишет:
> 
> On 29/06/2023 05:18, Jeff Xu wrote:
>> resend.
>> 
>> On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
>>>
>>>
>>> On 28/06/2023 19:03, Jeff Xu wrote:
>>>> Hello,
>>>>
>>>> Thanks for writing up the example for an incoming TCP connection ! It
>>>> helps with the context.
>>>>
>>>> Since I'm late to this thread, one thing I want to ask:  all the APIs
>>>> proposed so far are at the process level, we don't have any API that
>>>> applies restriction to socket fd itself, right ? this is what I
>>>> thought, but I would like to get confirmation.
>>>
>>> Restriction are applied to actions, not to already existing/opened FDs.
>>> We could add a way to restrict opened FDs, but I don't think this is the
>>> right approach because sandboxing is a deliberate action from a process,
>>> and it should already take care of its FDs.
>>>
>>>
>>>>
>>>> On Wed, Jun 28, 2023 at 2:09 AM Günther Noack <gnoack@google.com> wrote:
>>>>>
>>>>> Hello!
>>>>>
>>>>> On Mon, Jun 26, 2023 at 05:29:34PM +0200, Mickaël Salaün wrote:
>>>>>> Here is a design to be able to only allow a set of network protocols and
>>>>>> deny everything else. This would be complementary to Konstantin's patch
>>>>>> series which addresses fine-grained access control.
>>>>>>
>>>>>> First, I want to remind that Landlock follows an allowed list approach with
>>>>>> a set of (growing) supported actions (for compatibility reasons), which is
>>>>>> kind of an allow-list-on-a-deny-list. But with this proposal, we want to be
>>>>>> able to deny everything, which means: supported, not supported, known and
>>>>>> unknown protocols.
>>>>>>
>>>>>> We could add a new "handled_access_socket" field to the landlock_ruleset
>>>>>> struct, which could contain a LANDLOCK_ACCESS_SOCKET_CREATE flag.
>>>>>>
>>>>>> If this field is set, users could add a new type of rules:
>>>>>> struct landlock_socket_attr {
>>>>>>       __u64 allowed_access;
>>>>>>       int domain; // see socket(2)
>>>>>>       int type; // see socket(2)
>>>>>> }
>>>>>>
>>>>>> The allowed_access field would only contain LANDLOCK_ACCESS_SOCKET_CREATE at
>>>>>> first, but it could grow with other actions (which cannot be handled with
>>>>>> seccomp):
>>>>>> - use: walk through all opened FDs and mark them as allowed or denied
>>>>>> - receive: hook on received FDs
>>>>>> - send: hook on sent FDs
>>>>>>
>>>>>> We might also use the same approach for non-socket objects that can be
>>>>>> identified with some meaningful properties.
>>>>>>
>>>>>> What do you think?
>>>>>
>>>>> This sounds like a good plan to me - it would make it possible to restrict new
>>>>> socket creation using protocols that were not intended to be used, and I also
>>>>> think it would fit the Landlock model nicely.
>>>>>
>>>>> Small remark on the side: The security_socket_create() hook does not only get
>>>>> invoked as a result of socket(2), but also as a part of accept(2) - so this
>>>>> approach might already prevent new connections very effectively.
>>>>>
>>>> That is an interesting aspect that might be worth discussing more.
>>>> seccomp is per syscall, landlock doesn't necessarily follow the same,
>>>> another design is to add more logic in Landlock, e.g.
>>>> LANDLOCK_ACCESS_SOCKET_PROTOCOL which will apply to all of the socket
>>>> calls (socket/bind/listen/accept/connect). App dev might feel it is
>>>> easier to use.
>>>
>>> seccomp restricts the use of the syscall interface, whereas Landlock
>>> restricts the use of kernel objects (i.e. the semantic).
>>>
>>> We need to find a good tradeoff between a lot of access rights and a few
>>> grouping different actions. This should make sense from a developer
>>> point of view according to its knowledge and use of the kernel
>>> interfaces (potential wrapped with high level libraries), but also to
>>> the semantic of the sandbox and the security guarantees we want to provide.
>>>
>>> We should also keep in mind that high level Landlock libraries can take
>>> care of potential coarse-grained use of restrictions.
>>>
>>>
>>>>
>>>>> Spelling out some scenarios, so that we are sure that we are on the same page:
>>>>>
>>>>> A)
>>>>>
>>>>> A program that does not need networking could specify a ruleset where
>>>>> LANDLOCK_ACCESS_SOCKET_CREATE is handled, and simply not permit anything.
>>>>>
>>>>> B)
>>>>>
>>>>> A program that runs a TCP server could specify a ruleset where
>>>>> LANDLOCK_NET_BIND_TCP, LANDLOCK_NET_CONNECT_TCP and
>>>>> LANDLOCK_ACCESS_SOCKET_CREATE are handled, and where the following rules are added:
>>>>>
>>>>>     /* From Konstantin's patch set */
>>>>>     struct landlock_net_service_attr bind_attr = {
>>>>>       .allowed_access = LANDLOCK_NET_BIND_TCP,
>>>>>       .port = 8080,
>>>>>     };
>>>>>
>>>>>     /* From Mickaël's proposal */
>>>>>     struct landlock_socket_attr sock_inet_attr = {
>>>>>       .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>>>>       .domain = AF_INET,
>>>>>       .type = SOCK_STREAM,
>>>>>     }
>>>>>
>>>>>     struct landlock_socket_attr sock_inet6_attr = {
>>>>>       .allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>>>>>       .domain = AF_INET6,
>>>>>        .type = SOCK_STREAM,
>>>>>     }
>>>>>
>>>>> That should then be enough to bind and listen on ports, whereas outgoing
>>>>> connections with TCP and anything using other network protocols would not be
>>>>> permitted.
>>>>>
>>>> TCP server is an interesting case. From a security perspective, a
>>>> process cares if it is acting as a server or client in TCP, a server
>>>> might only want to accept an incoming TCP connection, never initiate
>>>> an outgoing TCP connection, and a client is the opposite.
>>>>
>>>> Processes can restrict outgoing/incoming TCP connection by seccomp for
>>>> accept(2) or connect(2),  though I feel Landlock can do this more
>>>> naturally for app dev, and at per-protocol level.  seccomp doesn't
>>>> provide per-protocol granularity.
>>>
>>> Right, seccomp cannot filter TCP ports.
>>>
>>>>
>>>> For bind(2), iirc, it can be used for a server to assign dst port of
>>>> incoming TCP connection, also by a client to assign a src port of an
>>>> outgoing TCP connection. LANDLOCK_NET_BIND_TCP will apply to both
>>>> cases, right ? this might not be a problem, just something to keep
>>>> note.
>>>
>>> Good point. I think it is in line with the rule definition: to allow to
>>> bind on a specific port. However, if clients want to set the source port
>>> to a (legitimate) value, then that would be an issue because we cannot
>>> allow a whole range of ports (e.g., >= 1024). I'm not sure if this
>>> practice would be deemed "legitimate" though. Do you know client
>>> applications using bind?
>>>
>>> Konstantin, we should have a test for this case anyway.
> 
> Thinking more about TCP clients binding sockets, a
> LANDLOCK_ACCESS_NET_LISTEN_TCP would be more useful than
> LANDLOCK_ACCESS_NET_BIND_TCP, but being able to limit the scope of
> "bindable" ports is also valuable to forbid a malicious sandboxed
> process to impersonate a legitimate server process. This also means that
> it might be interesting to be able to handle port ranges.
> 
> We already have a LANDLOCK_ACCESS_NET_BIND_TCP implementation and
> related tests, so I think we should proceed with that. The next
> network-related patch series should implement this
> LANDLOCK_ACCESS_NET_LISTEN_TCP access right though, which should not be
> difficult thanks to the framework implemented with current patch series.
> 
> Konstantin, would you like to develop the TCP listening access control
> once this patch series land?

  Hi all,
  Sorry for the late reply. I think this access control would be useful.
  I would like to add LANDLOCK_ACCESS_NET_LISTEN_TCP access right in 
future patches.




> 
> 
>>>>> (Alternatively, it could bind() the socket early, *then enable Landlock* and
>>>>> leave out the rule for BIND_TCP, only permitting SOCKET_CREATE for IPv4 and
>>>>> IPv6, so that listen() and accept() work on the already-bound socket.)
>>>>>
>>>> For this approach, LANDLOCK_ACCESS_SOCKET_PROTOCOL is a better name,
>>>> so dev is fully aware it is not just applied to socket create.
>>>
>>> I don't get the semantic of LANDLOCK_ACCESS_SOCKET_PROTOCOL. What does
>>> PROTOCOL mean?
>>>
>> I meant checking family + type of socket, and apply to all of
>> socket(2),bind(2),accept(2),connect(2),listen(2), maybe
>> send(2)/recv(2) too.
> 
> OK, that would be kind of similar to the LANDLOCK_ACCESS_SOCKET_USE
> description. However, I think this kind of global approach has several
> issues:
> - This covers a lot of different aspects and would increase the cost of
> development/testing/review.
> - Whereas it wraps different actions, it will not let user space have a
> fine-grained access control on these, which could be useful for some use
> cases.
> - I don't see the point of restricting accept(2) if we can already
> restrict bind(2) and listen(2). accept(2) could be useful to identify
> the remote peer but I'm not convinced this would make sense, and if it
> would, then this can be postponed until we have a way to identify peers.
> - For performance reasons, we should avoid restricting
> send/recv/read/write but instead only restrict the control plane: object
> creation and configuration.

   I agree. I'm not sure about restricting the data plane here. We have 
to restrict connection making, not data transfering when connection has 
been established.
> 
> I'm not convinced that being able to control all kind of socket bind,
> listen and connect actions might be worth implementing instead of a
> fine-grained access control for the main protocols (TCP, UDP, unix and
> vsock maybe), with the related tests and guarantees.
> 
> However, this landlock_socket_attr struct could have an allowed_access
> field that could contain LANDLOCK_ACCESS_NET_{CONNECT,LISTEN,BIND}_TCP
> rights (which would just not be constrained by any port, except if a
> landlock_net_port_attr rule matches). It would then make sense to rename
> LANDLOCK_ACCESS_SOCKET_CREATE to LANDLOCK_ACCESS_NET_CREATE_SOCKET. This
> right would not be accepted in a landlock_net_port_attr.allowed_access
> though.
> 
>> 
>> s/LANDLOCK_ACCESS_SOCKET_CREATE/LANDLOCK_ACCESS_SOCKET_TYPE.
>> 
>> This implies the kernel will check on socket fd's property (family +
>> type) at those calls, this applies to
>> a - the socket fd is created within the process, after landlock is applied.
>> b - created in process prior to landlock is applied.
>> c - created out of process then passed into this process,
> 
> OK, these are the same rules as for LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP.
> .
Konstantin Meskhidze (A) July 13, 2023, 1:20 p.m. UTC | #20
7/12/2023 2:30 PM, Mickaël Salaün пишет:
> 
> On 05/07/2023 17:00, Jeff Xu wrote:
>> On Fri, Jun 30, 2023 at 11:23 AM Mickaël Salaün <mic@digikod.net> wrote:
>>>
>>>
>>> On 30/06/2023 06:18, Jeff Xu wrote:
>>>> On Thu, Jun 29, 2023 at 4:07 AM Mickaël Salaün <mic@digikod.net> wrote:
>>>>>
>>>>>
>>>>> On 29/06/2023 05:18, Jeff Xu wrote:
>>>>>> resend.
>>>>>>
>>>>>> On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 28/06/2023 19:03, Jeff Xu wrote:
> 
> [...]
> 
>>> The sandboxing/Landlock threat model is to restrict a process when it is
>>> sandboxed, but this sandboxing is a request from the same process (or
>>> one of its parent) that happen when it is more trustworthy (or at least
>>> has more privileges) than after it sandbox itself.
>>>
>>> The process sandboxing itself can use several kernel features, and one
>>> of it is Landlock. In any case, it should take care of closing file
>>> descriptors that should not be passed to the sandboxed process.
>>>
>> Agree.
>> 
>>> The limits of sandboxing are the communication channels from and to
>>> outside the sandbox. The peers talking with sandboxed processes should
>>> then not be subject to confused deputy attacks, which means they must
>>> not enable to bypass the user-defined security policy (from which the
>>> Landlock policy is only a part). Receiving file descriptors should then
>>> not be more important than controlling the communication channels. If a
>>> not-sandboxed process is willing to give more right to a sandboxed
>>> process, by passing FDs or just receiving commands, then this
>>> not-sandboxed process need to be fixed.
>>>
>>> This is the rationale to not care about received nor sent file
>>> descriptors. The communication channels and the remote peers must be
>>> trusted to not give more privileges to the sandboxed processes.
>>>
>>> If a peer is malicious, it doesn't need to pass a file descriptor to the
>>> sandboxed process, it can just read (data) commands and apply them to
>>> its file descriptors.
>> 
>> I see the reasoning. i.e. sandboxing the process is not more
>> important than securing communication channels, or securing the peer.
>> 
>> So in a system that let a peer process to pass a socket into a
>> higher privileged process, when the communication channel or the peer
>> process is compromised,  e.g. swapping the fd/socket into a different
>> one that the attacker controls, confuse deputy attack can happen. The
>> recommendation here is to secure peer and communication.
>> I agree with this approach in general.  I need to think about how it
>> applies to specific cases.
>> 
>>> I think the ability to pass file descriptors
>>> should be seen as a way to improve performance by avoiding a user space
>>> process to act as a proxy receiving read/write commands and managing
>>> file descriptors itself. On the other hand, file descriptors could be
>>> used as real capabilities/tokens to manage access, but senders still
>>> need to be careful to only pass the required ones.
>>>
>>> All this to say that being able to restrict actions on file descriptors
>>> would be useful for senders/services to send a subset of the file
>>> descriptor capabilities (cf. Capsicum), but not the other way around.
>>>
>> In the Landlock kernel doc:
>> Similarly to file access modes (e.g. O_RDWR), Landlock access rights
>> attached to file descriptors are retained even if they are passed
>> between processes (e.g. through a Unix domain socket). Such access
>> rights will then be enforced even if the receiving process is not
>> sandboxed by Landlock. Indeed, this is required to keep a consistent
>> access control over the whole system, and this avoids unattended
>> bypasses through file descriptor passing (i.e. confused deputy
>> attack).
>> 
>> iiuc, the design for file and socket in landlock is different. For
>> socket, the access rules are applied only to the current process (more
>> like seccomp), while for file restriction, the rules can be passed
>> into another un-landlocked process.
> 
> The O_RDWR restrictions are enforced by the basic kernel access control,
> not Landlock. However, for file truncation, Landlock complements the
> basic kernel access rights and behave the same.
> 
> There is indeed slight differences between file system and socket
> restrictions. For the file system, a file descriptor is a direct access
> to a file/data. For the network, we cannot identify for which data/peer
> a newly created socket will give access to, we need to wait for a
> connect or bind request to identify the use case for this socket. We
> could tie the access rights (related to ports) to an opened socket, but
> this would not align with the way Landlock access control works for the
> file system. Indeed, a directory file descriptor may enable to open
> another file (i.e. a new data item), but this opening is restricted by
> Landlock. A newly created socket gives access to the network (or a
> subset of it), but binding or connecting to a peer (i.e. accessing new
> data) is restricted by Landlock. Accesses tied to FDs are those that
> enable to get access to the underlying data (e.g. read, write,
> truncate). A newly created socket is harmless until it is connected to a
> peer, similarly to a memfd file descriptor. A directory opened by a
> sandboxed process can be passed to a process outside this sandbox and it
> might be allowed to open a relative path/file, which might not be the
> case for the sandboxed process.

   I would like to mention that in case of files a Landlock rule is tied 
to undreliying file's inode ( already existing at the moment of creating
    a landlock's rule), and it's impossible to tie a new landlock rule 
to a socket before it's creating. Thats why all network access rules 
work with "port objects", representing network connections.

I was thinking about sendind socket's FD to another process.
If one process creates a socket and binds it to some port N. Then it 
sends socket's FD to a landlocked process with rule restricting to bind
to port N. Is this situation theoretically possible???


> 
> I think it might be summarize by the difference between underlying FD
> data in the case of a regular file (i.e. tied access rights), and
> relative new data in the case of a directory or a socket (i.e.
> sandboxing policy scope).
> .
Mickaël Salaün July 13, 2023, 2:52 p.m. UTC | #21
On 13/07/2023 15:20, Konstantin Meskhidze (A) wrote:
> 
> 
> 7/12/2023 2:30 PM, Mickaël Salaün пишет:
>>
>> On 05/07/2023 17:00, Jeff Xu wrote:
>>> On Fri, Jun 30, 2023 at 11:23 AM Mickaël Salaün <mic@digikod.net> wrote:
>>>>
>>>>
>>>> On 30/06/2023 06:18, Jeff Xu wrote:
>>>>> On Thu, Jun 29, 2023 at 4:07 AM Mickaël Salaün <mic@digikod.net> wrote:
>>>>>>
>>>>>>
>>>>>> On 29/06/2023 05:18, Jeff Xu wrote:
>>>>>>> resend.
>>>>>>>
>>>>>>> On Wed, Jun 28, 2023 at 12:29 PM Mickaël Salaün <mic@digikod.net> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 28/06/2023 19:03, Jeff Xu wrote:
>>
>> [...]
>>
>>>> The sandboxing/Landlock threat model is to restrict a process when it is
>>>> sandboxed, but this sandboxing is a request from the same process (or
>>>> one of its parent) that happen when it is more trustworthy (or at least
>>>> has more privileges) than after it sandbox itself.
>>>>
>>>> The process sandboxing itself can use several kernel features, and one
>>>> of it is Landlock. In any case, it should take care of closing file
>>>> descriptors that should not be passed to the sandboxed process.
>>>>
>>> Agree.
>>>
>>>> The limits of sandboxing are the communication channels from and to
>>>> outside the sandbox. The peers talking with sandboxed processes should
>>>> then not be subject to confused deputy attacks, which means they must
>>>> not enable to bypass the user-defined security policy (from which the
>>>> Landlock policy is only a part). Receiving file descriptors should then
>>>> not be more important than controlling the communication channels. If a
>>>> not-sandboxed process is willing to give more right to a sandboxed
>>>> process, by passing FDs or just receiving commands, then this
>>>> not-sandboxed process need to be fixed.
>>>>
>>>> This is the rationale to not care about received nor sent file
>>>> descriptors. The communication channels and the remote peers must be
>>>> trusted to not give more privileges to the sandboxed processes.
>>>>
>>>> If a peer is malicious, it doesn't need to pass a file descriptor to the
>>>> sandboxed process, it can just read (data) commands and apply them to
>>>> its file descriptors.
>>>
>>> I see the reasoning. i.e. sandboxing the process is not more
>>> important than securing communication channels, or securing the peer.
>>>
>>> So in a system that let a peer process to pass a socket into a
>>> higher privileged process, when the communication channel or the peer
>>> process is compromised,  e.g. swapping the fd/socket into a different
>>> one that the attacker controls, confuse deputy attack can happen. The
>>> recommendation here is to secure peer and communication.
>>> I agree with this approach in general.  I need to think about how it
>>> applies to specific cases.
>>>
>>>> I think the ability to pass file descriptors
>>>> should be seen as a way to improve performance by avoiding a user space
>>>> process to act as a proxy receiving read/write commands and managing
>>>> file descriptors itself. On the other hand, file descriptors could be
>>>> used as real capabilities/tokens to manage access, but senders still
>>>> need to be careful to only pass the required ones.
>>>>
>>>> All this to say that being able to restrict actions on file descriptors
>>>> would be useful for senders/services to send a subset of the file
>>>> descriptor capabilities (cf. Capsicum), but not the other way around.
>>>>
>>> In the Landlock kernel doc:
>>> Similarly to file access modes (e.g. O_RDWR), Landlock access rights
>>> attached to file descriptors are retained even if they are passed
>>> between processes (e.g. through a Unix domain socket). Such access
>>> rights will then be enforced even if the receiving process is not
>>> sandboxed by Landlock. Indeed, this is required to keep a consistent
>>> access control over the whole system, and this avoids unattended
>>> bypasses through file descriptor passing (i.e. confused deputy
>>> attack).
>>>
>>> iiuc, the design for file and socket in landlock is different. For
>>> socket, the access rules are applied only to the current process (more
>>> like seccomp), while for file restriction, the rules can be passed
>>> into another un-landlocked process.
>>
>> The O_RDWR restrictions are enforced by the basic kernel access control,
>> not Landlock. However, for file truncation, Landlock complements the
>> basic kernel access rights and behave the same.
>>
>> There is indeed slight differences between file system and socket
>> restrictions. For the file system, a file descriptor is a direct access
>> to a file/data. For the network, we cannot identify for which data/peer
>> a newly created socket will give access to, we need to wait for a
>> connect or bind request to identify the use case for this socket. We
>> could tie the access rights (related to ports) to an opened socket, but
>> this would not align with the way Landlock access control works for the
>> file system. Indeed, a directory file descriptor may enable to open
>> another file (i.e. a new data item), but this opening is restricted by
>> Landlock. A newly created socket gives access to the network (or a
>> subset of it), but binding or connecting to a peer (i.e. accessing new
>> data) is restricted by Landlock. Accesses tied to FDs are those that
>> enable to get access to the underlying data (e.g. read, write,
>> truncate). A newly created socket is harmless until it is connected to a
>> peer, similarly to a memfd file descriptor. A directory opened by a
>> sandboxed process can be passed to a process outside this sandbox and it
>> might be allowed to open a relative path/file, which might not be the
>> case for the sandboxed process.
> 
>     I would like to mention that in case of files a Landlock rule is tied
> to undreliying file's inode ( already existing at the moment of creating
>      a landlock's rule), and it's impossible to tie a new landlock rule
> to a socket before it's creating. Thats why all network access rules
> work with "port objects", representing network connections.

Correct, even if a port is not a *kernel* object.

> 
> I was thinking about sendind socket's FD to another process.
> If one process creates a socket and binds it to some port N. Then it
> sends socket's FD to a landlocked process with rule restricting to bind
> to port N. Is this situation theoretically possible???

Yes, it's possible an it's OK because the bind action was performed by 
an unsandboxed process. If this unsandboxed process is not trusted or if 
it can be fooled by a malicious client, the system should be designed to 
make it not able to talk to the sandboxed process.

> 
> 
>>
>> I think it might be summarize by the difference between underlying FD
>> data in the case of a regular file (i.e. tied access rights), and
>> relative new data in the case of a directory or a socket (i.e.
>> sandboxing policy scope).
>> .