mbox series

[bpf-next,v4,00/20] Add return value range check for BPF LSM

Message ID 20240711111908.3817636-1-xukuohai@huaweicloud.com (mailing list archive)
Headers show
Series Add return value range check for BPF LSM | expand

Message

Xu Kuohai July 11, 2024, 11:18 a.m. UTC
From: Xu Kuohai <xukuohai@huawei.com>

LSM BPF prog returning a positive number attached to the hook
file_alloc_security makes kernel panic. 

Here is a panic log:

[  441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
[  441.236748] #PF: supervisor write access in kernel mode
[  441.237429] #PF: error_code(0x0002) - not-present page
[  441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
[  441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
[  441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
[  441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
[  441.241933] RIP: 0010:alloc_file+0x4b/0x190
[  441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
[  441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
[  441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
[  441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
[  441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
[  441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
[  441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
[  441.250007] FS:  00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[  441.251053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
[  441.252688] Call Trace:
[  441.253011]  <TASK>
[  441.253296]  ? __die+0x24/0x70
[  441.253702]  ? page_fault_oops+0x15b/0x480
[  441.254236]  ? fixup_exception+0x26/0x330
[  441.254750]  ? exc_page_fault+0x6d/0x1c0
[  441.255257]  ? asm_exc_page_fault+0x26/0x30
[  441.255792]  ? alloc_file+0x4b/0x190
[  441.256257]  alloc_file_pseudo+0x9f/0xf0
[  441.256760]  __anon_inode_getfile+0x87/0x190
[  441.257311]  ? lock_release+0x14e/0x3f0
[  441.257808]  bpf_link_prime+0xe8/0x1d0
[  441.258315]  bpf_tracing_prog_attach+0x311/0x570
[  441.258916]  ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
[  441.259605]  __sys_bpf+0x1bb7/0x2dc0
[  441.260070]  __x64_sys_bpf+0x20/0x30
[  441.260533]  do_syscall_64+0x72/0x140
[  441.261004]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  441.261643] RIP: 0033:0x4b0349
[  441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
[  441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
[  441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
[  441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
[  441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
[  441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[  441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8

This is because the filesystem uses IS_ERR to check if the return value
is an error code. If it is not, the filesystem takes the return value
as a file pointer. Since the positive number returned by the BPF prog
is not a real file pointer, this misinterpretation causes a panic.

Since other LSM modules always return either a negative error code
or a valid pointer, this specific issue only exists in BPF LSM. The
proposed solution is to reject LSM BPF progs returning unexpected
values in the verifier. This patch set adds return value check to
ensure only BPF progs returning expected values are accepted.

Since each LSM hook has different excepted return values, we need to
know the expected return values for each individual hook to do the
check. Earlier versions of the patch set used LSM hook annotations
to specify the return value range for each hook. Based on Paul's
suggestion, current version gets rid of such annotations and instead
converts hook return values to a common pattern: return 0 on success
and negative error code on failure.

Basically, LSM hooks are divided into two types: hooks that return a
negative error code and zero or other values, and hooks that do not
return a negative error code. This patch set converts all hooks of the
first type and part of the second type to return 0 on success and a
negative error code on failure (see patches 1-10). For certain hooks,
like ismaclabel and inode_xattr_skipcap, the hook name already imply
that returning 0 or 1 is the best choice, so they are not converted.
There are four unconverted hooks. Except for ismaclabel, which is not
used by BPF LSM, the other three are specified with a BTF ID list to
only return 0 or 1.

v4:
1. remove LSM_HOOK return value annotaion and convert LSM hook return
   value to a common patern: return 0 on success and negative error code
   on failure (patch 1-10)
2. enable BPF LSM progs to read and write output params (patch 12)
3. add a special case for bitwise AND on range [-1, 0] (patch 16)
4. add a 32-bit comparing flag for retval_range_within (patch 15)
5. collect ACKs, style fix, etc

v3: https://lore.kernel.org/bpf/20240411122752.2873562-1-xukuohai@huaweicloud.com/
1. Fix incorrect lsm hook return value ranges, and add disabled hook
   list for bpf lsm, and merge two LSM_RET_INT patches. (KP Singh)
2. Avoid bpf lsm progs attached to different hooks to call each other
   with tail call
3. Fix a CI failure caused by false rejection of AND operation
4. Add tests

v2: https://lore.kernel.org/bpf/20240325095653.1720123-1-xukuohai@huaweicloud.com/
fix bpf ci failure

v1: https://lore.kernel.org/bpf/20240316122359.1073787-1-xukuohai@huaweicloud.com/

Xu Kuohai (20):
  lsm: Refactor return value of LSM hook vm_enough_memory
  lsm: Refactor return value of LSM hook inode_need_killpriv
  lsm: Refactor return value of LSM hook inode_getsecurity
  lsm: Refactor return value of LSM hook inode_listsecurity
  lsm: Refactor return value of LSM hook inode_copy_up_xattr
  lsm: Refactor return value of LSM hook getselfattr
  lsm: Refactor return value of LSM hook setprocattr
  lsm: Refactor return value of LSM hook getprocattr
  lsm: Refactor return value of LSM hook key_getsecurity
  lsm: Refactor return value of LSM hook audit_rule_match
  bpf, lsm: Add disabled BPF LSM hook list
  bpf, lsm: Enable BPF LSM prog to read/write return value parameters
  bpf, lsm: Add check for BPF LSM return value
  bpf: Prevent tail call between progs attached to different hooks
  bpf: Fix compare error in function retval_range_within
  bpf: Add a special case for bitwise AND on range [-1, 0]
  selftests/bpf: Avoid load failure for token_lsm.c
  selftests/bpf: Add return value checks for failed tests
  selftests/bpf: Add test for lsm tail call
  selftests/bpf: Add verifier tests for bpf lsm

 fs/attr.c                                     |   5 +-
 fs/inode.c                                    |   4 +-
 fs/nfs/nfs4proc.c                             |   5 +-
 fs/overlayfs/copy_up.c                        |   6 +-
 fs/proc/base.c                                |  10 +-
 fs/xattr.c                                    |  24 +-
 include/linux/bpf.h                           |   2 +
 include/linux/bpf_lsm.h                       |  15 +
 include/linux/lsm_hook_defs.h                 |  22 +-
 include/linux/security.h                      |  62 ++--
 include/linux/tnum.h                          |   3 +
 kernel/bpf/bpf_lsm.c                          |  64 +++-
 kernel/bpf/btf.c                              |  21 +-
 kernel/bpf/core.c                             |  21 +-
 kernel/bpf/tnum.c                             |  25 ++
 kernel/bpf/verifier.c                         | 173 ++++++++++-
 net/socket.c                                  |   9 +-
 security/apparmor/audit.c                     |  22 +-
 security/apparmor/include/audit.h             |   2 +-
 security/apparmor/lsm.c                       |  22 +-
 security/commoncap.c                          |  32 +-
 security/integrity/evm/evm_main.c             |   2 +-
 security/keys/keyctl.c                        |  11 +-
 security/lsm_syscalls.c                       |   6 +-
 security/security.c                           | 167 ++++++++---
 security/selinux/hooks.c                      |  94 +++---
 security/selinux/include/audit.h              |   8 +-
 security/selinux/ss/services.c                |  54 ++--
 security/smack/smack_lsm.c                    | 104 ++++---
 .../selftests/bpf/prog_tests/test_lsm.c       |  46 ++-
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 tools/testing/selftests/bpf/progs/err.h       |  10 +
 .../selftests/bpf/progs/lsm_tailcall.c        |  34 +++
 .../selftests/bpf/progs/test_sig_in_xattr.c   |   4 +
 .../bpf/progs/test_verify_pkcs7_sig.c         |   8 +-
 tools/testing/selftests/bpf/progs/token_lsm.c |   4 +-
 .../bpf/progs/verifier_global_subprogs.c      |   7 +-
 .../selftests/bpf/progs/verifier_lsm.c        | 274 ++++++++++++++++++
 38 files changed, 1098 insertions(+), 286 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_tailcall.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_lsm.c

Comments

Paul Moore July 12, 2024, 3:56 p.m. UTC | #1
On Thu, Jul 11, 2024 at 7:13 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> From: Xu Kuohai <xukuohai@huawei.com>
>
> LSM BPF prog returning a positive number attached to the hook
> file_alloc_security makes kernel panic.

...

> Xu Kuohai (20):
>   lsm: Refactor return value of LSM hook vm_enough_memory
>   lsm: Refactor return value of LSM hook inode_need_killpriv
>   lsm: Refactor return value of LSM hook inode_getsecurity
>   lsm: Refactor return value of LSM hook inode_listsecurity
>   lsm: Refactor return value of LSM hook inode_copy_up_xattr
>   lsm: Refactor return value of LSM hook getselfattr
>   lsm: Refactor return value of LSM hook setprocattr
>   lsm: Refactor return value of LSM hook getprocattr
>   lsm: Refactor return value of LSM hook key_getsecurity
>   lsm: Refactor return value of LSM hook audit_rule_match
>   bpf, lsm: Add disabled BPF LSM hook list
>   bpf, lsm: Enable BPF LSM prog to read/write return value parameters
>   bpf, lsm: Add check for BPF LSM return value
>   bpf: Prevent tail call between progs attached to different hooks
>   bpf: Fix compare error in function retval_range_within
>   bpf: Add a special case for bitwise AND on range [-1, 0]
>   selftests/bpf: Avoid load failure for token_lsm.c
>   selftests/bpf: Add return value checks for failed tests
>   selftests/bpf: Add test for lsm tail call
>   selftests/bpf: Add verifier tests for bpf lsm

I'm not quite sure what happened, but it looks like patches 13/20
through 20/20 did not hit the mailing lists, see lore link below; did
you have any mail failures when sending the patchset?  Regardless, can
you sort this out and resend the patchset?

https://lore.kernel.org/all/20240711111908.3817636-1-xukuohai@huaweicloud.com
Paul Moore July 12, 2024, 4 p.m. UTC | #2
On Fri, Jul 12, 2024 at 11:56 AM Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Jul 11, 2024 at 7:13 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> >
> > From: Xu Kuohai <xukuohai@huawei.com>
> >
> > LSM BPF prog returning a positive number attached to the hook
> > file_alloc_security makes kernel panic.
>
> ...
>
> > Xu Kuohai (20):
> >   lsm: Refactor return value of LSM hook vm_enough_memory
> >   lsm: Refactor return value of LSM hook inode_need_killpriv
> >   lsm: Refactor return value of LSM hook inode_getsecurity
> >   lsm: Refactor return value of LSM hook inode_listsecurity
> >   lsm: Refactor return value of LSM hook inode_copy_up_xattr
> >   lsm: Refactor return value of LSM hook getselfattr
> >   lsm: Refactor return value of LSM hook setprocattr
> >   lsm: Refactor return value of LSM hook getprocattr
> >   lsm: Refactor return value of LSM hook key_getsecurity
> >   lsm: Refactor return value of LSM hook audit_rule_match
> >   bpf, lsm: Add disabled BPF LSM hook list
> >   bpf, lsm: Enable BPF LSM prog to read/write return value parameters
> >   bpf, lsm: Add check for BPF LSM return value
> >   bpf: Prevent tail call between progs attached to different hooks
> >   bpf: Fix compare error in function retval_range_within
> >   bpf: Add a special case for bitwise AND on range [-1, 0]
> >   selftests/bpf: Avoid load failure for token_lsm.c
> >   selftests/bpf: Add return value checks for failed tests
> >   selftests/bpf: Add test for lsm tail call
> >   selftests/bpf: Add verifier tests for bpf lsm
>
> I'm not quite sure what happened, but it looks like patches 13/20
> through 20/20 did not hit the mailing lists, see lore link below; did
> you have any mail failures when sending the patchset?  Regardless, can
> you sort this out and resend the patchset?
>
> https://lore.kernel.org/all/20240711111908.3817636-1-xukuohai@huaweicloud.com

Oh wait, it looks like the patchset was split in lore somehow,
nevermind.  The "missing" patches are here:

https://lore.kernel.org/all/20240711113828.3818398-1-xukuohai@huaweicloud.com
Paul Moore July 12, 2024, 9:44 p.m. UTC | #3
On Thu, Jul 11, 2024 at 7:13 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> From: Xu Kuohai <xukuohai@huawei.com>
>
> LSM BPF prog returning a positive number attached to the hook
> file_alloc_security makes kernel panic.
>
> Here is a panic log:
>
> [  441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
> [  441.236748] #PF: supervisor write access in kernel mode
> [  441.237429] #PF: error_code(0x0002) - not-present page
> [  441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
> [  441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
> [  441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
> [  441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
> [  441.241933] RIP: 0010:alloc_file+0x4b/0x190
> [  441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
> [  441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
> [  441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
> [  441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
> [  441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
> [  441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
> [  441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
> [  441.250007] FS:  00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
> [  441.251053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
> [  441.252688] Call Trace:
> [  441.253011]  <TASK>
> [  441.253296]  ? __die+0x24/0x70
> [  441.253702]  ? page_fault_oops+0x15b/0x480
> [  441.254236]  ? fixup_exception+0x26/0x330
> [  441.254750]  ? exc_page_fault+0x6d/0x1c0
> [  441.255257]  ? asm_exc_page_fault+0x26/0x30
> [  441.255792]  ? alloc_file+0x4b/0x190
> [  441.256257]  alloc_file_pseudo+0x9f/0xf0
> [  441.256760]  __anon_inode_getfile+0x87/0x190
> [  441.257311]  ? lock_release+0x14e/0x3f0
> [  441.257808]  bpf_link_prime+0xe8/0x1d0
> [  441.258315]  bpf_tracing_prog_attach+0x311/0x570
> [  441.258916]  ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
> [  441.259605]  __sys_bpf+0x1bb7/0x2dc0
> [  441.260070]  __x64_sys_bpf+0x20/0x30
> [  441.260533]  do_syscall_64+0x72/0x140
> [  441.261004]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [  441.261643] RIP: 0033:0x4b0349
> [  441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
> [  441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> [  441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
> [  441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
> [  441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
> [  441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> [  441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8
>
> This is because the filesystem uses IS_ERR to check if the return value
> is an error code. If it is not, the filesystem takes the return value
> as a file pointer. Since the positive number returned by the BPF prog
> is not a real file pointer, this misinterpretation causes a panic.
>
> Since other LSM modules always return either a negative error code
> or a valid pointer, this specific issue only exists in BPF LSM. The
> proposed solution is to reject LSM BPF progs returning unexpected
> values in the verifier. This patch set adds return value check to
> ensure only BPF progs returning expected values are accepted.
>
> Since each LSM hook has different excepted return values, we need to
> know the expected return values for each individual hook to do the
> check. Earlier versions of the patch set used LSM hook annotations
> to specify the return value range for each hook. Based on Paul's
> suggestion, current version gets rid of such annotations and instead
> converts hook return values to a common pattern: return 0 on success
> and negative error code on failure.
>
> Basically, LSM hooks are divided into two types: hooks that return a
> negative error code and zero or other values, and hooks that do not
> return a negative error code. This patch set converts all hooks of the
> first type and part of the second type to return 0 on success and a
> negative error code on failure (see patches 1-10). For certain hooks,
> like ismaclabel and inode_xattr_skipcap, the hook name already imply
> that returning 0 or 1 is the best choice, so they are not converted.
> There are four unconverted hooks. Except for ismaclabel, which is not
> used by BPF LSM, the other three are specified with a BTF ID list to
> only return 0 or 1.

Thank you for following up on your initial work with this patchset, Xu
Kuohai.  It doesn't look like I'm going to be able to finish my review
by the end of the day today, so expect that a bit later, but so far I
think most of the changes look good and provide a nice improvement :)
Paul Moore July 19, 2024, 2:13 a.m. UTC | #4
On Fri, Jul 12, 2024 at 5:44 PM Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Jul 11, 2024 at 7:13 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> > From: Xu Kuohai <xukuohai@huawei.com>
> >
> > LSM BPF prog returning a positive number attached to the hook
> > file_alloc_security makes kernel panic.
> >
> > Here is a panic log:
> >
> > [  441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
> > [  441.236748] #PF: supervisor write access in kernel mode
> > [  441.237429] #PF: error_code(0x0002) - not-present page
> > [  441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
> > [  441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
> > [  441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
> > [  441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
> > [  441.241933] RIP: 0010:alloc_file+0x4b/0x190
> > [  441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
> > [  441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
> > [  441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
> > [  441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
> > [  441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
> > [  441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
> > [  441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
> > [  441.250007] FS:  00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
> > [  441.251053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
> > [  441.252688] Call Trace:
> > [  441.253011]  <TASK>
> > [  441.253296]  ? __die+0x24/0x70
> > [  441.253702]  ? page_fault_oops+0x15b/0x480
> > [  441.254236]  ? fixup_exception+0x26/0x330
> > [  441.254750]  ? exc_page_fault+0x6d/0x1c0
> > [  441.255257]  ? asm_exc_page_fault+0x26/0x30
> > [  441.255792]  ? alloc_file+0x4b/0x190
> > [  441.256257]  alloc_file_pseudo+0x9f/0xf0
> > [  441.256760]  __anon_inode_getfile+0x87/0x190
> > [  441.257311]  ? lock_release+0x14e/0x3f0
> > [  441.257808]  bpf_link_prime+0xe8/0x1d0
> > [  441.258315]  bpf_tracing_prog_attach+0x311/0x570
> > [  441.258916]  ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
> > [  441.259605]  __sys_bpf+0x1bb7/0x2dc0
> > [  441.260070]  __x64_sys_bpf+0x20/0x30
> > [  441.260533]  do_syscall_64+0x72/0x140
> > [  441.261004]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> > [  441.261643] RIP: 0033:0x4b0349
> > [  441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
> > [  441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > [  441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
> > [  441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
> > [  441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
> > [  441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> > [  441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8
> >
> > This is because the filesystem uses IS_ERR to check if the return value
> > is an error code. If it is not, the filesystem takes the return value
> > as a file pointer. Since the positive number returned by the BPF prog
> > is not a real file pointer, this misinterpretation causes a panic.
> >
> > Since other LSM modules always return either a negative error code
> > or a valid pointer, this specific issue only exists in BPF LSM. The
> > proposed solution is to reject LSM BPF progs returning unexpected
> > values in the verifier. This patch set adds return value check to
> > ensure only BPF progs returning expected values are accepted.
> >
> > Since each LSM hook has different excepted return values, we need to
> > know the expected return values for each individual hook to do the
> > check. Earlier versions of the patch set used LSM hook annotations
> > to specify the return value range for each hook. Based on Paul's
> > suggestion, current version gets rid of such annotations and instead
> > converts hook return values to a common pattern: return 0 on success
> > and negative error code on failure.
> >
> > Basically, LSM hooks are divided into two types: hooks that return a
> > negative error code and zero or other values, and hooks that do not
> > return a negative error code. This patch set converts all hooks of the
> > first type and part of the second type to return 0 on success and a
> > negative error code on failure (see patches 1-10). For certain hooks,
> > like ismaclabel and inode_xattr_skipcap, the hook name already imply
> > that returning 0 or 1 is the best choice, so they are not converted.
> > There are four unconverted hooks. Except for ismaclabel, which is not
> > used by BPF LSM, the other three are specified with a BTF ID list to
> > only return 0 or 1.
>
> Thank you for following up on your initial work with this patchset, Xu
> Kuohai.  It doesn't look like I'm going to be able to finish my review
> by the end of the day today, so expect that a bit later, but so far I
> think most of the changes look good and provide a nice improvement :)

You should have my feedback now, let me know if you have any questions.

One additional comment I might make is that you may either want to
wait until after v6.11-rc1 is released and I've had a chance to rebase
the lsm/{dev,next} branches and merge the patchsets which are
currently queued; there are a few patches queued up which will have an
impact on this work.  While it's an unstable branch, you can take a
peek at those queues patches in the lsm/dev-staging branch.

https://github.com/LinuxSecurityModule/kernel/blob/main/README.md
Xu Kuohai July 19, 2024, 3:55 a.m. UTC | #5
On 7/19/2024 10:13 AM, Paul Moore wrote:
> On Fri, Jul 12, 2024 at 5:44 PM Paul Moore <paul@paul-moore.com> wrote:
>> On Thu, Jul 11, 2024 at 7:13 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>>> From: Xu Kuohai <xukuohai@huawei.com>
>>>
>>> LSM BPF prog returning a positive number attached to the hook
>>> file_alloc_security makes kernel panic.
>>>
>>> Here is a panic log:
>>>
>>> [  441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
>>> [  441.236748] #PF: supervisor write access in kernel mode
>>> [  441.237429] #PF: error_code(0x0002) - not-present page
>>> [  441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
>>> [  441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
>>> [  441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
>>> [  441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
>>> [  441.241933] RIP: 0010:alloc_file+0x4b/0x190
>>> [  441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
>>> [  441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
>>> [  441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
>>> [  441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
>>> [  441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
>>> [  441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
>>> [  441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
>>> [  441.250007] FS:  00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
>>> [  441.251053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
>>> [  441.252688] Call Trace:
>>> [  441.253011]  <TASK>
>>> [  441.253296]  ? __die+0x24/0x70
>>> [  441.253702]  ? page_fault_oops+0x15b/0x480
>>> [  441.254236]  ? fixup_exception+0x26/0x330
>>> [  441.254750]  ? exc_page_fault+0x6d/0x1c0
>>> [  441.255257]  ? asm_exc_page_fault+0x26/0x30
>>> [  441.255792]  ? alloc_file+0x4b/0x190
>>> [  441.256257]  alloc_file_pseudo+0x9f/0xf0
>>> [  441.256760]  __anon_inode_getfile+0x87/0x190
>>> [  441.257311]  ? lock_release+0x14e/0x3f0
>>> [  441.257808]  bpf_link_prime+0xe8/0x1d0
>>> [  441.258315]  bpf_tracing_prog_attach+0x311/0x570
>>> [  441.258916]  ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
>>> [  441.259605]  __sys_bpf+0x1bb7/0x2dc0
>>> [  441.260070]  __x64_sys_bpf+0x20/0x30
>>> [  441.260533]  do_syscall_64+0x72/0x140
>>> [  441.261004]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>> [  441.261643] RIP: 0033:0x4b0349
>>> [  441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
>>> [  441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
>>> [  441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
>>> [  441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
>>> [  441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
>>> [  441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
>>> [  441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8
>>>
>>> This is because the filesystem uses IS_ERR to check if the return value
>>> is an error code. If it is not, the filesystem takes the return value
>>> as a file pointer. Since the positive number returned by the BPF prog
>>> is not a real file pointer, this misinterpretation causes a panic.
>>>
>>> Since other LSM modules always return either a negative error code
>>> or a valid pointer, this specific issue only exists in BPF LSM. The
>>> proposed solution is to reject LSM BPF progs returning unexpected
>>> values in the verifier. This patch set adds return value check to
>>> ensure only BPF progs returning expected values are accepted.
>>>
>>> Since each LSM hook has different excepted return values, we need to
>>> know the expected return values for each individual hook to do the
>>> check. Earlier versions of the patch set used LSM hook annotations
>>> to specify the return value range for each hook. Based on Paul's
>>> suggestion, current version gets rid of such annotations and instead
>>> converts hook return values to a common pattern: return 0 on success
>>> and negative error code on failure.
>>>
>>> Basically, LSM hooks are divided into two types: hooks that return a
>>> negative error code and zero or other values, and hooks that do not
>>> return a negative error code. This patch set converts all hooks of the
>>> first type and part of the second type to return 0 on success and a
>>> negative error code on failure (see patches 1-10). For certain hooks,
>>> like ismaclabel and inode_xattr_skipcap, the hook name already imply
>>> that returning 0 or 1 is the best choice, so they are not converted.
>>> There are four unconverted hooks. Except for ismaclabel, which is not
>>> used by BPF LSM, the other three are specified with a BTF ID list to
>>> only return 0 or 1.
>>
>> Thank you for following up on your initial work with this patchset, Xu
>> Kuohai.  It doesn't look like I'm going to be able to finish my review
>> by the end of the day today, so expect that a bit later, but so far I
>> think most of the changes look good and provide a nice improvement :)
> 
> You should have my feedback now, let me know if you have any questions.
> 
> One additional comment I might make is that you may either want to
> wait until after v6.11-rc1 is released and I've had a chance to rebase
> the lsm/{dev,next} branches and merge the patchsets which are
> currently queued; there are a few patches queued up which will have an
> impact on this work.  While it's an unstable branch, you can take a
> peek at those queues patches in the lsm/dev-staging branch.
> 
> https://github.com/LinuxSecurityModule/kernel/blob/main/README.md
> 

Got it, thanks for your valuable time and feedback! The individual
comment will be replied once I'm sure I understand it or confirmed
the next step.

Additionally, for the next update, I'll split the series into two,
as the refactoring patches and the BPF patches are not closely
related.