From patchwork Fri Jul 21 16:33:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roberto Sassu X-Patchwork-Id: 13322371 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68D7BC0015E for ; Fri, 21 Jul 2023 16:53:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230142AbjGUQxh (ORCPT ); Fri, 21 Jul 2023 12:53:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229492AbjGUQxg (ORCPT ); Fri, 21 Jul 2023 12:53:36 -0400 Received: from frasgout11.his.huawei.com (unknown [14.137.139.23]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 412901FD2; Fri, 21 Jul 2023 09:53:34 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.228]) by frasgout11.his.huawei.com (SkyGuard) with ESMTP id 4R6vwh6X9Kz9xFPw; Sat, 22 Jul 2023 00:22:44 +0800 (CST) Received: from huaweicloud.com (unknown [10.204.63.22]) by APP2 (Coremail) with SMTP id GxC2BwC3hl1bs7pkcDDSBA--.22409S2; Fri, 21 Jul 2023 17:33:39 +0100 (CET) From: Roberto Sassu To: zohar@linux.ibm.com, dmitry.kasatkin@gmail.com, paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com Cc: linux-kernel@vger.kernel.org, linux-integrity@vger.kernel.org, linux-security-module@vger.kernel.org, bpf@vger.kernel.org, jarkko@kernel.org, pbrobinson@gmail.com, zbyszek@in.waw.pl, hch@lst.de, mjg59@srcf.ucam.org, Roberto Sassu Subject: [RFC][PATCH 00/12] integrity: Introduce a digest cache Date: Fri, 21 Jul 2023 18:33:14 +0200 Message-Id: <20230721163326.4106089-1-roberto.sassu@huaweicloud.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 X-CM-TRANSID: GxC2BwC3hl1bs7pkcDDSBA--.22409S2 X-Coremail-Antispam: 1UD129KBjvAXoW3uF45Gr47XFyxKFWkXr45trb_yoW8Xr1fuo Zaqa13Zw4DKry3CF4q9FnrZFsrWanYgw1xJrWvqrW5X3WfXFWUG3Z8Ca17XFy3Zr18Gry7 Cw10q3yUJr4ktr93n29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUYf7kC6x804xWl14x267AKxVW8JVW5JwAFc2x0x2IEx4CE42xK 8VAvwI8IcIk0rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2ocxC64kIII0Yj41l84x0c7CEw4 AK67xGY2AK021l84ACjcxK6xIIjxv20xvE14v26r4j6ryUM28EF7xvwVC0I7IYx2IY6xkF 7I0E14v26r4UJVWxJr1l84ACjcxK6I8E87Iv67AKxVW8JVWxJwA2z4x0Y4vEx4A2jsIEc7 CjxVAFwI0_Gr1j6F4UJwAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAq x4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6x CaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4IIrI8v6xkF7I0E8cxan2IY04v7MxkF 7I0En4kS14v26r4a6rW5MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI 8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AK xVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r4j6ryUMIIF0xvE2Ix0cI 8IcVCY1x0267AKxVW8Jr0_Cr1UMIIF0xvE42xK8VAvwI8IcIk0rVWrJr0_WFyUJwCI42IY 6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsGvfC2Kf nxnUUI43ZEXa7IU8imRUUUUUU== X-CM-SenderInfo: purev21wro2thvvxqx5xdzvxpfor3voofrz/1tbiAQAFBF1jj5DJPgAAsW X-CFilter-Loop: Reflected Precedence: bulk List-ID: From: Roberto Sassu A bit of history first. The original name of this work was IMA Digest Lists, which was somehow considered too invasive. I then moved the code to a separate component named DIGLIM (DIGest Lists Integrity Module), with the purpose of removing the complexity away of IMA, and also add the possibility of using it with other kernel components (e.g. Integrity Policy Enforcement, or IPE). Since it was originally proposed, in 2017, this work grew up a lot thanks to the feedback of my colleagues and external reviewers. It became integrally part of the openEuler distribution since end of 2020. The upstreaming process has been difficult, also due to the fact too many features were included in the proposals. So, I decided to take a step back and find the minimum possible set of features that would make this work meaningful. The really minimum set that would make appraisal work is to parse the RPM header (after signature verification) containing the reference digests and compare the calculated file digest with them. That would be similar to reading the signature from the xattr and verifying the digest. But, maybe that was too minimal. It does not make sense to appraise and parse again the same RPM header for another packaged files. And should not be necessarily tied to appraisal, as it would be useful for measurement too (to make a PCR predictable). Please read the Benefits section here: https://lore.kernel.org/linux-integrity/20210914163401.864635-1-roberto.sassu@huawei.com/ So, this is how the integrity digest cache is born. The main difference with the previous attempts is that there is not a centralized place to store file/metadata digests, but they are stored in the digest cache, attached to the file (the digest list) the digest cache was created from. The link with the files being measured/appraised is the new security.digest_list xattr containing the full path of the digest list they refer to. At the time there is a measurement/appraisal, the digest cache is built by reading (after appraisal) and parsing the digest list (a TLV format and the RPM package format are currently supported). Extracted digests are added to a per-package hash table, sized depending on the number of elements. Lookup should be much faster, as we are not anymore searching in a hash table with 80000 digests, but most of the time with 100 or less. Also, there is no need for locking after creation, the digest cache does not change depending on writes (it would not make sense, since digest lists are signed). Another very important feature is that the digest cache is reclaimable, i.e. it disappears if the inode is evicted from memory. In that case, the digest cache need to be initialized again. The digest cache does not disappear while IMA is using it, by acquiring and releasing a reference to the path structure of the digest list. What about the predictability of PCRs, that some folks are trying to address with the Unified Kernel Image? The concept of digest lists is quite simple: measure the digest list to represent the possible access of a group of files, instead of recording individual file accesses. If digest lists are measured in a deterministic way, the PCR remains predictable despite files are accessed in a different order. We currently don't support xattrs in the initial ram disk, but if we did and the initial ram disk uses the same measured files as the root filesystem, the PCR would still remain predictable. Another important point of the design was to avoid any possible interference with existing IMA measurement and appraisal behavior. This has been achieved in the following way. First, the digest cache needs to be explicitly enabled in the IMA policy through the new policy keyword 'digest_cache=content'. Second, new-style measurements cannot be done on the default IMA PCR, a policy writer must specify a different PCR or the policy will be rejected. Also, the use of the digest cache is incompatible with other appraisal methods, e.g. with xattrs or a modsig. The policy writer cannot specify any of the other methods if it includes the digest_cache directive. Finally, the digest cache does not bypass EVM too. Files matched with the digest caches can only be opened read-only, to prevent updating an unverified HMAC to a valid one. Another incorrect policy combination is forbidden. The use of the digest_cache directive alone does not enable the new behavior. The action for which the digest cache should be used should have also be done on the digest list itself. Otherwise, one cannot explain why there are no measurements (if the digest list was not measured for example). The same applies for appraisal. The last part I wanted to talk about is about the digest list parsers. This was a long debate. In the original proposal, Matthew Garrett and Christoph Hellwig said that adding parsers in the kernel is not scalable and not a good idea in general. While I do agree with them, I'm also thinking what benefits we get if we relax a bit this requirement. If we merge this patch set and the dependency (user asymmetric keys and signatures) today, we are immediately able to have a predictable PCR for measurement, and do appraisal at least of executable code without additional support from the Linux distributions. We would need a small rpm plugin to write/remove RPM headers and their signature to/from the disk as soon as packages are installed/removed. Over the years, I have tried many alternatives to the kernel-based parsers. In the very first version, we supported only one digest list format in the kernel, and injected the digest list to each package at build time. While it works, it is still one way to do that, others don't support it and it is building infrastructure-dependent. I have tried to do the conversion from the RPM format to the kernel format in user space, with the idea that the process doing it could be isolated against an untrusted root. I had a recent discussion with the security folks and they don't seem excited about it. I have fully implemented DIGLIM and PGP keys and signatures support in eBPF. The idea itself of being able to add kernel functionality in a safe way without touching the kernel is very nice, and I would have pursued it more. However, after I found LSM policy bypass and other bugs and after my patches were not accepted by the maintainers, I didn't really feel I could rely on this subsystem. So, back to the original approach for now, maintainers seem to be more incline to accept kernel code, if there is a need for strong isolation. In the future, we might work on better alternative. This patch set depends on: https://lore.kernel.org/linux-integrity/20230720153247.3755856-2-roberto.sassu@huaweicloud.com/ which allows to appraise RPM package headers with the PGP keys of Linux distribution vendors. Patch 1 introduces a new hook to identify the loading of digest lists and consequently appraise them. Patches 2-4 implement the digest cache, and an iterator to prefetch the digest lists to measure them in a deterministic way. Patches 5-6 implement the currently supported digest list formats: tlv and rpm. The tlv format relies on the TLV parser defined in the patch set mentioned above. Patches 7-9 enable the usage of the digest cache in IMA for measurement and appraisal. Patches 10-12 add a tool to manage digest lists. If you are curious to try on an existing system, you first need to build gpg with the patches in the previous patch set, and convert the PGP keys of your Linux distribution to the new user asymmetric key format: $ gpg --conv-kernel >> certs/uasym_keys.bin This embeds the converted keys in the kernel image. Then, enable the following kernel options: CONFIG_INTEGRITY_DIGEST_CACHE=y CONFIG_UASYM_KEYS_SIGS=y CONFIG_UASYM_PRELOAD_PUBLIC_KEYS=y and rebuild the kernel with the patches applied. After boot, build and install the digest list tool in tools/digest-lists, and execute: $ manage_digest_lists -o gen -d /etc/digest_lists -i rpmdb -f rpm You also need the new gpg when you execute this tool, to convert the PGP signatures of the RPM headers to the user asymmetric key format. You may want to add the following additional files in a digest list by creating a file named 'list' with the content: /usr/bin/manage_digest_lists /usr/lib64/libgen-tlv-list.so /usr/lib64/libgen-rpm-list.so /usr/lib64/libparse-rpm-list.so /usr/lib64/libparse-tlv-list.so and, execute: $ manage_digest_lists -i list -L -d /etc/digest_lists -o gen -f tlv You need to sign the created file: scripts/sign-file sha256 certs/signing_key.pem certs/signing_key.pem /etc/digest_lists/tlv-list The final step is to add security.digest_list to each file with: $ manage_digest_lists -i /etc/digest_lists -o add-xattr After that, create the following policy in /etc/ima/ima-policy: dont_measure fsmagic=0x01021994 dont_appraise fsmagic=0x01021994 appraise func=BPRM_CHECK digest_cache=content appraise func=MMAP_CHECK digest_cache=content appraise func=DIGEST_LIST_CHECK appraise_type=imasig|modsig measure func=DIGEST_LIST_CHECK template=ima-modsig pcr=11 measure func=BPRM_CHECK digest_cache=content pcr=11 measure func=MMAP_CHECK digest_cache=content pcr=11 I'm excluding tmpfs for now, we need to deal with memfd. Before loading the policy, you could enable the dynamic debug with: $ echo "file tlv* +p" > /sys/kernel/debug/dynamic_debug/control $ echo "file rpm* +p" > /sys/kernel/debug/dynamic_debug/control $ echo "file digest* +p" > /sys/kernel/debug/dynamic_debug/control Or add the same strings with the dyndbg= option in the kernel command line. Then, just cat the policy to IMA: $ cat /etc/ima/ima-policy > /sys/kernel/security/ima/policy If that worked, you can reboot the system. Systemd will take care of loading the IMA policy at boot. Everything works for me on Fedora 38. You can check the content of the measurement list: $ cat /sys/kernel/security/ima/ascii_runtime_measurements You will see only the measurement of the digest lists, not of the other files. If you try: $ cp -a /bin/cat . $ ./cat That will work. But if you do: $ echo 1 >> cat $ cat -bash: ./cat: Permission denied Execution will be denied, and you will see a new entry in the measurement list (honestly, it should not be there, as access to the file was denied): 11 50b5a68bea0776a84eef6725f17ce474756e51c0 ima-ng sha256:15e1efee080fe54f5d7404af7e913de01671e745ce55215d89f3d6521d3884f0 /root/cat Finally, you could try to evict inodes from memory, to free the digest cache: $ echo 3 > /proc/sys/vm/drop_caches You will see in the kernel logs messages like: [ 313.032536] DIGEST CACHE: Remove digest sha256:102900208eef27b766380135906d431dba87edaa7ec6aa72e6ebd3dd67f3a97b from digest list /etc/digest_lists/rpm-libseccomp-2.5.3-4.fc38.x86_64 Roberto Sassu (12): ima: Introduce hook DIGEST_LIST_CHECK integrity: Introduce a digest cache integrity/digest_cache: Add functions to populate and search integrity/digest_cache: Iterate over digest lists in same dir integrity/digest_cache: Parse tlv digest lists integrity/digest_cache: Parse rpm digest lists ima: Add digest_cache policy keyword ima: Use digest cache for measurement ima: Use digest cache for appraisal tools: Add tool to manage digest lists tools/digest-lists: Add tlv digest list generator and parser tools/digest-lists: Add rpm digest list generator and parser Documentation/ABI/testing/ima_policy | 6 +- MAINTAINERS | 1 + include/linux/kernel_read_file.h | 1 + include/uapi/linux/tlv_digest_list.h | 59 +++ include/uapi/linux/xattr.h | 3 + security/integrity/Kconfig | 12 + security/integrity/Makefile | 4 + security/integrity/digest_cache.c | 430 ++++++++++++++++++ security/integrity/digest_cache.h | 113 +++++ security/integrity/digest_cache_iter.c | 163 +++++++ .../integrity/digest_list_parsers/parsers.h | 15 + security/integrity/digest_list_parsers/rpm.c | 174 +++++++ security/integrity/digest_list_parsers/tlv.c | 188 ++++++++ security/integrity/iint.c | 7 + security/integrity/ima/ima.h | 17 +- security/integrity/ima/ima_api.c | 22 +- security/integrity/ima/ima_appraise.c | 16 +- security/integrity/ima/ima_main.c | 39 +- security/integrity/ima/ima_policy.c | 54 ++- security/integrity/integrity.h | 5 + tools/Makefile | 16 +- tools/digest-lists/.gitignore | 7 + tools/digest-lists/Makefile | 72 +++ tools/digest-lists/common.c | 163 +++++++ tools/digest-lists/common.h | 90 ++++ tools/digest-lists/generators/generators.h | 18 + tools/digest-lists/generators/rpm.c | 257 +++++++++++ tools/digest-lists/generators/tlv.c | 168 +++++++ tools/digest-lists/manage_digest_lists.c | 349 ++++++++++++++ tools/digest-lists/manage_digest_lists.txt | 82 ++++ tools/digest-lists/parsers/parsers.h | 16 + tools/digest-lists/parsers/rpm.c | 169 +++++++ tools/digest-lists/parsers/tlv.c | 195 ++++++++ tools/digest-lists/parsers/tlv_parser.h | 38 ++ 34 files changed, 2942 insertions(+), 27 deletions(-) create mode 100644 include/uapi/linux/tlv_digest_list.h create mode 100644 security/integrity/digest_cache.c create mode 100644 security/integrity/digest_cache.h create mode 100644 security/integrity/digest_cache_iter.c create mode 100644 security/integrity/digest_list_parsers/parsers.h create mode 100644 security/integrity/digest_list_parsers/rpm.c create mode 100644 security/integrity/digest_list_parsers/tlv.c create mode 100644 tools/digest-lists/.gitignore create mode 100644 tools/digest-lists/Makefile create mode 100644 tools/digest-lists/common.c create mode 100644 tools/digest-lists/common.h create mode 100644 tools/digest-lists/generators/generators.h create mode 100644 tools/digest-lists/generators/rpm.c create mode 100644 tools/digest-lists/generators/tlv.c create mode 100644 tools/digest-lists/manage_digest_lists.c create mode 100644 tools/digest-lists/manage_digest_lists.txt create mode 100644 tools/digest-lists/parsers/parsers.h create mode 100644 tools/digest-lists/parsers/rpm.c create mode 100644 tools/digest-lists/parsers/tlv.c create mode 100644 tools/digest-lists/parsers/tlv_parser.h