[RFC,v5,00/11] Integrity Policy Enforcement LSM (IPE)

Message ID 20200728213614.586312-1-deven.desai@linux.microsoft.com (mailing list archive)
Deven Bowers July 28, 2020, 9:36 p.m. UTC
Overview:
------------------------------------

IPE is a Linux Security Module which allows for a configurable
policy to enforce integrity requirements on the whole system. It
attempts to solve the issue of Code Integrity: that any code being
executed (or files being read) are identical to the version that
was built by a trusted source.

IPE is designed for embedded devices with a specific purpose (e.g. a
network firewall device in a data center), where all software and
configuration is built and provisioned by the owner.

Specifically, a system which leverages IPE is not intended for general
purpose computing and does not utilize any software or configuration
built by a third party. An ideal system to leverage IPE has both mutable
and immutable components; however, all binary executable code is immutable.

The scope of IPE is constrained to the OS. It is assumed that platform
firmware verifies the kernel and optionally the root filesystem (e.g.
via U-Boot verified boot). IPE then utilizes LSM hooks to enforce a
flexible, kernel-resident integrity verification policy.

IPE differs from other LSMs which provide integrity checking (for instance,
IMA), as it has no dependency on the filesystem metadata itself. The
attributes that IPE checks are deterministic properties that exist solely
in the kernel. Additionally, IPE provides no additional mechanisms of
verifying these files (e.g. IMA Signatures) - all of the attributes it
uses to verify files come from existing features within the kernel, such
as dm-verity or fsverity.

IPE provides a policy that allows owners of the system to easily specify
integrity requirements and uses dm-verity signatures to simplify the
authentication of allowed objects like authorized code and data.

IPE supports two modes, permissive (similar to SELinux's permissive mode)
and enforce. Permissive mode performs the same checks and logs the same
policy violations as enforce mode, but does not enforce the policy. This
allows users to test policies before enforcing them.

The default mode is enforce, and can be changed via the kernel command line
parameter `ipe.enforce=(0|1)`, or the securityfs node
`/sys/kernel/security/ipe/enforce`. The ability to switch modes can be
compiled out of the LSM by setting the config
CONFIG_SECURITY_IPE_PERMISSIVE_SWITCH to N.

IPE additionally supports success auditing. When enabled, all events
that pass IPE policy and are not blocked will emit an audit event. This
is disabled by default, and can be enabled via the kernel command line
parameter `ipe.success_audit=(0|1)` or the securityfs node
`/sys/kernel/security/ipe/success_audit`.
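
As a rough illustration, the two runtime switches described above could be
inspected with a small helper like the one below. This is only a sketch:
the function name is hypothetical, and it assumes the nodes can be read
back and report the current value, which this letter does not state.

```shell
# Print the state of IPE's runtime switches.
# $1: IPE securityfs directory (defaults to the path used in this letter).
# Assumption: the nodes are readable and return their current value.
ipe_show_state() {
    ipe_dir="${1:-/sys/kernel/security/ipe}"
    for node in enforce success_audit; do
        if [ -r "$ipe_dir/$node" ]; then
            printf '%s=%s\n' "$node" "$(cat "$ipe_dir/$node")"
        else
            # Node missing or unreadable (e.g. switch compiled out).
            printf '%s=unavailable\n' "$node"
        fi
    done
}
```

Calling `ipe_show_state` with no argument would look at the real securityfs
mount; passing a directory makes it easy to try against a scratch tree.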

Policies can be staged and activated at runtime through securityfs.
Please see the Deploying Policies section of this cover letter for
more information.

The IPE LSM is compiled under CONFIG_SECURITY_IPE.

Policy:
------------------------------------

IPE policy is designed to be both forward and backward compatible.
There is one required line, at the top of the policy, indicating the
policy name and the policy version, for instance:

  policy_name="Ex Policy" policy_version=0.0.0

The policy version indicates the current version of the policy (NOT the
policy syntax version). This is used to prevent roll-back of policy to
potentially insecure previous versions of the policy.

The next portion of IPE policy is the rules. Rules are formed by key=value
pairs, known as properties. IPE rules require two properties: "action",
which determines what IPE does when it encounters a match against the
policy, and "op", which determines when that rule should be evaluated.
Thus, a minimal rule is:

  op=EXECUTE action=ALLOW

This example will allow any execution. Additional properties are used to
restrict attributes about the files being evaluated. These properties are
intended to be deterministic attributes that are resident in the kernel.
The properties available for IPE are described in the Properties section
of this cover letter, the repository available in Appendix A, and the
kernel documentation page.

Order does not matter for the rule's properties - they can be listed in
any order. However, it is encouraged to have the "op" property be first
and the "action" property be last, for readability.

Additionally, rules are evaluated top-to-bottom. As a result, any
revocation rules or denies should be placed early in the file to ensure
that these rules are evaluated before a rule with "action=ALLOW" is hit.
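
Because evaluation is top-to-bottom, a DENY rule placed after an ALLOW rule
for the same "op" may never be reached. The helper below is a heuristic
sketch (the function name is hypothetical) that lists such rules so they
can be reviewed. It compares only the "op" and "action" properties, so a
reported rule is not necessarily wrong.

```shell
# List DENY rules that appear after an ALLOW rule for the same "op".
# $1: plain-text policy file.
# Heuristic only: other properties on the rules are not compared.
ipe_late_denies() {
    awk '
        # Skip the policy header and DEFAULT lines.
        $1 == "DEFAULT" || $1 ~ /^policy_name=/ { next }
        {
            op = ""; action = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^op=/)     op = substr($i, 4)
                if ($i ~ /^action=/) action = substr($i, 8)
            }
            # Ignore blank lines and anything that is not a full rule.
            if (op == "" || action == "") next
            if (action == "ALLOW")
                allowed[op] = 1
            else if (action == "DENY" && (op in allowed))
                printf "line %d: %s\n", NR, $0
        }
    ' "$1"
}
```

An empty result means no DENY rule follows an ALLOW rule for the same
operation.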

Any unknown syntax in IPE policy will result in a fatal error when parsing
the policy. User mode can interrogate the kernel to learn which properties
are supported, and their associated versions, through the securityfs node
$securityfs/ipe/property_config, which will return a string of the form:

  key1=version1
  key2=version2
  .
  .
  .
  keyN=versionN

User-mode should correlate these versions with the supported values
identified in the documentation to determine whether a policy should
be accepted by the system.
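
A user-mode pre-check along these lines might look like the sketch below.
It only tests whether each property used in a policy appears as a key in
the listing; correlating the reported versions still requires the
documentation. The function name is hypothetical, and the tokenizing is
deliberately simple (it splits on whitespace and assumes property values
contain no spaces).

```shell
# Print every property used by a policy that the kernel does not list.
# $1: plain-text policy file
# $2: property listing in key=version form (defaults to the securityfs node)
ipe_unknown_props() {
    policy="$1"
    config="${2:-/sys/kernel/security/ipe/property_config}"
    # One token per line, keep the key of each key=value token, then drop
    # the keys that are part of the rule syntax rather than properties.
    tr -s ' \t' '\n\n' < "$policy" |
        sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' |
        grep -v -x -e op -e action -e policy_name -e policy_version |
        sort -u |
        while read -r key; do
            grep -q "^$key=" "$config" || printf '%s\n' "$key"
        done
}
```

No output means every property in the policy is known to the running
kernel.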

Additionally, a DEFAULT operation must be set for all understood
operations within IPE. For policies to remain completely forwards
compatible, it is recommended that users add a "DEFAULT action=ALLOW"
and override the defaults on a per-operation basis.

For more information about the policy syntax, please see Appendix A or
the kernel documentation page.

Early Usermode Protection:
--------------------------

IPE can be provided with a policy at startup to load and enforce.
This is intended to be a minimal policy to get the system to a state
where userland is set up and ready to receive commands, at which
point a policy can be deployed via securityfs. This "boot policy" can be
specified via the config CONFIG_SECURITY_IPE_BOOT_POLICY, which accepts a
path to a plain-text version of the IPE policy to apply. This policy will
be compiled into the kernel. If not specified, IPE will be disabled until
a policy is deployed and activated through securityfs.

Policy Examples:
------------------------------------

Allow all:

  policy_name="Allow All" policy_version=0.0.0
  DEFAULT action=ALLOW

Allow only initial superblock:

  policy_name="Allow All Initial SB" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW

Allow any signed dm-verity volume and the initial superblock:

  policy_name="AllowSignedAndInitial" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Prohibit execution from a specific dm-verity volume:

  policy_name="AllowSignedAndInitial" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=DENY
  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Allow only a specific dm-verity volume:

  policy_name="AllowSignedAndInitial" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=ALLOW

Deploying Policies:
-------------------

Deploying policies is simple. First, sign a plain-text policy with a
certificate that is present in the SYSTEM_TRUSTED_KEYRING of your test
machine. Through openssl, the signing can be done via:

  openssl smime -sign -in "$MY_POLICY" -signer "$MY_CERTIFICATE" \
    -inkey "$MY_PRIVATE_KEY" -binary -outform der -noattr -nodetach \
    -out "$MY_POLICY.p7s"

Then, simply cat the file into the IPE's "new_policy" securityfs node:

  cat "$MY_POLICY.p7s" > /sys/kernel/security/ipe/new_policy

The policy should now be present under the policies/ subdirectory, under
its "policy_name" attribute.
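
To find that subdirectory from a script, the name can be read back out of
the plain-text policy. The sketch below assumes the name is double-quoted
on the first line, as in the examples in this letter; both function names
are hypothetical.

```shell
# Print the policy_name declared on the first line of a plain-text policy.
# Assumption: the name is double-quoted, as in this letter's examples.
ipe_policy_name() {
    sed -n '1s/^[[:space:]]*policy_name="\([^"]*\)".*/\1/p' "$1"
}

# Succeed if the policy in file $1 has a directory under policies/.
# $2: IPE securityfs directory (defaults to the path used in this letter).
ipe_policy_deployed() {
    name=$(ipe_policy_name "$1")
    [ -n "$name" ] && [ -d "${2:-/sys/kernel/security/ipe}/policies/$name" ]
}
```

For example, `MY_POLICY_NAME=$(ipe_policy_name "$MY_POLICY")` would set the
variable used in the commands that follow.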

The policy is now present in the kernel and can be marked as active
by writing 1 to its "active" securityfs node:

  echo -n 1 > "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/active"

This will now mark the policy as active and the system will be enforcing
$MY_POLICY_NAME. At any point the policy can be updated, provided
that the policy version to be deployed is greater than or equal to the
running version (to prevent roll-back attacks). This update can be done
by redirecting the file into the policy's "raw" node, under the policies
subdirectory:

  cat "$MY_UPDATED_POLICY.p7s" > \
    "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/raw"
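
A deployment script could check the version rule before attempting the
update. The sketch below assumes versions are three dot-separated numbers
compared component by component; that matches the 0.0.0 form used in this
letter, but the exact comparison done by the kernel is an assumption here,
and the function name is hypothetical.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Both must have the form major.minor.rev; exits with 2 otherwise.
# Assumption: components are compared numerically, left to right.
ipe_version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (split(a, x, ".") != 3 || split(b, y, ".") != 3) exit 2
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0   # all three components are equal
    }'
}
```

For instance, `ipe_version_ge 0.10.0 0.9.0` succeeds, since the comparison
is numeric rather than lexical.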

Additionally, policies can be deleted via the policy's "delete" securityfs
node. Simply write 1 to that node:

  echo -n 1 > \
    "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/delete"

There are two requirements to delete policies:

1. The policy being deleted must not be the active policy.
2. The policy being deleted must not be the boot policy.

Note that, by default, the "echo" command appends a newline to its
output, and this newline will be treated as part of the value written
to the node. You can suppress the newline via the -n parameter.
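
The difference is easy to see by counting the bytes each command emits.
The `bytes` helper is hypothetical, and `printf` is shown as a portable way
to write a value without a trailing newline.

```shell
# Count the bytes a command would write to a node.
bytes() { wc -c | tr -d ' '; }

echo 1 | bytes          # 2: the digit plus a trailing newline
printf '%s' 1 | bytes   # 1: the digit only
```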

NOTE: If a MAC LSM is enabled, the securityfs commands will require
CAP_MAC_ADMIN. This is due to sysfs supporting fine-grained MAC
attributes, while securityfs at the current moment does not.

Properties:
------------------------------------

This initial patchset introducing IPE adds three properties:
'boot_verified', 'dmverity_signature' and 'dmverity_roothash'.

boot_verified (CONFIG_IPE_BOOT_PROP):
  This property can be utilized for authorization of the first
  super-block that is mounted on the system, where IPE attempts
  to evaluate a file. Typically this is used for systems with
  an initramfs or other initial disk, where this is unmounted before
  the system becomes available, and is not covered by any other property.
  The format of this property is:

    boot_verified=(TRUE|FALSE)

  WARNING: This property will trust any disk where the first IPE
  evaluation occurs. If you do not have a startup disk that is
  unpacked and unmounted (like initramfs), then it will automatically
  trust the root filesystem and potentially overauthorize the entire
  disk.

dmverity_roothash (CONFIG_IPE_DM_VERITY_ROOTHASH):
  This property can be utilized for authorization or revocation of
  specific dm-verity volumes, identified via root hash. It has a
  dependency on the DM_VERITY module. The format of this property is:

    dmverity_roothash=<HashHexDigest>

dmverity_signature (CONFIG_IPE_DM_VERITY_SIGNATURE):
  This property can be utilized for authorization of all dm-verity
  volumes that have a signed roothash that chains to the system
  trusted keyring. It has a dependency on the
  DM_VERITY_VERIFY_ROOTHASH_SIG config. The format of this property is:

    dmverity_signature=(TRUE|FALSE)

Testing:
------------------------------------

A test suite is under development (Appendix C). For manual
instructions:

Enable IPE through the following Kconfigs:

  CONFIG_SECURITY_IPE=y
  CONFIG_SECURITY_IPE_BOOT_POLICY="../AllowAllInitialSB.pol"
  CONFIG_SECURITY_IPE_PERMISSIVE_SWITCH=y
  CONFIG_IPE_BOOT_PROP=y
  CONFIG_IPE_DM_VERITY_ROOTHASH=y
  CONFIG_IPE_DM_VERITY_SIGNATURE=y
  CONFIG_DM_VERITY=y
  CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
  CONFIG_SYSTEM_TRUSTED_KEYRING=y
  CONFIG_SYSTEM_TRUSTED_KEYS="/path/to/my/cert/list.pem"
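
Before building, a quick check of the kernel .config can catch a missed
option. This is a sketch with a hypothetical function name; it covers only
the boolean options listed above and leaves the two path-valued options
(boot policy and trusted keys) to be checked by hand.

```shell
# Report which of the boolean options listed above are not set to "y".
# $1: kernel config file (defaults to .config in the current directory).
# Returns non-zero if anything is missing.
ipe_check_kconfig() {
    config="${1:-.config}"
    missing=0
    for opt in SECURITY_IPE SECURITY_IPE_PERMISSIVE_SWITCH IPE_BOOT_PROP \
               IPE_DM_VERITY_ROOTHASH IPE_DM_VERITY_SIGNATURE DM_VERITY \
               DM_VERITY_VERIFY_ROOTHASH_SIG SYSTEM_TRUSTED_KEYRING; do
        if ! grep -q "^CONFIG_$opt=y\$" "$config"; then
            printf 'missing: CONFIG_%s=y\n' "$opt"
            missing=1
        fi
    done
    return "$missing"
}
```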

Start a test system that boots directly from the filesystem, without
an initrd. I recommend testing in permissive mode until all tests
pass, then switch to enforce to ensure behavior remains identical.

boot_verified:

  If booted correctly, the filesystem mounted on / should be marked as
  boot_verified. Verify by turning on success auditing (write 1 to
  /sys/kernel/security/ipe/success_audit), and run a binary. In the audit
  output, `prop_boot_verified` should be `TRUE`.

  To test denials, mount a temporary filesystem (mount -t tmpfs -o
  size=4M tmp tmp), and copy a binary (e.g. ls) to this new
  filesystem. Disable success auditing and attempt to run the file.
  The file should have an audit event, but be allowed to execute in
  permissive mode, and prop_boot_verified should be FALSE.

dmverity_roothash:

  First, you must create a dm-verity volume. This can be done through
  squashfs-tools and veritysetup (provided by cryptsetup).

  Creating a squashfs volume:

    mksquashfs /path/to/directory/with/executable /path/to/output.squashfs

  Format the volume for use with dm-verity & save the root hash:

    output_rh=$(veritysetup format output.squashfs output.hashtree | \
      tee verity_out.txt | awk "/Root hash/" | \
      sed -E "s/Root hash:\s+//g")

    echo -n $output_rh > output.roothash

  Create two policies, filling in the appropriate fields below:

    Policy 1:

      policy_name="roothash-denial" policy_version=0.0.0
      DEFAULT action=ALLOW
      op=EXECUTE dmverity_roothash=$output_rh action=DENY

    Policy 2:

      policy_name="roothash-allow" policy_version=0.0.0
      DEFAULT action=ALLOW
      DEFAULT op=EXECUTE action=DENY

      op=EXECUTE boot_verified=TRUE action=ALLOW
      op=EXECUTE dmverity_roothash=$output_rh action=ALLOW
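
  Filling in the root hash by hand is error-prone, so the two policies
  above could be generated from the saved hash instead. The sketch below
  writes them with here-documents; the function name and the ".pol" file
  names are hypothetical.

```shell
# Write the two test policies above for a given root hash.
# $1: root hash (e.g. the contents of output.roothash)
# $2: output directory (defaults to the current directory)
write_roothash_policies() {
    rh="$1"
    out_dir="${2:-.}"
    cat > "$out_dir/roothash-denial.pol" <<EOF
policy_name="roothash-denial" policy_version=0.0.0
DEFAULT action=ALLOW
op=EXECUTE dmverity_roothash=$rh action=DENY
EOF
    cat > "$out_dir/roothash-allow.pol" <<EOF
policy_name="roothash-allow" policy_version=0.0.0
DEFAULT action=ALLOW
DEFAULT op=EXECUTE action=DENY

op=EXECUTE boot_verified=TRUE action=ALLOW
op=EXECUTE dmverity_roothash=$rh action=ALLOW
EOF
}
```

  For example: `write_roothash_policies "$(cat output.roothash)"`. Each
  file still needs to be signed before it can be deployed.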

  Deploy each policy, then mark the first, "roothash-denial" as active,
  per the "Deploying Policies" section of this cover letter. Mount the
  dm-verity volume:

    veritysetup open output.squashfs output.hashtree unverified \
      `cat output.roothash`

    mount /dev/mapper/unverified /my/mount/point

  Attempt to execute a binary in the mount point, and it should emit an
  audit event for a match against the rule:
  
    op=EXECUTE dmverity_roothash=$output_rh action=DENY

  To test the second policy, perform the same steps, but this time, enable
  success auditing before running the executable. The success audit event
  should be a match against this rule:

    op=EXECUTE dmverity_roothash=$output_rh action=ALLOW

dmverity_signature:

  Follow the setup steps for dmverity_roothash. Sign the roothash via:

    openssl smime -sign -in "output.roothash" -signer "$MY_CERTIFICATE" \
      -inkey "$MY_PRIVATE_KEY" -binary -outform der -noattr \
      -out "output.p7s"

  Create a policy:

    policy_name="verified" policy_version=0.0.0
    DEFAULT action=DENY

    op=EXECUTE boot_verified=TRUE action=ALLOW
    op=EXECUTE dmverity_signature=TRUE action=ALLOW

  Deploy the policy, and mark as active, per the "Deploying Policies"
  section of this cover letter. Mount the dm-verity volume with
  verification:

    veritysetup open output.squashfs output.hashtree unverified \
      `cat output.roothash` --root-hash-signature=output.p7s

    mount /dev/mapper/unverified /my/mount/point

  NOTE: The --root-hash-signature option was introduced in veritysetup
  2.3.0.

  Turn on success auditing and attempt to execute a binary in the mount
  point, and it should emit an audit event for a match against the rule:

    op=EXECUTE dmverity_signature=TRUE action=ALLOW

  To test denials, mount the dm-verity volume the same way as the
  "dmverity_roothash" section, and attempt to execute a binary. Failure
  should occur.

Documentation:
------------------------------------

Full documentation is available on GitHub in IPE's master repository
(Appendix A). This is intended to be an exhaustive source of documentation
around IPE.

Additionally, there is higher level documentation in the admin-guide.

Technical diagrams are available here:

  http://microsoft.github.io/ipe/technical/diagrams/

Known Gaps:
------------------------------------

IPE has two known gaps:

1. IPE cannot verify the integrity of anonymous executable memory, such as
  the trampolines created by gcc closures and libffi, or JIT'd code.
  Unfortunately, as this is dynamically generated code, there is no way for
  IPE to detect that this code has not been tampered with in transition
  from where it was built, to where it is running. As a result, IPE is
  incapable of tackling this problem for dynamically generated code.
  However, there is a patch series being prepared that addresses this
  problem for libffi and gcc closures by implementing a safer kernel
  trampoline API.

2. IPE cannot verify the integrity of interpreted languages' programs when
  these scripts are invoked via `<interpreter> <file>`. This is because of
  the way interpreters execute these files: the scripts themselves are not
  evaluated as executable code through one of IPE's hooks. Interpreters
  can be enlightened to the usage of IPE by trying to mmap a file into
  executable memory (+X) after opening the file, and responding to the
  error code appropriately. This also applies to included files, or high
  value files, such as configuration files of critical system components.
  We plan to address this specific gap within IPE. For more information
  on how we plan to address it, please see the Future Development
  section, below.

Future Development:
------------------------------------

Support for filtering signatures by specific certificates. In this case,
our "dmverity_signature" (or a separate property) can be set to a
specific certificate declared in IPE's policy, allowing for more
controlled use-cases determined by a user's PKI structure.

Support for integrity verification for general file reads. This addresses
the script interpreter issue indicated in the "Known Gaps" section, as
these script files are typically opened with O_RDONLY. We are evaluating
whether to do this by comparing the original userland filepath passed into
the open syscall, thereby allowing existing callers to take advantage
without any code changes; the alternate design is to extend the new
openat2(2) syscall with a new flag, tentatively called "O_VERIFY". While
the second option requires a code change for all the interpreters,
frameworks and languages that wish to leverage it, it is a wholly cleaner
implementation in the kernel. For interpreters specifically, the O_MAYEXEC
patch series published by Mickaël Salaün [1] is a similar implementation
to the O_VERIFY idea described above.

Onboarding IPE's test suite to KernelCI. Currently we are developing a
test suite in the same vein as SELinux's test suite. Once development
of the test suite is complete, and provided IPE is accepted, we intend
to onboard this test suite onto KernelCI.

Hardened resistance against roll-back attacks. Currently there exists a
window of opportunity between user-mode setup and the user-policy being
deployed, where a prior user-policy can be loaded, that is potentially
insecure. However, with a kernel update, you can revise the boot policy's
version to be the same version as the latest policy, closing this window.
In the future, I would like to close this window of opportunity without
a kernel update, using some persistent storage mechanism.

Open Issues:
------------

For linux-audit/integrity folks:
1. Introduction of new audit definitions in the kernel integrity range - is
  this preferred, as opposed to reusing existing IMA definitions?

TODOs:
------

linux-audit changes to support the new audit events.


Appendix:
------------------------------------

A. IPE Github Repository: https://github.com/microsoft/ipe
   Hosted Documentation: https://microsoft.github.io/ipe
B. IPE Users' Guide: Documentation/admin-guide/LSM/ipe.rst
C. IPE Test Suite: *TBA* (under development)

References:
------------------------------------

1. https://lore.kernel.org/linux-integrity/20200505153156.925111-1-mic@digikod.net/

Changelog:
------------------------------------

v1: Introduced

v2:
  Split the second patch of the previous series into two.
  Minor corrections in the cover-letter and documentation
  comments regarding CAP_MAC_ADMIN checks in IPE.

v3:
  Address various comments by Jann Horn. Highlights:
    Switch various audit allocators to GFP_KERNEL.
    Utilize rcu_access_pointer() in various locations.
    Strip out the caching system for properties
    Strip comments from headers
    Move functions around in patches
    Remove kernel command line parameters
    Reconcile the race condition on the delete node for policy by
      expanding the policy critical section.

  Address a few comments by Jonathan Corbet around the documentation
    pages for IPE.

  Fix an issue with the initialization of IPE policy with a "-0"
    version, caused by not initializing the hlist entries before
    freeing.

v4:
  Address a concern around IPE's behavior with unknown syntax.
    Specifically, make any unknown syntax a fatal error instead of a
    warning, as suggested by Mickaël Salaün.
  Introduce a new securityfs node, $securityfs/ipe/property_config,
    which provides a listing of what properties are enabled by the
    kernel and their versions. This allows usermode to predict what
    policies should be allowed.
  Strip some comments from c files that I missed.
  Clarify some documentation comments around 'boot_verified'.
    While this currently does not functionally change the property
    itself, the distinction is important when IPE can enforce verified
    reads. Additionally, 'KERNEL_READ' was omitted from the documentation.
    This has been corrected.
  Change SecurityFS and SHA1 to a reverse dependency.
  Update the cover-letter with the updated behavior of unknown syntax.
  Remove all sysctls, making an equivalent function in securityfs.
  Rework the active/delete mechanism to be a node under the policy in
    $securityfs/ipe/policies.
  The kernel command line parameters ipe.enforce and ipe.success_audit
    have returned as this functionality is no longer exposed through
    sysfs.

v5:
  Correct some grammatical errors reported by Randy Dunlap.
  Fix some warnings reported by kernel test bot.
  Change convention around security_bdev_setsecurity. -ENOSYS
    is now expected if an LSM does not implement a particular @name,
    as suggested by Casey Schaufler.
  Minor string corrections related to the move from sysfs to securityfs
  Correct a spelling of an #ifdef for the permissive argument.
  Add the kernel parameters re-added to the documentation.Integrity Policy Enforcement LSM (IPE)

Overview:
------------------------------------

IPE is a Linux Security Module which allows for a configurable
policy to enforce integrity requirements on the whole system. It
attempts to solve the issue of Code Integrity: that any code being
executed (or files being read), are identical to the version that
was built by a trusted source.

The type of system for which IPE is designed for use is an embedded device
with a specific purpose (e.g. network firewall device in a data center),
where all software and configuration is built and provisioned by the owner.

Specifically, a system which leverages IPE is not intended for general
purpose computing and does not utilize any software or configuration
built by a third party. An ideal system to leverage IPE has both mutable
and immutable components, however, all binary executable code is immutable.

The scope of IPE is constrained to the OS. It is assumed that platform
firmware verifies the the kernel and optionally the root filesystem (e.g.
via U-Boot verified boot). IPE then utilizes LSM hooks to enforce a
flexible, kernel-resident integrity verification policy.

IPE differs from other LSMs which provide integrity checking (for instance,
IMA), as it has no dependency on the filesystem metadata itself. The
attributes that IPE checks are deterministic properties that exist solely
in the kernel. Additionally, IPE provides no additional mechanisms of
verifying these files (e.g. IMA Signatures) - all of the attributes of
verifying files are existing features within the kernel, such as dm-verity
or fsverity.

IPE provides a policy that allows owners of the system to easily specify
integrity requirements and uses dm-verity signatures to simplify the
authentication of allowed objects like authorized code and data.

IPE supports two modes, permissive (similar to SELinux's permissive mode)
and enforce. Permissive mode performs the same checks, and logs policy
violations as enforce mode, but will not enforce the policy. This allows
users to test policies before enforcing them.

The default mode is enforce, and can be changed via the kernel commandline
parameter `ipe.enforce=(0|1)`, or the securityfs node
`/sys/kernel/security/ipe/enforce`. The ability to switch modes can be
compiled out of the LSM via setting the config
CONFIG_SECURITY_IPE_PERMISSIVE_SWITCH to N.

IPE additionally supports success auditing. When enabled, all events
that pass IPE policy and are not blocked will emit an audit event. This
is disabled by default, and can be enabled via the kernel commandline
`ipe.success_audit=(0|1)` or the securityfs node
`/sys/kernel/security/ipe/success_audit`.

Policies can be staged at runtime through securityfs and activated through
sysfs. Please see the Deploying Policies section of this cover letter for
more information.

The IPE LSM is compiled under CONFIG_SECURITY_IPE.

Policy:
------------------------------------

IPE policy is designed to be both forward compatible and backwards
compatible. There is one required line, at the top of the policy,
indicating the policy name, and the policy version, for instance:

  policy_name="Ex Policy" policy_version=0.0.0

The policy version indicates the current version of the policy (NOT the
policy syntax version). This is used to prevent roll-back of policy to
potentially insecure previous versions of the policy.

The next portion of IPE policy, are rules. Rules are formed by key=value
pairs, known as properties. IPE rules require two properties: "action",
which determines what IPE does when it encounters a match against the
policy, and "op", which determines when that rule should be evaluated.
Thus, a minimal rule is:

  op=EXECUTE action=ALLOW

This example will allow any execution. Additional properties are used to
restrict attributes about the files being evaluated. These properties are
intended to be deterministic attributes that are resident in the kernel.
Available properties for IPE described in the properties section of this
cover-letter, the repository available in Appendix A, and the kernel
documentation page.

Order does not matter for the rule's properties - they can be listed in
any order, however it is encouraged to have the "op" property be first,
and the "action" property be last, for readability.

Additionally, rules are evaluated top-to-bottom. As a result, any
revocation rules, or denies should be placed early in the file to ensure
that these rules are evaluated before a rule with "action=ALLOW" is hit.

Any unknown syntax in IPE policy will result in a fatal error to parse
the policy. User mode can interrogate the kernel to understand what
properties and the associated versions through the securityfs node,
$securityfs/ipe/property_config, which will return a string of form:

  key1=version1
  key2=version2
  .
  .
  .
  keyN=versionN

User-mode should correlate these versions with the supported values
identified in the documentation to determine whether a policy should
be accepted by the system.

Additionally, a DEFAULT operation must be set for all understood
operations within IPE. For policies to remain completely forwards
compatible, it is recommended that users add a "DEFAULT action=ALLOW"
and override the defaults on a per-operation basis.

For more information about the policy syntax, please see Appendix A or
the kernel documentation page.

Early Usermode Protection:
--------------------------

IPE can be provided with a policy at startup to load and enforce.
This is intended to be a minimal policy to get the system to a state
where userland is setup and ready to receive commands, at which
point a policy can be deployed via securityfs. This "boot policy" can be
specified via the config, SECURITY_IPE_BOOT_POLICY, which accepts a path
to a plain-text version of the IPE policy to apply. This policy will be
compiled into the kernel. If not specified, IPE will be disabled until a
policy is deployed and activated through the method above.

Policy Examples:
------------------------------------

Allow all:

  policy_name="Allow All" policy_version=0.0.0
  DEFAULT action=ALLOW

Allow only initial superblock:

  policy_name="Allow All Initial SB" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW

Allow any signed dm-verity volume and the initial superblock:

  policy_name="AllowSignedAndInitial" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Prohibit execution from a specific dm-verity volume:

  policy_name="AllowSignedAndInitial" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=DENY
  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Allow only a specific dm-verity volume:

  policy_name="AllowSignedAndInitial" policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=ALLOW

Deploying Policies:
-------------------

Deploying policies is simple. First sign a plain text policy, with a
certificate that is present in the SYSTEM_TRUSTED_KEYRING of your test
machine. Through openssl, the signing can be done via:

  openssl smime -sign -in "$MY_POLICY" -signer "$MY_CERTIFICATE" \
    -inkey "$MY_PRIVATE_KEY" -binary -outform der -noattr -nodetach \
    -out "$MY_POLICY.p7s"

Then, simply cat the file into the IPE's "new_policy" securityfs node:

  cat "$MY_POLICY.p7s" > /sys/kernel/security/ipe/new_policy

The policy should now be present under the policies/ subdirectory, under
its "policy_name" attribute.

The policy is now present in the kernel and can be marked as active,
via the sysctl "ipe.active_policy":

  echo -n 1 > "/sys/kernel/security/ipe/$MY_POLICY_NAME/active"

This will now mark the policy as active and the system will be enforcing
$MY_POLICY_NAME. At any point the policy can be updated on the provision
that the policy version to be deployed is greater than or equal to the
running version (to prevent roll-back attacks). This update can be done
by redirecting the file into the policy's "raw" node, under the policies
subdirectory:

  cat "$MY_UPDATED_POLICY.p7s" > \
    "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/raw"

Additionally, policies can be deleted via the "del_policy" securityfs
node. Simply write the name of the policy to be deleted to that node:

  echo -n 1 >
    "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/delete"

There are two requirements to delete policies:

1. The policy being deleted must not be the active policy.
2. The policy being deleted must not be the boot policy.

It's important to know above that the "echo" command will add a newline
to the end of the input, and this will be considered as part of the
filename. You can remove the newline via the -n parameter.

NOTE: If a MAC LSM is enabled, the securityfs commands will require
CAP_MAC_ADMIN. This is due to sysfs supporting fine-grained MAC
attributes, while securityfs at the current moment does not.

Properties:
------------------------------------

This initial patchset introducing IPE adds three properties:
'boot_verified', 'dmverity_signature' and 'dmverity_roothash'.

boot_verified (CONFIG_IPE_BOOT_PROP):
  This property can be utilized for authorization of the first
  super-block that is mounted on the system, where IPE attempts
  to evaluate a file. Typically this is used for systems with
  an initramfs or other initial disk, where this is unmounted before
  the system becomes available, and is not covered by any other property.
  The format of this property is:

    boot_verified=(TRUE|FALSE)

  WARNING: This property will trust any disk where the first IPE
  evaluation occurs. If you do not have a startup disk that is
  unpacked and unmounted (like initramfs), then it will automatically
  trust the root filesystem and potentially overauthorize the entire
  disk.

dmverity_roothash (CONFIG_IPE_DM_VERITY_ROOTHASH):
  This property can be utilized for authorization or revocation of
  specific dm-verity volumes, identified via root hash. It has a
  dependency on the DM_VERITY module. The format of this property is:

    dmverity_roothash=<HashHexDigest>

dmverity_signature (CONFIG_IPE_DM_VERITY_SIGNATURE):
  This property can be utilized for authorization of all dm-verity
  volumes that have a signed roothash that chains to the system
  trusted keyring. It has a dependency on the
  DM_VERITY_VERIFY_ROOTHASH_SIG config. The format of this property is:

    dmverity_signature=(TRUE|FALSE)

Testing:
------------------------------------

A test suite is under development (Appendix C). To test manually:

Enable IPE through the following Kconfigs:

  CONFIG_SECURITY_IPE=y
  CONFIG_SECURITY_IPE_BOOT_POLICY="../AllowAllInitialSB.pol"
  CONFIG_SECURITY_IPE_PERMISSIVE_SWITCH=y
  CONFIG_IPE_BOOT_PROP=y
  CONFIG_IPE_DM_VERITY_ROOTHASH=y
  CONFIG_IPE_DM_VERITY_SIGNATURE=y
  CONFIG_DM_VERITY=y
  CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
  CONFIG_SYSTEM_TRUSTED_KEYRING=y
  CONFIG_SYSTEM_TRUSTED_KEYS="/path/to/my/cert/list.pem"
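
Before booting the test kernel, the boolean options above can be checked
against a kernel configuration file. A minimal sketch: the function name
check_ipe_config is hypothetical, it only accepts "=y" (as listed above),
and it does not check the two string-valued options (the boot policy
path and the trusted keys path):

```shell
#!/bin/sh
# check_ipe_config FILE: print each boolean option from the list above that
# is not set to "y" in the .config-style FILE. Returns non-zero if any is
# missing. Hypothetical helper, not shipped with IPE.
check_ipe_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_SECURITY_IPE CONFIG_SECURITY_IPE_PERMISSIVE_SWITCH \
               CONFIG_IPE_BOOT_PROP CONFIG_IPE_DM_VERITY_ROOTHASH \
               CONFIG_IPE_DM_VERITY_SIGNATURE CONFIG_DM_VERITY \
               CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG \
               CONFIG_SYSTEM_TRUSTED_KEYRING; do
        # Anchor both ends so CONFIG_DM_VERITY does not match longer names.
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_ipe_config .config` could be run from the top of the
build tree after configuring the kernel.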

Start a test system that boots directly from the filesystem, without
an initrd. I recommend testing in permissive mode until all tests
pass, then switching to enforce mode to ensure behavior remains identical.

boot_verified:

  If booted correctly, the filesystem mounted on / should be marked as
  boot_verified. Verify by turning on success auditing (sysctl
  ipe.success_audit=1), and run a binary. In the audit output,
  `prop_boot_verified` should be `TRUE`.

  To test denials, mount a temporary filesystem (mount -t tmpfs -o
  size=4M tmp tmp), and copy a binary (e.g. ls) to this new
  filesystem. Disable success auditing and attempt to run the file.
  The file should have an audit event, but be allowed to execute in
  permissive mode, and prop_boot_verified should be FALSE.

dmverity_roothash:

  First, you must create a dm-verity volume. This can be done through
  squashfs-tools and veritysetup (provided by cryptsetup).

  Creating a squashfs volume:

    mksquashfs /path/to/directory/with/executable /path/to/output.squashfs

  Format the volume for use with dm-verity & save the root hash:

    output_rh=$(veritysetup format output.squashfs output.hashtree | \
      tee verity_out.txt | awk "/Root hash/" | \
      sed -E "s/Root hash:\s+//g")

    echo -n $output_rh > output.roothash
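
  The awk and sed stages above can also be folded into a single awk
  filter. A sketch, assuming veritysetup prints a line of the form
  "Root hash: <hex>" (the same form the expressions above rely on); the
  function name extract_root_hash is hypothetical:

```shell
#!/bin/sh
# extract_root_hash: read `veritysetup format` output on stdin and print the
# last whitespace-separated field of the "Root hash:" line.
extract_root_hash() {
    awk '/^Root hash:/ { print $NF }'
}
```

  It would be used in place of the awk and sed stages, for example
  output_rh=$(veritysetup format output.squashfs output.hashtree |
  tee verity_out.txt | extract_root_hash). The command substitution
  drops the trailing newline, so the "echo -n" step stays as it is.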

  Create two policies, filling in the appropriate fields below:

    Policy 1:

      policy_name="roothash-denial" policy_version=0.0.0
      DEFAULT action=ALLOW
      op=EXECUTE dmverity_roothash=$output_rh action=DENY

    Policy 2:

      policy_name="roothash-allow" policy_version=0.0.0
      DEFAULT action=ALLOW
      DEFAULT op=EXECUTE action=DENY

      op=EXECUTE boot_verified=TRUE action=ALLOW
      op=EXECUTE dmverity_roothash=$output_rh action=ALLOW
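
  Filling in the root hash by hand is error-prone, so the two policies
  can be generated from the saved value. A minimal sketch; the function
  name and the ".pol" file names are arbitrary choices for illustration,
  and the policy text is copied from the two policies above:

```shell
#!/bin/sh
# write_roothash_policies DIR ROOTHASH: write both test policies into DIR,
# substituting ROOTHASH for the dmverity_roothash value.
write_roothash_policies() {
    out_dir=$1
    root_hash=$2

    cat > "$out_dir/roothash-denial.pol" <<EOF
policy_name="roothash-denial" policy_version=0.0.0
DEFAULT action=ALLOW
op=EXECUTE dmverity_roothash=$root_hash action=DENY
EOF

    cat > "$out_dir/roothash-allow.pol" <<EOF
policy_name="roothash-allow" policy_version=0.0.0
DEFAULT action=ALLOW
DEFAULT op=EXECUTE action=DENY

op=EXECUTE boot_verified=TRUE action=ALLOW
op=EXECUTE dmverity_roothash=$root_hash action=ALLOW
EOF
}
```

  For example: write_roothash_policies . "$(cat output.roothash)". The
  generated files still have to be deployed as described in the
  "Deploying Policies" section.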

  Deploy each policy, then mark the first, "roothash-denial", as active,
  per the "Deploying Policies" section of this cover letter. Mount the
  dm-verity volume:

    veritysetup open output.squashfs output.hashtree unverified \
      `cat output.roothash`

    mount /dev/mapper/unverified /my/mount/point

  Attempt to execute a binary in the mount point, and it should emit an
  audit event for a match against the rule:
  
    op=EXECUTE dmverity_roothash=$output_rh action=DENY

  To test the second policy, perform the same steps, but this time, enable
  success auditing before running the executable. The success audit event
  should be a match against this rule:

    op=EXECUTE dmverity_roothash=$output_rh action=ALLOW

dmverity_signature:

  Follow the setup steps for dmverity_roothash. Sign the roothash via:

    openssl smime -sign -in "output.roothash" -signer "$MY_CERTIFICATE" \
      -inkey "$MY_PRIVATE_KEY" -binary -outform der -noattr \
      -out "output.p7s"
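
  The root hash file was written with "echo -n" in the earlier step. On
  the assumption that the signature has to cover exactly the bytes of
  the root hash, a stray trailing newline in output.roothash would
  presumably yield a signature that does not verify, so it may be worth
  checking the file before signing. A sketch with a hypothetical helper:

```shell
#!/bin/sh
# ends_with_newline FILE: succeed if the last byte of FILE is a newline.
# An empty file has no last byte, so the check fails for it.
ends_with_newline() {
    [ "$(tail -c 1 "$1" | wc -l | tr -d ' ')" -eq 1 ]
}
```

  For example: ends_with_newline output.roothash && echo "re-create
  output.roothash with echo -n before signing".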

  Create a policy:

    policy_name="verified" policy_version=0.0.0
    DEFAULT action=DENY

    op=EXECUTE boot_verified=TRUE action=ALLOW
    op=EXECUTE dmverity_signature=TRUE action=ALLOW

  Deploy the policy, and mark as active, per the "Deploying Policies"
  section of this cover letter. Mount the dm-verity volume with
  verification:

    veritysetup open output.squashfs output.hashtree unverified \
      `cat output.roothash` --root-hash-signature=output.p7s

    mount /dev/mapper/unverified /my/mount/point

  NOTE: The --root-hash-signature option was introduced in veritysetup
  2.3.0

  Turn on success auditing and attempt to execute a binary in the mount
  point, and it should emit an audit event for a match against the rule:

    op=EXECUTE dmverity_signature=TRUE action=ALLOW

  To test denials, mount the dm-verity volume the same way as the
  "dmverity_roothash" section, and attempt to execute a binary. Failure
  should occur.

Documentation:
------------------------------------

Full documentation is available on GitHub in IPE's master repository
(Appendix A). This is intended to be an exhaustive source of documentation
around IPE.

Additionally, there is higher level documentation in the admin-guide.

Technical diagrams are available here:

  http://microsoft.github.io/ipe/technical/diagrams/

Known Gaps:
------------------------------------

IPE has two known gaps:

1. IPE cannot verify the integrity of anonymous executable memory, such as
  the trampolines created by gcc closures and libffi, or JIT'd code.
  Unfortunately, as this is dynamically generated code, there is no way for
  IPE to detect that this code has not been tampered with in transition
  from where it was built to where it is running. As a result, IPE is
  incapable of tackling this problem for dynamically generated code.
  However, there is a patch series being prepared that addresses this
  problem for libffi and gcc closures by implementing a safer kernel
  trampoline API.

2. IPE cannot verify the integrity of interpreted languages' programs when
  these scripts are invoked via `<interpreter> <file>`. This is because of
  the way interpreters execute these files: the scripts themselves are not
  evaluated as executable code through one of IPE's hooks. Interpreters
  can be enlightened to the usage of IPE by trying to mmap a file into
  executable memory (+X) after opening the file, and responding to the
  error code appropriately. This also applies to included files, or high
  value files, such as configuration files of critical system components.
  This specific gap is planned on being addressed within IPE. For more
  information on how we plan to address this gap, please see the Future
  Development section, below.

Future Development:
------------------------------------

Support for filtering signatures by specific certificates. In this case,
our "dmverity_signature" (or a separate property) can be set to a
specific certificate declared in IPE's policy, allowing for more
controlled use-cases determined by a user's PKI structure.

Support for integrity verification for general file reads. This addresses
the script interpreter issue indicated in the "Known Gaps" section, as
these script files are typically opened with O_RDONLY. We are evaluating
two designs. The first compares the original userland filepath passed into
the open syscall, thereby allowing existing callers to take advantage
without any code changes. The alternate design is to extend the new
openat2(2) syscall with a new flag, tentatively called "O_VERIFY". While
the second option requires a code change for all the interpreters,
frameworks and languages that wish to leverage it, it is a wholly cleaner
implementation in the kernel. For interpreters specifically, the O_MAYEXEC
patch series published by Mickaël Salaün[1] is a similar implementation
to the O_VERIFY idea described above.

Onboarding IPE's test suite to KernelCI. Currently we are developing a
test suite in the same vein as SELinux's test suite. Once development
of the test suite is complete, and provided IPE is accepted, we intend
to onboard this test suite onto KernelCI.

Hardened resistance against roll-back attacks. Currently there exists a
window of opportunity between user-mode setup and the user-policy being
deployed, where a prior, potentially insecure user-policy can be
loaded. However, with a kernel update, you can revise the boot policy's
version to be the same version as the latest policy, closing this window.
In the future, I would like to close this window of opportunity without
a kernel update, using some persistent storage mechanism.

Open Issues:
------------

For linux-audit/integrity folks:
1. Introduction of new audit definitions in the kernel integrity range - is
  this preferred, as opposed to reusing existing IMA definitions?

TODOs:
------

linux-audit changes to support the new audit events.


Appendix:
------------------------------------

A. IPE GitHub Repository: https://github.com/microsoft/ipe
   Hosted Documentation: https://microsoft.github.io/ipe
B. IPE Users' Guide: Documentation/admin-guide/LSM/ipe.rst
C. IPE Test Suite: *TBA* (under development)

References:
------------------------------------

1. https://lore.kernel.org/linux-integrity/20200505153156.925111-1-mic@digikod.net/

Changelog:
------------------------------------

v1: Introduced

v2:
  Split the second patch of the previous series into two.
  Minor corrections in the cover-letter and documentation
  comments regarding CAP_MAC_ADMIN checks in IPE.

v3:
  Address various comments by Jann Horn. Highlights:
    Switch various audit allocators to GFP_KERNEL.
    Utilize rcu_access_pointer() in various locations.
    Strip out the caching system for properties
    Strip comments from headers
    Move functions around in patches
    Remove kernel command line parameters
    Reconcile the race condition on the delete node for policy by
      expanding the policy critical section.

  Address a few comments by Jonathan Corbet around the documentation
    pages for IPE.

  Fix an issue with the initialization of IPE policy with a "-0"
    version, caused by not initializing the hlist entries before
    freeing.

v4:
  Address a concern around IPE's behavior with unknown syntax.
    Specifically, make any unknown syntax a fatal error instead of a
    warning, as suggested by Mickaël Salaün.
  Introduce a new securityfs node, $securityfs/ipe/property_config,
    which provides a listing of what properties are enabled by the
    kernel and their versions. This allows usermode to predict what
    policies should be allowed.
  Strip some comments from c files that I missed.
  Clarify some documentation comments around 'boot_verified'.
    While this currently does not functionally change the property
    itself, the distinction is important when IPE can enforce verified
    reads. Additionally, 'KERNEL_READ' was omitted from the documentation.
    This has been corrected.
  Change SecurityFS and SHA1 to a reverse dependency.
  Update the cover-letter with the updated behavior of unknown syntax.
  Remove all sysctls, making an equivalent function in securityfs.
  Rework the active/delete mechanism to be a node under the policy in
    $securityfs/ipe/policies.
  The kernel command line parameters ipe.enforce and ipe.success_audit
    have returned as this functionality is no longer exposed through
    sysfs.

v5:
  Correct some grammatical errors reported by Randy Dunlap.
  Fix some warnings reported by kernel test bot.
  Change convention around security_bdev_setsecurity. -ENOSYS
    is now expected if an LSM does not implement a particular @name,
    as suggested by Casey Schaufler.
  Minor string corrections related to the move from sysfs to securityfs
  Correct a spelling of an #ifdef for the permissive argument.
  Add the re-added kernel parameters to the documentation.
  Fix a minor bug where the mode being audited on permissive switch
    was the original mode, not the mode being swapped to.
  Cleanup doc comments, fix some whitespace alignment issues.

Deven Bowers (11):
  scripts: add ipe tooling to generate boot policy
  security: add ipe lsm evaluation loop and audit system
  security: add ipe lsm policy parser and policy loading
  ipe: add property for trust of boot volume
  fs: add security blob and hooks for block_device
  dm-verity: move signature check after tree validation
  dm-verity: add bdev_setsecurity hook for dm-verity signature
  ipe: add property for signed dmverity volumes
  dm-verity: add bdev_setsecurity hook for root-hash
  documentation: add ipe documentation
  cleanup: uapi/linux/audit.h

 Documentation/admin-guide/LSM/index.rst       |    1 +
 Documentation/admin-guide/LSM/ipe.rst         |  508 +++++++
 .../admin-guide/kernel-parameters.txt         |   12 +
 MAINTAINERS                                   |    8 +
 drivers/md/dm-verity-target.c                 |   52 +-
 drivers/md/dm-verity-verify-sig.c             |  147 +-
 drivers/md/dm-verity-verify-sig.h             |   24 +-
 drivers/md/dm-verity.h                        |    2 +-
 fs/block_dev.c                                |    8 +
 include/linux/device-mapper.h                 |    3 +
 include/linux/fs.h                            |    1 +
 include/linux/lsm_hook_defs.h                 |    5 +
 include/linux/lsm_hooks.h                     |   12 +
 include/linux/security.h                      |   22 +
 include/uapi/linux/audit.h                    |   36 +-
 scripts/Makefile                              |    1 +
 scripts/ipe/Makefile                          |    2 +
 scripts/ipe/polgen/.gitignore                 |    1 +
 scripts/ipe/polgen/Makefile                   |    7 +
 scripts/ipe/polgen/polgen.c                   |  136 ++
 security/Kconfig                              |   12 +-
 security/Makefile                             |    2 +
 security/ipe/.gitignore                       |    2 +
 security/ipe/Kconfig                          |   48 +
 security/ipe/Makefile                         |   33 +
 security/ipe/ipe-audit.c                      |  303 ++++
 security/ipe/ipe-audit.h                      |   24 +
 security/ipe/ipe-blobs.c                      |   95 ++
 security/ipe/ipe-blobs.h                      |   18 +
 security/ipe/ipe-engine.c                     |  213 +++
 security/ipe/ipe-engine.h                     |   49 +
 security/ipe/ipe-hooks.c                      |  169 +++
 security/ipe/ipe-hooks.h                      |   70 +
 security/ipe/ipe-parse.c                      |  889 +++++++++++
 security/ipe/ipe-parse.h                      |   17 +
 security/ipe/ipe-pin.c                        |   93 ++
 security/ipe/ipe-pin.h                        |   36 +
 security/ipe/ipe-policy.c                     |  149 ++
 security/ipe/ipe-policy.h                     |   69 +
 security/ipe/ipe-prop-internal.h              |   49 +
 security/ipe/ipe-property.c                   |  143 ++
 security/ipe/ipe-property.h                   |  100 ++
 security/ipe/ipe-secfs.c                      | 1309 +++++++++++++++++
 security/ipe/ipe-secfs.h                      |   14 +
 security/ipe/ipe.c                            |  115 ++
 security/ipe/ipe.h                            |   22 +
 security/ipe/properties/Kconfig               |   36 +
 security/ipe/properties/Makefile              |   13 +
 security/ipe/properties/boot-verified.c       |   82 ++
 security/ipe/properties/dmverity-roothash.c   |  153 ++
 security/ipe/properties/dmverity-signature.c  |   82 ++
 security/ipe/properties/prop-entry.h          |   38 +
 security/ipe/utility.h                        |   32 +
 security/security.c                           |   74 +
 54 files changed, 5443 insertions(+), 98 deletions(-)
 create mode 100644 Documentation/admin-guide/LSM/ipe.rst
 create mode 100644 scripts/ipe/Makefile
 create mode 100644 scripts/ipe/polgen/.gitignore
 create mode 100644 scripts/ipe/polgen/Makefile
 create mode 100644 scripts/ipe/polgen/polgen.c
 create mode 100644 security/ipe/.gitignore
 create mode 100644 security/ipe/Kconfig
 create mode 100644 security/ipe/Makefile
 create mode 100644 security/ipe/ipe-audit.c
 create mode 100644 security/ipe/ipe-audit.h
 create mode 100644 security/ipe/ipe-blobs.c
 create mode 100644 security/ipe/ipe-blobs.h
 create mode 100644 security/ipe/ipe-engine.c
 create mode 100644 security/ipe/ipe-engine.h
 create mode 100644 security/ipe/ipe-hooks.c
 create mode 100644 security/ipe/ipe-hooks.h
 create mode 100644 security/ipe/ipe-parse.c
 create mode 100644 security/ipe/ipe-parse.h
 create mode 100644 security/ipe/ipe-pin.c
 create mode 100644 security/ipe/ipe-pin.h
 create mode 100644 security/ipe/ipe-policy.c
 create mode 100644 security/ipe/ipe-policy.h
 create mode 100644 security/ipe/ipe-prop-internal.h
 create mode 100644 security/ipe/ipe-property.c
 create mode 100644 security/ipe/ipe-property.h
 create mode 100644 security/ipe/ipe-secfs.c
 create mode 100644 security/ipe/ipe-secfs.h
 create mode 100644 security/ipe/ipe.c
 create mode 100644 security/ipe/ipe.h
 create mode 100644 security/ipe/properties/Kconfig
 create mode 100644 security/ipe/properties/Makefile
 create mode 100644 security/ipe/properties/boot-verified.c
 create mode 100644 security/ipe/properties/dmverity-roothash.c
 create mode 100644 security/ipe/properties/dmverity-signature.c
 create mode 100644 security/ipe/properties/prop-entry.h
 create mode 100644 security/ipe/utility.h

Comments

Pavel Machek Aug. 2, 2020, 11:55 a.m. UTC | #1
Hi!

> IPE is a Linux Security Module which allows for a configurable
> policy to enforce integrity requirements on the whole system. It
> attempts to solve the issue of Code Integrity: that any code being
> executed (or files being read), are identical to the version that
> was built by a trusted source.

How is that different from security/integrity/ima?

									Pavel
Sasha Levin Aug. 2, 2020, 2:03 p.m. UTC | #2
On Sun, Aug 02, 2020 at 01:55:45PM +0200, Pavel Machek wrote:
>Hi!
>
>> IPE is a Linux Security Module which allows for a configurable
>> policy to enforce integrity requirements on the whole system. It
>> attempts to solve the issue of Code Integrity: that any code being
>> executed (or files being read), are identical to the version that
>> was built by a trusted source.
>
>How is that different from security/integrity/ima?

Maybe if you would have read the cover letter all the way down to the
5th paragraph which explains how IPE is different from IMA we could
have avoided this mail exchange...
Pavel Machek Aug. 2, 2020, 2:31 p.m. UTC | #3
On Sun 2020-08-02 10:03:00, Sasha Levin wrote:
> On Sun, Aug 02, 2020 at 01:55:45PM +0200, Pavel Machek wrote:
> >Hi!
> >
> >>IPE is a Linux Security Module which allows for a configurable
> >>policy to enforce integrity requirements on the whole system. It
> >>attempts to solve the issue of Code Integrity: that any code being
> >>executed (or files being read), are identical to the version that
> >>was built by a trusted source.
> >
> >How is that different from security/integrity/ima?
> 
> Maybe if you would have read the cover letter all the way down to the
> 5th paragraph which explains how IPE is different from IMA we could
> avoided this mail exchange...

"
IPE differs from other LSMs which provide integrity checking (for
instance,
IMA), as it has no dependency on the filesystem metadata itself. The
attributes that IPE checks are deterministic properties that exist
solely
in the kernel. Additionally, IPE provides no additional mechanisms of
verifying these files (e.g. IMA Signatures) - all of the attributes of
verifying files are existing features within the kernel, such as
dm-verity
or fsverity.
"

That is not really helpful.
									Pavel
James Bottomley Aug. 2, 2020, 4:43 p.m. UTC | #4
On Sun, 2020-08-02 at 16:31 +0200, Pavel Machek wrote:
> On Sun 2020-08-02 10:03:00, Sasha Levin wrote:
> > On Sun, Aug 02, 2020 at 01:55:45PM +0200, Pavel Machek wrote:
> > > Hi!
> > > 
> > > > IPE is a Linux Security Module which allows for a configurable
> > > > policy to enforce integrity requirements on the whole system.
> > > > It attempts to solve the issue of Code Integrity: that any code
> > > > being executed (or files being read), are identical to the
> > > > version that was built by a trusted source.
> > > 
> > > How is that different from security/integrity/ima?
> > 
> > Maybe if you would have read the cover letter all the way down to
> > the 5th paragraph which explains how IPE is different from IMA we
> > could avoided this mail exchange...
> 
> "
> IPE differs from other LSMs which provide integrity checking (for
> instance,
> IMA), as it has no dependency on the filesystem metadata itself. The
> attributes that IPE checks are deterministic properties that exist
> solely
> in the kernel. Additionally, IPE provides no additional mechanisms of
> verifying these files (e.g. IMA Signatures) - all of the attributes
> of
> verifying files are existing features within the kernel, such as
> dm-verity
> or fsverity.
> "
> 
> That is not really helpful.

I think what the above is trying to do is expose an IMA
limitation that the new LSM fixes.  I think what it meant to say is
that IMA uses xattrs to store the signature data which is the "metadata
dependency".  However, it overlooks the fact that IMA can use appended
signatures as well, which have no metadata dependency, so I'm not sure
I've helped you understand why this is different from IMA.

Perhaps a more convincing argument is that IMA hooks into various
filesystem "gates" to perform integrity checks (file read and file
execute being the most obvious).  This LSM wants additional gates
within device mapper itself that IMA currently doesn't hook into.

Perhaps the big question is: If we used the existing IMA appended
signature for detached signatures (effectively becoming the
"properties" referred to in the cover letter) and hooked IMA into
device mapper using additional policy terms, would that satisfy all the
requirements this new LSM has?

James
Deven Bowers Aug. 4, 2020, 4:07 p.m. UTC | #5
On 8/2/2020 9:43 AM, James Bottomley wrote:
> On Sun, 2020-08-02 at 16:31 +0200, Pavel Machek wrote:
>> On Sun 2020-08-02 10:03:00, Sasha Levin wrote:
>>> On Sun, Aug 02, 2020 at 01:55:45PM +0200, Pavel Machek wrote:
>>>> Hi!
>>>>
>>>>> IPE is a Linux Security Module which allows for a configurable
>>>>> policy to enforce integrity requirements on the whole system.
>>>>> It attempts to solve the issue of Code Integrity: that any code
>>>>> being executed (or files being read), are identical to the
>>>>> version that was built by a trusted source.
>>>>
>>>> How is that different from security/integrity/ima?
>>>
>>> Maybe if you would have read the cover letter all the way down to
>>> the 5th paragraph which explains how IPE is different from IMA we
>>> could avoided this mail exchange...
>>
>> "
>> IPE differs from other LSMs which provide integrity checking (for
>> instance,
>> IMA), as it has no dependency on the filesystem metadata itself. The
>> attributes that IPE checks are deterministic properties that exist
>> solely
>> in the kernel. Additionally, IPE provides no additional mechanisms of
>> verifying these files (e.g. IMA Signatures) - all of the attributes
>> of
>> verifying files are existing features within the kernel, such as
>> dm-verity
>> or fsverity.
>> "
>>
>> That is not really helpful.

Perhaps I can explain (and re-word this paragraph) a bit better.

As James indicates, IPE does try to close the gap of the IMA limitation
with xattr. I honestly wasn’t familiar with the appended signatures,
which seems fine.

Regardless, this isn’t the larger benefit that IPE provides. The
larger benefit of this is how IPE separates _mechanisms_ (properties)
to enforce integrity requirements, from _policy_. The LSM provides
policy, while things like dm-verity provide mechanism.

So to speak, IPE acts as the glue for other mechanisms to leverage a
customizable, system-wide policy to enforce. While this initial
patchset only onboards dm-verity, there’s also potential for MAC labels,
fs-verity, authenticated BTRFS, dm-integrity, etc. IPE leverages
existing systems in the kernel, while IMA uses its own.

Another difference is the general coverage. IMA has some difficulties
in covering mprotect[1], IPE doesn’t (the MAP_ANONYMOUS indicated by
Jann in that thread would be denied as the file struct would be null,
with IPE’s current set of supported mechanisms. mprotect would continue
to function as expected if you change to PROT_EXEC).

> Perhaps the big question is: If we used the existing IMA appended
> signature for detached signatures (effectively becoming the
> "properties" referred to in the cover letter) and hooked IMA into
> device mapper using additional policy terms, would that satisfy all the
> requirements this new LSM has?

Well, Mimi, what do you think? Should we integrate all the features of
IPE into IMA, or do you think they are sufficiently different in
architecture that it would be worth it to keep the code base in separate
LSMs?


[1] 
https://lore.kernel.org/linux-integrity/1588688204.5157.5.camel@linux.ibm.com/
James Bottomley Aug. 5, 2020, 3:01 p.m. UTC | #6
On Tue, 2020-08-04 at 09:07 -0700, Deven Bowers wrote:
> On 8/2/2020 9:43 AM, James Bottomley wrote:
> > On Sun, 2020-08-02 at 16:31 +0200, Pavel Machek wrote:
> > > On Sun 2020-08-02 10:03:00, Sasha Levin wrote:
> > > > On Sun, Aug 02, 2020 at 01:55:45PM +0200, Pavel Machek wrote:
> > > > > Hi!
> > > > > 
> > > > > > IPE is a Linux Security Module which allows for a
> > > > > > configurable policy to enforce integrity requirements on
> > > > > > the whole system. It attempts to solve the issue of Code
> > > > > > Integrity: that any code being executed (or files being
> > > > > > read), are identical to the version that was built by a
> > > > > > trusted source.
> > > > > 
> > > > > How is that different from security/integrity/ima?
> > > > 
> > > > Maybe if you would have read the cover letter all the way down
> > > > to the 5th paragraph which explains how IPE is different from
> > > > IMA we could avoided this mail exchange...
> > > 
> > > "
> > > IPE differs from other LSMs which provide integrity checking (for
> > > instance, IMA), as it has no dependency on the filesystem
> > > metadata itself.
> > > The attributes that IPE checks are deterministic properties that
> > > exist solely in the kernel. Additionally, IPE provides no
> > > additional mechanisms of verifying these files (e.g. IMA
> > > Signatures) - all of the attributes of verifying files are
> > > existing features within the kernel, such as dm-verity
> > > or fsverity.
> > > "
> > > 
> > > That is not really helpful.
> 
> Perhaps I can explain (and re-word this paragraph) a bit better.
> 
> As James indicates, IPE does try to close the gap of the IMA
> limitation with xattr. I honestly wasn’t familiar with the appended
> signatures, which seems fine.
> 
> Regardless, this isn’t the larger benefit that IPE provides. The
> larger benefit of this is how IPE separates _mechanisms_ (properties)
> to enforce integrity requirements, from _policy_. The LSM provides
> policy, while things like dm-verity provide mechanism.

Colour me confused here, but I thought that's exactly what IMA does. 
The mechanism is the gates and the policy is simply a list of rules
which are applied when a gate is triggered.  The policy necessarily has
to be tailored to the information available at the gate (so the bprm
exec gate knows filesystem things like the inode for instance) but the
whole thing looks very extensible.

> So to speak, IPE acts as the glue for other mechanisms to leverage a
> customizable, system-wide policy to enforce. While this initial
> patchset only onboards dm-verity, there’s also potential for MAC
> labels, fs-verity, authenticated BTRFS, dm-integrity, etc. IPE
> leverages existing systems in the kernel, while IMA uses its own.

Is this about who does the measurement?  I think there's no reason at
all why IMA can't leverage existing measurements, it's just nothing to
leverage existed when it was created.

> Another difference is the general coverage. IMA has some difficulties
> in covering mprotect[1], IPE doesn’t (the MAP_ANONYMOUS indicated by
> Jann in that thread would be denied as the file struct would be null,
> with IPE’s current set of supported mechanisms. mprotect would
> continue to function as expected if you change to PROT_EXEC).

I don't really think a debate over who does what and why is productive
at this stage.  I just note that IMA policy could be updated to deny
MAP_ANONYMOUS, but no-one's asked for that (probably because of the
huge application breakage that would ensue).  The policy is a product
of the use case and the current use case for IMA is working with
existing filesystem semantics.

> > Perhaps the big question is: If we used the existing IMA appended
> > signature for detached signatures (effectively becoming the
> > "properties" referred to in the cover letter) and hooked IMA into
> > device mapper using additional policy terms, would that satisfy all
> > the requirements this new LSM has?
> 
> Well, Mimi, what do you think? Should we integrate all the features
> of IPE into IMA, or do you think they are sufficiently different in
> architecture that it would be worth it to keep the code base in
> separate LSMs?

I'll leave Mimi to answer, but really this is exactly the question that
should have been asked before writing IPE.  However, since we have the
cart before the horse, let me break the above down into two specific
questions.

   1. Could we implement IPE in IMA (as in would extensions to IMA cover
      everything).  I think the answers above indicate this is a "yes".
   2. Should we extend IMA to implement it?  This is really whether from a
      usability standpoint two separate LSMs would make sense to cover the
      different use cases.  I've got to say the least attractive thing
      about separation is the fact that you now both have a policy parser.
       You've tried to differentiate yours by making it more Kconfig
      based, but policy has a way of becoming user space supplied because
      the distros hate config options, so I think you're going to end up
      with a policy parser very like IMAs.

James
James Morris Aug. 5, 2020, 4:59 p.m. UTC | #7
On Wed, 5 Aug 2020, James Bottomley wrote:

> I'll leave Mimi to answer, but really this is exactly the question that
> should have been asked before writing IPE.  However, since we have the
> cart before the horse, let me break the above down into two specific
> questions.

The question is valid and it was asked. We decided to first prototype what 
we needed and then evaluate if it should be integrated with IMA. We 
discussed this plan in person with Mimi (at LSS-NA in 2019), and presented 
a more mature version of IPE to LSS-NA in 2020, with the expectation that 
such a discussion may come up (it did not).

These patches are still part of this process and 'RFC' status.

>    1. Could we implement IPE in IMA (as in would extensions to IMA cover
>       everything).  I think the answers above indicate this is a "yes".

It could be done, if needed.

>    2. Should we extend IMA to implement it?  This is really whether from a
>       usability standpoint two seperate LSMs would make sense to cover the
>       different use cases.

One issue here is that IMA is fundamentally a measurement & appraisal 
scheme which has been extended to include integrity enforcement. IPE was 
designed from scratch to only perform integrity enforcement. As such, it 
is a cleaner design -- "do one thing and do it well" is a good design 
pattern.

In our use-case, we utilize _both_ IMA and IPE, for attestation and code 
integrity respectively. It is useful to be able to separate these 
concepts. They really are different:

- Code integrity enforcement ensures that code running locally is of known 
provenance and has not been modified prior to execution.

- Attestation is about measuring the health of a system and having that 
measurement validated by a remote system. (Local attestation is useless).

I'm not sure there is value in continuing to shoe-horn both of these into 
IMA.


>  I've got to say the least attractive thing
>       about separation is the fact that you now both have a policy parser.
>        You've tried to differentiate yours by making it more Kconfig
>       based, but policy has a way of becoming user space supplied because
>       the distros hate config options, so I think you're going to end up
>       with a policy parser very like IMA's.
Mimi Zohar Aug. 5, 2020, 6:15 p.m. UTC | #8
On Wed, 2020-08-05 at 09:59 -0700, James Morris wrote:
> On Wed, 5 Aug 2020, James Bottomley wrote:
> 
> > I'll leave Mimi to answer, but really this is exactly the question that
> > should have been asked before writing IPE.  However, since we have the
> > cart before the horse, let me break the above down into two specific
> > questions.
> 
> The question is valid and it was asked. We decided to first prototype what 
> we needed and then evaluate if it should be integrated with IMA. We 
> discussed this plan in person with Mimi (at LSS-NA in 2019), and presented 
> a more mature version of IPE to LSS-NA in 2020, with the expectation that 
> such a discussion may come up (it did not).

When we first spoke the concepts weren't fully formulated, at least to
me.
> 
> These patches are still part of this process and 'RFC' status.
> 
> >    1. Could we implement IPE in IMA (as in would extensions to IMA cover
> >       everything).  I think the answers above indicate this is a "yes".
> 
> It could be done, if needed.
> 
> >    2. Should we extend IMA to implement it?  This is really whether from a
> >       usability standpoint two separate LSMs would make sense to cover the
> >       different use cases.
> 
> One issue here is that IMA is fundamentally a measurement & appraisal 
> scheme which has been extended to include integrity enforcement. IPE was 
> designed from scratch to only perform integrity enforcement. As such, it 
> is a cleaner design -- "do one thing and do it well" is a good design 
> pattern.
> 
> In our use-case, we utilize _both_ IMA and IPE, for attestation and code 
> integrity respectively. It is useful to be able to separate these 
> concepts. They really are different:
> 
> - Code integrity enforcement ensures that code running locally is of known 
> provenance and has not been modified prior to execution.
> 
> - Attestation is about measuring the health of a system and having that 
> measurement validated by a remote system. (Local attestation is useless).
> 
> I'm not sure there is value in continuing to shoe-horn both of these into 
> IMA.

True, IMA was originally limited to measurement and attestation, but
most of the original EVM concepts were subsequently included in IMA. 
(Remember, Reiner Sailer wrote the original IMA, which I inherited.  I
was originally working on EVM code integrity.)  From a naming
perspective including EVM code integrity in IMA was a mistake.  My
thinking at the time was that as IMA was already calculating the file
hash, instead of re-calculating the file hash for integrity, calculate
the file hash once and re-use it for multiple things - measurement, 
integrity, and audit.   At the same time define a single system wide
policy.

When we first started working on IMA, EVM, trusted, and encrypted keys,
the general kernel community didn't see a need for any of it.  Thus, a
lot of what was accomplished has been accomplished without the backing
of the real core filesystem people.

If block layer integrity was enough, there wouldn't have been a need
for fs-verity.   Even fs-verity is limited to read only filesystems,
which makes validating file integrity so much easier.  From the
beginning, we've said that fs-verity signatures should be included in
the measurement list.  (I thought someone signed on to add that support
to IMA, but have not yet seen anything.)

Going forward I see a lot of what we've accomplished being incorporated
into the filesystems.  When IMA will be limited to defining a system
wide policy, I'll have completed my job.

Mimi

> 
> >  I've got to say the least attractive thing
> >       about separation is the fact that you now both have a policy parser.
> >        You've tried to differentiate yours by making it more Kconfig
> >       based, but policy has a way of becoming user space supplied because
> >       the distros hate config options, so I think you're going to end up
> >       with a policy parser very like IMA's.
James Morris Aug. 5, 2020, 11:51 p.m. UTC | #9
On Wed, 5 Aug 2020, Mimi Zohar wrote:

> If block layer integrity was enough, there wouldn't have been a need
> for fs-verity.   Even fs-verity is limited to read only filesystems,
> which makes validating file integrity so much easier.  From the
> beginning, we've said that fs-verity signatures should be included in
> the measurement list.  (I thought someone signed on to add that support
> to IMA, but have not yet seen anything.)
> 
> Going forward I see a lot of what we've accomplished being incorporated
> into the filesystems.  When IMA will be limited to defining a system
> wide policy, I'll have completed my job.

What are your thoughts on IPE being a standalone LSM? Would you prefer to 
see its functionality integrated into IMA?
Mimi Zohar Aug. 6, 2020, 2:33 p.m. UTC | #10
On Thu, 2020-08-06 at 09:51 +1000, James Morris wrote:
> On Wed, 5 Aug 2020, Mimi Zohar wrote:
> 
> > If block layer integrity was enough, there wouldn't have been a need
> > for fs-verity.   Even fs-verity is limited to read only filesystems,
> > which makes validating file integrity so much easier.  From the
> > beginning, we've said that fs-verity signatures should be included in
> > the measurement list.  (I thought someone signed on to add that support
> > to IMA, but have not yet seen anything.)
> > 
> > Going forward I see a lot of what we've accomplished being incorporated
> > into the filesystems.  When IMA will be limited to defining a system
> > wide policy, I'll have completed my job.
> 
> What are your thoughts on IPE being a standalone LSM? Would you prefer to 
> see its functionality integrated into IMA?

Improving the integrity subsystem would be preferred.

Mimi
James Morris Aug. 7, 2020, 4:41 p.m. UTC | #11
On Thu, 6 Aug 2020, Mimi Zohar wrote:

> On Thu, 2020-08-06 at 09:51 +1000, James Morris wrote:
> > On Wed, 5 Aug 2020, Mimi Zohar wrote:
> > 
> > > If block layer integrity was enough, there wouldn't have been a need
> > > for fs-verity.   Even fs-verity is limited to read only filesystems,
> > > which makes validating file integrity so much easier.  From the
> > > beginning, we've said that fs-verity signatures should be included in
> > > the measurement list.  (I thought someone signed on to add that support
> > > to IMA, but have not yet seen anything.)
> > > 
> > > Going forward I see a lot of what we've accomplished being incorporated
> > > into the filesystems.  When IMA will be limited to defining a system
> > > wide policy, I'll have completed my job.
> > 
> > What are your thoughts on IPE being a standalone LSM? Would you prefer to 
> > see its functionality integrated into IMA?
> 
> Improving the integrity subsystem would be preferred.
> 

Are you planning to attend Plumbers? Perhaps we could propose a BoF 
session on this topic.
Mimi Zohar Aug. 7, 2020, 5:31 p.m. UTC | #12
On Sat, 2020-08-08 at 02:41 +1000, James Morris wrote:
> On Thu, 6 Aug 2020, Mimi Zohar wrote:
> 
> > On Thu, 2020-08-06 at 09:51 +1000, James Morris wrote:
> > > On Wed, 5 Aug 2020, Mimi Zohar wrote:
> > > 
> > > > If block layer integrity was enough, there wouldn't have been a need
> > > > for fs-verity.   Even fs-verity is limited to read only filesystems,
> > > > which makes validating file integrity so much easier.  From the
> > > > beginning, we've said that fs-verity signatures should be included in
> > > > the measurement list.  (I thought someone signed on to add that support
> > > > to IMA, but have not yet seen anything.)
> > > > 
> > > > Going forward I see a lot of what we've accomplished being incorporated
> > > > into the filesystems.  When IMA will be limited to defining a system
> > > > wide policy, I'll have completed my job.
> > > 
> > > What are your thoughts on IPE being a standalone LSM? Would you prefer to 
> > > see its functionality integrated into IMA?
> > 
> > Improving the integrity subsystem would be preferred.
> > 
> 
> Are you planning to attend Plumbers? Perhaps we could propose a BoF 
> session on this topic.

That sounds like a good idea.

Mimi
Mimi Zohar Aug. 7, 2020, 6:40 p.m. UTC | #13
On Fri, 2020-08-07 at 13:31 -0400, Mimi Zohar wrote:
> On Sat, 2020-08-08 at 02:41 +1000, James Morris wrote:
> > On Thu, 6 Aug 2020, Mimi Zohar wrote:
> > 
> > > On Thu, 2020-08-06 at 09:51 +1000, James Morris wrote:
> > > > On Wed, 5 Aug 2020, Mimi Zohar wrote:
> > > > 
> > > > > If block layer integrity was enough, there wouldn't have been a need
> > > > > for fs-verity.   Even fs-verity is limited to read only filesystems,
> > > > > which makes validating file integrity so much easier.  From the
> > > > > beginning, we've said that fs-verity signatures should be included in
> > > > > the measurement list.  (I thought someone signed on to add that support
> > > > > to IMA, but have not yet seen anything.)
> > > > > 
> > > > > Going forward I see a lot of what we've accomplished being incorporated
> > > > > into the filesystems.  When IMA will be limited to defining a system
> > > > > wide policy, I'll have completed my job.
> > > > 
> > > > What are your thoughts on IPE being a standalone LSM? Would you prefer to 
> > > > see its functionality integrated into IMA?
> > > 
> > > Improving the integrity subsystem would be preferred.
> > > 
> > 
> > Are you planning to attend Plumbers? Perhaps we could propose a BoF 
> > session on this topic.
> 
> That sounds like a good idea.

Other than it is already sold out.

Mimi
Chuck Lever Aug. 8, 2020, 5:47 p.m. UTC | #14
> On Aug 5, 2020, at 2:15 PM, Mimi Zohar <zohar@linux.ibm.com> wrote:
> 
> On Wed, 2020-08-05 at 09:59 -0700, James Morris wrote:
>> On Wed, 5 Aug 2020, James Bottomley wrote:
>> 
>>> I'll leave Mimi to answer, but really this is exactly the question that
>>> should have been asked before writing IPE.  However, since we have the
>>> cart before the horse, let me break the above down into two specific
>>> questions.
>> 
>> The question is valid and it was asked. We decided to first prototype what 
>> we needed and then evaluate if it should be integrated with IMA. We 
>> discussed this plan in person with Mimi (at LSS-NA in 2019), and presented 
>> a more mature version of IPE to LSS-NA in 2020, with the expectation that 
>> such a discussion may come up (it did not).
> 
> When we first spoke the concepts weren't fully formulated, at least to
> me.
>> 
>> These patches are still part of this process and 'RFC' status.
>> 
>>>   1. Could we implement IPE in IMA (as in would extensions to IMA cover
>>>      everything).  I think the answers above indicate this is a "yes".
>> 
>> It could be done, if needed.
>> 
>>>   2. Should we extend IMA to implement it?  This is really whether from a
>>>      usability standpoint two separate LSMs would make sense to cover the
>>>      different use cases.
>> 
>> One issue here is that IMA is fundamentally a measurement & appraisal 
>> scheme which has been extended to include integrity enforcement. IPE was 
>> designed from scratch to only perform integrity enforcement. As such, it 
>> is a cleaner design -- "do one thing and do it well" is a good design 
>> pattern.
>> 
>> In our use-case, we utilize _both_ IMA and IPE, for attestation and code 
>> integrity respectively. It is useful to be able to separate these 
>> concepts. They really are different:
>> 
>> - Code integrity enforcement ensures that code running locally is of known 
>> provenance and has not been modified prior to execution.

My interest is in code integrity enforcement for executables stored
in NFS files.

My struggle with IPE is that due to its dependence on dm-verity, it
does not seem to be able to protect content that is stored separately
from its execution environment and accessed via a file access
protocol (FUSE, SMB, NFS, etc).


>> - Attestation is about measuring the health of a system and having that 
>> measurement validated by a remote system. (Local attestation is useless).
>> 
>> I'm not sure there is value in continuing to shoe-horn both of these into 
>> IMA.
> 
> True, IMA was originally limited to measurement and attestation, but
> most of the original EVM concepts were subsequently included in IMA. 
> (Remember, Reiner Sailer wrote the original IMA, which I inherited.  I
> was originally working on EVM code integrity.)  From a naming
> perspective including EVM code integrity in IMA was a mistake.  My
> thinking at the time was that as IMA was already calculating the file
> hash, instead of re-calculating the file hash for integrity, calculate
> the file hash once and re-use it for multiple things - measurement, 
> integrity, and audit.   At the same time define a single system wide
> policy.
> 
> When we first started working on IMA, EVM, trusted, and encrypted keys,
> the general kernel community didn't see a need for any of it.  Thus, a
> lot of what was accomplished has been accomplished without the backing
> of the real core filesystem people.
> 
> If block layer integrity was enough, there wouldn't have been a need
> for fs-verity.   Even fs-verity is limited to read only filesystems,
> which makes validating file integrity so much easier.  From the
> beginning, we've said that fs-verity signatures should be included in
> the measurement list.  (I thought someone signed on to add that support
> to IMA, but have not yet seen anything.)

Mimi, when you and I discussed this during LSS NA 2019, I didn't fully
understand that you expected me to implement signed Merkle trees for all
filesystems. At the time, it sounded to me like you wanted signed Merkle
trees only for NFS files. Is that still the case?

The first priority (for me, anyway) therefore is getting the ability to
move IMA metadata between NFS clients and servers shoveled into the NFS
protocol, but that's been blocked for various legal reasons.

IMO we need agreement from everyone (integrity developers, FS
implementers, and Linux distributors) that a signed Merkle tree IMA
metadata format, stored in either an xattr or appended to an executable
file, will be the way forward for IMA in all filesystems.


> Going forward I see a lot of what we've accomplished being incorporated
> into the filesystems.  When IMA will be limited to defining a system
> wide policy, I'll have completed my job.
> 
> Mimi
> 
>> 
>>> I've got to say the least attractive thing
>>>      about separation is the fact that you now both have a policy parser.
>>>       You've tried to differentiate yours by making it more Kconfig
>>>      based, but policy has a way of becoming user space supplied because
>>>      the distros hate config options, so I think you're going to end up
>>>      with a policy parser very like IMA's.
Mimi Zohar Aug. 9, 2020, 5:16 p.m. UTC | #15
On Sat, 2020-08-08 at 13:47 -0400, Chuck Lever wrote:
> > On Aug 5, 2020, at 2:15 PM, Mimi Zohar <zohar@linux.ibm.com> wrote:

<snip>

> > If block layer integrity was enough, there wouldn't have been a need
> > for fs-verity.   Even fs-verity is limited to read only filesystems,
> > which makes validating file integrity so much easier.  From the
> > beginning, we've said that fs-verity signatures should be included in
> > the measurement list.  (I thought someone signed on to add that support
> > to IMA, but have not yet seen anything.)
> 
> Mimi, when you and I discussed this during LSS NA 2019, I didn't fully
> understand that you expected me to implement signed Merkle trees for all
> filesystems. At the time, it sounded to me like you wanted signed Merkle
> trees only for NFS files. Is that still the case?

I definitely do not expect you to support signed Merkle trees for all
filesystems.  My interest is from an IMA perspective of measuring and
verifying the fs-verity Merkle tree root (and header info) signature. 
This is independent of which filesystems support it.

> 
> The first priority (for me, anyway) therefore is getting the ability to
> move IMA metadata between NFS clients and servers shoveled into the NFS
> protocol, but that's been blocked for various legal reasons.

Up to now, verifying remote filesystem file integrity has been out of
scope for IMA.   With fs-verity file signatures I can at least grasp
how remote file integrity could possibly work.  I don't understand how
remote file integrity with existing IMA formats could be supported. You
might want to consider writing a whitepaper, which could later be used
as the basis for a patch set cover letter.

Mimi

> 
> IMO we need agreement from everyone (integrity developers, FS
> implementers, and Linux distributors) that a signed Merkle tree IMA
> metadata format, stored in either an xattr or appended to an executable
> file, will be the way forward for IMA in all filesystems.
James Bottomley Aug. 10, 2020, 3:35 p.m. UTC | #16
On Sun, 2020-08-09 at 13:16 -0400, Mimi Zohar wrote:
> On Sat, 2020-08-08 at 13:47 -0400, Chuck Lever wrote:
> > > On Aug 5, 2020, at 2:15 PM, Mimi Zohar <zohar@linux.ibm.com>
> > > wrote:
> 
> <snip>
> 
> > > If block layer integrity was enough, there wouldn't have been a
> > > need for fs-verity.   Even fs-verity is limited to read only
> > > filesystems, which makes validating file integrity so much
> > > easier.  From the beginning, we've said that fs-verity signatures
> > > should be included in the measurement list.  (I thought someone
> > > signed on to add that support to IMA, but have not yet seen
> > > anything.)
> > 
> > Mimi, when you and I discussed this during LSS NA 2019, I didn't
> > fully understand that you expected me to implement signed Merkle
> > trees for all filesystems. At the time, it sounded to me like you
> > wanted signed Merkle trees only for NFS files. Is that still the
> > case?
> 
> I definitely do not expect you to support signed Merkle trees for all
> filesystems.  My interest is from an IMA perspective of measuring
> and verifying the fs-verity Merkle tree root (and header info)
> signature. This is independent of which filesystems support it.
> 
> > 
> > The first priority (for me, anyway) therefore is getting the
> > ability to move IMA metadata between NFS clients and servers
> > shoveled into the NFS protocol, but that's been blocked for various
> > legal reasons.
> 
> Up to now, verifying remote filesystem file integrity has been out of
> scope for IMA.   With fs-verity file signatures I can at least grasp
> how remote file integrity could possibly work.  I don't understand
> how remote file integrity with existing IMA formats could be
> supported. You might want to consider writing a whitepaper, which
> could later be used as the basis for a patch set cover letter.

I think, before this, we can help with the basics (and perhaps we
should sort them out before we start documenting what we'll do).  The
first basic is that a Merkle tree allows unit-at-a-time verification.
First of all, we should agree on the unit.  Since we always fault a page
at a time, I think our Merkle tree unit should be a page, not a block.
Next, we should agree where the check gates for the per-page accesses
should be (definitely somewhere in readpage, I suspect).  Finally,
we should agree how the Merkle tree is presented at the gate.  I think
there are three ways:

   1. Ahead of time transfer:  The merkle tree is transferred and verified
      at some time before the accesses begin, so we already have a
      verified copy and can compare against the lower leaf.
   2. Async transfer:  We provide an async mechanism to transfer the
      necessary components, so when presented with a unit, we check the
      log n components required to get to the root
   3. The protocol actually provides the capability of 2 (like the SCSI
      DIF/DIX), so to IMA all the pieces get presented instead of IMA
      having to manage the tree

There are also a load of minor things like how we get the head hash,
which must be presented and verified ahead of time for each of the
above 3.

James
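
A minimal sketch of the unit-at-a-time check discussed above, assuming
SHA-256, a 4096-byte page as the unit, and duplication of the last node on
odd-sized levels.  None of these choices come from the thread, and the
helper names (build_levels, proof_for, verify_page) are hypothetical.  It
shows why a reader needs only about log2(n) sibling hashes plus a
pre-verified root to check a single page:

```python
import hashlib

PAGE_SIZE = 4096  # assumed verification unit; the thread only proposes "a page"


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def build_levels(pages):
    """Return all tree levels: levels[0] holds leaf hashes, levels[-1] == [root]."""
    level = [h(p) for p in pages]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def proof_for(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path


def verify_page(page, index, path, root):
    """Recompute the root from one page and its sibling path, then compare."""
    node = h(page)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

This corresponds most closely to option 2: the verifier holds only the
trusted root and is handed the sibling hashes for the page being read.
With option 1, the same comparison would run against an already verified
local copy of the tree.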
Mimi Zohar Aug. 10, 2020, 4:35 p.m. UTC | #17
On Mon, 2020-08-10 at 08:35 -0700, James Bottomley wrote:
> On Sun, 2020-08-09 at 13:16 -0400, Mimi Zohar wrote:
> > On Sat, 2020-08-08 at 13:47 -0400, Chuck Lever wrote:
> > > > On Aug 5, 2020, at 2:15 PM, Mimi Zohar <zohar@linux.ibm.com>
> > > > wrote:
> > 
> > <snip>
> > 
> > > > If block layer integrity was enough, there wouldn't have been a
> > > > need for fs-verity.   Even fs-verity is limited to read only
> > > > filesystems, which makes validating file integrity so much
> > > > easier.  From the beginning, we've said that fs-verity signatures
> > > > should be included in the measurement list.  (I thought someone
> > > > signed on to add that support to IMA, but have not yet seen
> > > > anything.)
> > > 
> > > Mimi, when you and I discussed this during LSS NA 2019, I didn't
> > > fully understand that you expected me to implement signed Merkle
> > > trees for all filesystems. At the time, it sounded to me like you
> > > wanted signed Merkle trees only for NFS files. Is that still the
> > > case?
> > 
> > I definitely do not expect you to support signed Merkle trees for all
> > filesystems.  My interest is from an IMA perspective of measuring
> > and verifying the fs-verity Merkle tree root (and header info)
> > signature. This is independent of which filesystems support it.
> > 
> > > The first priority (for me, anyway) therefore is getting the
> > > ability to move IMA metadata between NFS clients and servers
> > > shoveled into the NFS protocol, but that's been blocked for various
> > > legal reasons.
> > 
> > Up to now, verifying remote filesystem file integrity has been out of
> > scope for IMA.   With fs-verity file signatures I can at least grasp
> > how remote file integrity could possibly work.  I don't understand
> > how remote file integrity with existing IMA formats could be
> > supported. You might want to consider writing a whitepaper, which
> > could later be used as the basis for a patch set cover letter.
> 
> I think, before this, we can help with the basics (and perhaps we
> should sort them out before we start documenting what we'll do).

I'm not opposed to doing that, but you're taking this discussion in a
totally different direction.  The current discussion is about NFSv4
supporting the existing IMA signatures, not only fs-verity signatures. 
I'd like to understand how that is possible and for the community to
weigh in on whether it makes sense.

> The
> first basic is that a merkle tree allows unit at a time verification.
> First of all we should agree on the unit.  Since we always fault a page
> at a time, I think our merkle tree unit should be a page not a block. 
> Next, we should agree where the check gates for the per page accesses
> should be ... definitely somewhere in readpage, I suspect and finally
> we should agree how the merkle tree is presented at the gate.  I think
> there are three ways:
> 
>    1. Ahead of time transfer:  The merkle tree is transferred and verified
>       at some time before the accesses begin, so we already have a
>       verified copy and can compare against the lower leaf.
>    2. Async transfer:  We provide an async mechanism to transfer the
>       necessary components, so when presented with a unit, we check the
>       log n components required to get to the root
>    3. The protocol actually provides the capability of 2 (like the SCSI
>       DIF/DIX), so to IMA all the pieces get presented instead of IMA
>       having to manage the tree
> 
> There are also a load of minor things like how we get the head hash,
> which must be presented and verified ahead of time for each of the
> above 3.
 
I was under the impression that IMA support for fs-verity signatures
would be limited to including the fs-verity signature in the
measurement list and verifying the fs-verity signature.   As fs-verity
is limited to immutable files, this could be done on file open.  fs-
verity would be responsible for enforcing the block/page data
integrity.   From a local filesystem perspective, I think that is all
that is necessary.

In terms of remote file systems,  the main issue is transporting and
storing the Merkle tree.  As fs-verity is limited to immutable files,
this could still be done on file open.

Mimi
James Bottomley Aug. 10, 2020, 5:13 p.m. UTC | #18
On Mon, 2020-08-10 at 12:35 -0400, Mimi Zohar wrote:
> On Mon, 2020-08-10 at 08:35 -0700, James Bottomley wrote:
[...]
> > > Up to now, verifying remote filesystem file integrity has been
> > > out of scope for IMA.   With fs-verity file signatures I can at
> > > least grasp how remote file integrity could possibly work.  I
> > > don't understand how remote file integrity with existing IMA
> > > formats could be supported. You might want to consider writing a
> > > whitepaper, which could later be used as the basis for a patch
> > > set cover letter.
> > 
> > I think, before this, we can help with the basics (and perhaps we
> > should sort them out before we start documenting what we'll do).
> 
> I'm not opposed to doing that, but you're taking this discussion in a
> totally different direction.  The current discussion is about NFSv4
> supporting the existing IMA signatures, not only fs-verity
> signatures. I'd like to understand how that is possible and for the
> community to weigh in on whether it makes sense.

Well, I see the NFS problem as being chunk at a time, right, which means a
Merkle tree, or is there a different chunk at a time mechanism we want
to use?  IMA currently verifies the signature on open/exec and then
controls updates.  Since for NFS we only control the client, we can't
do that on an NFS server, so we really do need verification at read
time ... unless we're threading IMA back to the NFS server?

> > The first basic is that a merkle tree allows unit at a time
> > verification. First of all we should agree on the unit.  Since we
> > always fault a page at a time, I think our merkle tree unit should
> > be a page not a block. Next, we should agree where the check gates
> > for the per page accesses should be ... definitely somewhere in
> > readpage, I suspect and finally we should agree how the merkle tree
> > is presented at the gate.  I think there are three ways:
> > 
> >    1. Ahead of time transfer:  The merkle tree is transferred and
> > verified
> >       at some time before the accesses begin, so we already have a
> >       verified copy and can compare against the lower leaf.
> >    2. Async transfer:  We provide an async mechanism to transfer
> > the
> >       necessary components, so when presented with a unit, we check
> > the
> >       log n components required to get to the root
> >    3. The protocol actually provides the capability of 2 (like the
> > SCSI
> >       DIF/DIX), so to IMA all the pieces get presented instead of
> > IMA
> >       having to manage the tree
> > 
> > There are also a load of minor things like how we get the head
> > hash, which must be presented and verified ahead of time for each
> > of the above 3.
> 
>  
> I was under the impression that IMA support for fs-verity signatures
> would be limited to including the fs-verity signature in the
> measurement list and verifying the fs-verity signature.   As fs-
> verity is limited to immutable files, this could be done on file
> open.  fs-verity would be responsible for enforcing the block/page
> data integrity.   From a local filesystem perspective, I think that
> is all that is necessary.

The fs-verity use case is a bit of a crippled one because it's
immutable.  I think NFS represents more the general case where you
can't rely on immutability and have to verify at chunk read time.  If
we get chunk at a time working for NFS, it should work also for fs-
verity and we wouldn't need to have two special paths.

I think, even for NFS we would only really need to log the open, so
same as you imagine for fs-verity.  As long as the chunk read hashes
match, we can be silent because everything is going OK, so we only need
to determine what to do and log on mismatch (which isn't expected to
happen for fs-verity).

> In terms of remote file systems,  the main issue is transporting and
> storing the Merkle tree.  As fs-verity is limited to immutable files,
> this could still be done on file open.

Right, I mentioned that in my options ... we need some "supply
integrity" hook ... or possibly multiple hooks for a variety of
possible methods.

James
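
A rough sketch of the read-time behaviour described above: record the
open, stay silent while chunk hashes match, and record and refuse only on
a mismatch.  It assumes SHA-256, a 4096-byte chunk, and a list of trusted
per-chunk hashes that has already been verified by some other means.  The
VerifiedReader class, the IntegrityError exception, and the in-memory
event list are all hypothetical stand-ins, not kernel or IMA interfaces:

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size; the thread does not fix one


class IntegrityError(Exception):
    """Raised when a chunk does not match its trusted hash."""


def hash_chunks(data):
    """Per-chunk SHA-256 digests of data, in order."""
    return [hashlib.sha256(data[i:i + CHUNK_SIZE]).digest()
            for i in range(0, len(data), CHUNK_SIZE)]


class VerifiedReader:
    """Hypothetical client-side reader that checks every chunk it returns."""

    def __init__(self, name, data, trusted_hashes):
        self.name = name
        self.data = data                # stands in for bytes fetched from a server
        self.trusted = trusted_hashes   # assumed to be verified before use
        self.events = [("open", name)]  # the only event on the happy path

    def read_chunk(self, index):
        start = index * CHUNK_SIZE
        chunk = self.data[start:start + CHUNK_SIZE]
        if hashlib.sha256(chunk).digest() != self.trusted[index]:
            # Only a mismatch is recorded; the caller never sees the bad data.
            self.events.append(("mismatch", self.name, index))
            raise IntegrityError(f"{self.name}: chunk {index} failed verification")
        return chunk  # matching chunks are returned silently
```

What to do on a mismatch (fail the read, as here, or something else) is
exactly the open policy question raised above; raising an exception is
just one possible choice.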
Mimi Zohar Aug. 10, 2020, 5:57 p.m. UTC | #19
On Mon, 2020-08-10 at 10:13 -0700, James Bottomley wrote:
> On Mon, 2020-08-10 at 12:35 -0400, Mimi Zohar wrote:
> > On Mon, 2020-08-10 at 08:35 -0700, James Bottomley wrote:
> [...]
> > > > Up to now, verifying remote filesystem file integrity has been
> > > > out of scope for IMA.   With fs-verity file signatures I can at
> > > > least grasp how remote file integrity could possibly work.  I
> > > > don't understand how remote file integrity with existing IMA
> > > > formats could be supported. You might want to consider writing a
> > > > whitepaper, which could later be used as the basis for a patch
> > > > set cover letter.
> > > 
> > > I think, before this, we can help with the basics (and perhaps we
> > > should sort them out before we start documenting what we'll do).
> > 
> > I'm not opposed to doing that, but you're taking this discussion in a
> > totally different direction.  The current discussion is about NFSv4
> > supporting the existing IMA signatures, not only fs-verity
> > signatures. I'd like to understand how that is possible and for the
> > community to weigh in on whether it makes sense.
> 
> Well, I see the NFS problem as being chunk at a time, right, which is
> merkle tree, or is there a different chunk at a time mechanism we want
> to use?  IMA currently verifies signature on open/exec and then
> controls updates.  Since for NFS we only control the client, we can't
> do that on an NFS server, so we really do need verification at read
> time ... unless we're threading IMA back to the NFS server?

Yes.  I still don't see how we can support the existing IMA signatures,
which are based on the file data hash, unless the "chunk at a time
mechanism" is not a tree, but linear.

Mimi
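
To illustrate the distinction drawn above: a digest over the whole file
can only be checked once every byte has been read, whereas a flat
("linear") list of per-chunk hashes lets each chunk be checked on its own.
The sketch below is one possible reading of "linear", assuming SHA-256 and
4096-byte chunks; it is not a format that IMA defines, the function names
are hypothetical, and signed_digest simply stands in for whatever value a
signature would cover (signature verification itself is omitted):

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size


def whole_file_digest(data):
    """Digest over the full file contents: needs all bytes before any check."""
    return hashlib.sha256(data).digest()


def chunk_hashes(data):
    """Flat, ordered list of per-chunk digests (the hypothetical linear scheme)."""
    return [hashlib.sha256(data[i:i + CHUNK_SIZE]).digest()
            for i in range(0, len(data), CHUNK_SIZE)]


def linear_digest(hashes):
    """Single digest over the concatenated chunk hashes; this is what gets signed."""
    return hashlib.sha256(b"".join(hashes)).digest()


def verify_chunk(chunk, index, hashes, signed_digest):
    """Check the hash list against the signed value, then the chunk against the list."""
    if linear_digest(hashes) != signed_digest:
        return False
    return hashlib.sha256(chunk).digest() == hashes[index]
```

Compared with a tree, the linear list has to be transferred in full before
the first chunk can be checked, so its cost grows with file size rather
than with log n.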

> 
> > > The first basic is that a merkle tree allows unit at a time
> > > verification. First of all we should agree on the unit.  Since we
> > > always fault a page at a time, I think our merkle tree unit should
> > > be a page not a block. Next, we should agree where the check gates
> > > for the per page accesses should be ... definitely somewhere in
> > > readpage, I suspect and finally we should agree how the merkle tree
> > > is presented at the gate.  I think there are three ways:
> > > 
> > >    1. Ahead of time transfer:  The merkle tree is transferred and
> > >       verified at some time before the accesses begin, so we already
> > >       have a verified copy and can compare against the lower leaf.
> > >    2. Async transfer:  We provide an async mechanism to transfer the
> > >       necessary components, so when presented with a unit, we check
> > >       the log n components required to get to the root
> > >    3. The protocol actually provides the capability of 2 (like the
> > >       SCSI DIF/DIX), so to IMA all the pieces get presented instead
> > >       of IMA having to manage the tree
> > > 
> > > There are also a load of minor things like how we get the head
> > > hash, which must be presented and verified ahead of time for each
> > > of the above 3.
> > 
> >  
> > I was under the impression that IMA support for fs-verity signatures
> > would be limited to including the fs-verity signature in the
> > measurement list and verifying the fs-verity signature.   As fs-
> > verity is limited to immutable files, this could be done on file
> > open.  fs-verity would be responsible for enforcing the block/page
> > data integrity.   From a local filesystem perspective, I think that
> > is all that is necessary.
> 
> The fs-verity use case is a bit of a crippled one because it's
> immutable.  I think NFS represents more the general case where you
> can't rely on immutability and have to verify at chunk read time.  If
> we get chunk at a time working for NFS, it should work also for fs-
> verity and we wouldn't need to have two special paths.
> 
> I think, even for NFS we would only really need to log the open, so
> same as you imagine for fs-verity.  As long as the chunk read hashes
> match, we can be silent because everything is going OK, so we only need
> to determine what to do and log on mismatch (which isn't expected to
> happen for fs-verity).
> 
> > In terms of remote file systems,  the main issue is transporting and
> > storing the Merkle tree.  As fs-verity is limited to immutable files,
> > this could still be done on file open.
> 
> Right, I mentioned that in my options ... we need some "supply
> integrity" hook ... or possibly multiple hooks for a variety of
> possible methods.
James Morris Aug. 10, 2020, 8:29 p.m. UTC | #20
On Fri, 7 Aug 2020, Mimi Zohar wrote:

> > > Are you planning to attend Plumbers? Perhaps we could propose a BoF 
> > > session on this topic.
> > 
> > That sounds like a good idea.
> 
> Other than it is already sold out.

Mimi advised me off-list that she is able to attend, so I've submitted a 
BoF proposal:

https://www.linuxplumbersconf.org/event/7/abstracts/732/
Chuck Lever Aug. 10, 2020, 11:36 p.m. UTC | #21
> On Aug 10, 2020, at 11:35 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Sun, 2020-08-09 at 13:16 -0400, Mimi Zohar wrote:
>> On Sat, 2020-08-08 at 13:47 -0400, Chuck Lever wrote:
>>>> On Aug 5, 2020, at 2:15 PM, Mimi Zohar <zohar@linux.ibm.com>
>>>> wrote:
>> 
>> <snip>
>> 
>>>> If block layer integrity was enough, there wouldn't have been a
>>>> need for fs-verity.   Even fs-verity is limited to read only
>>>> filesystems, which makes validating file integrity so much
>>>> easier.  From the beginning, we've said that fs-verity signatures
>>>> should be included in the measurement list.  (I thought someone
>>>> signed on to add that support to IMA, but have not yet seen
>>>> anything.)
>>> 
>>> Mimi, when you and I discussed this during LSS NA 2019, I didn't
>>> fully understand that you expected me to implement signed Merkle
>>> trees for all filesystems. At the time, it sounded to me like you
>>> wanted signed Merkle trees only for NFS files. Is that still the
>>> case?
>> 
>> I definitely do not expect you to support signed Merkle trees for all
>> filesystems.  My interest is from an IMA perspective of measuring
>> and verifying the fs-verity Merkle tree root (and header info)
>> signature. This is independent of which filesystems support it.
>> 
>>> 
>>> The first priority (for me, anyway) therefore is getting the
>>> ability to move IMA metadata between NFS clients and servers
>>> shoveled into the NFS protocol, but that's been blocked for various
>>> legal reasons.
>> 
>> Up to now, verifying remote filesystem file integrity has been out of
>> scope for IMA.   With fs-verity file signatures I can at least grasp
>> how remote file integrity could possibly work.  I don't understand
>> how remote file integrity with existing IMA formats could be
>> supported. You might want to consider writing a whitepaper, which
>> could later be used as the basis for a patch set cover letter.
> 
> I think, before this, we can help with the basics (and perhaps we
> should sort them out before we start documenting what we'll do).

Thanks for the help! I just want to emphasize that documentation
(eg, a specification) will be critical for remote filesystems.

If any of this is to be supported by a remote filesystem, then we
need an unencumbered description of the new metadata format rather
than code. GPL-encumbered formats cannot be contributed to the NFS
standard, and are probably difficult for other filesystems that are
not Linux-native, like SMB, as well.


> The
> first basic is that a merkle tree allows unit at a time verification. 
> First of all we should agree on the unit.  Since we always fault a page
> at a time, I think our merkle tree unit should be a page not a block.

Remote filesystems will need to agree that the size of that unit is
the same everywhere, or the unit size could be stored in the per-file
metadata.


> Next, we should agree where the check gates for the per page accesses
> should be ... definitely somewhere in readpage, I suspect and finally
> we should agree how the merkle tree is presented at the gate.  I think
> there are three ways:
> 
>   1. Ahead of time transfer:  The merkle tree is transferred and verified
>      at some time before the accesses begin, so we already have a
>      verified copy and can compare against the lower leaf.
>   2. Async transfer:  We provide an async mechanism to transfer the
>      necessary components, so when presented with a unit, we check the
>      log n components required to get to the root
>   3. The protocol actually provides the capability of 2 (like the SCSI
>      DIF/DIX), so to IMA all the pieces get presented instead of IMA
>      having to manage the tree

A Merkle tree is potentially large enough that it cannot be stored in
an extended attribute. In addition, an extended attribute is not a
byte stream that you can seek into or read small parts of, it is
retrieved in a single shot.

For this reason, the idea was to save only the signature of the tree's
root on durable storage. The client would retrieve that signature
possibly at open time, and reconstruct the tree at that time.

Or the tree could be partially constructed on-demand at the time each
unit is to be checked (say, as part of 2. above).

The client would have to reconstruct that tree again if memory pressure
caused some or all of the tree to be evicted, so perhaps an on-demand
mechanism is preferable.


> There are also a load of minor things like how we get the head hash,
> which must be presented and verified ahead of time for each of the
> above 3.

Also, changes to a file's content and its tree signature are not
atomic. If a file is mutable, then there is the period between when
the file content has changed and when the signature is updated.
Some discussion of how a client is to behave in those situations will
be necessary.


--
Chuck Lever
chucklever@gmail.com
James Bottomley Aug. 11, 2020, 5:43 a.m. UTC | #22
On Mon, 2020-08-10 at 19:36 -0400, Chuck Lever wrote:
> > On Aug 10, 2020, at 11:35 AM, James Bottomley
> > <James.Bottomley@HansenPartnership.com> wrote:
> > On Sun, 2020-08-09 at 13:16 -0400, Mimi Zohar wrote:
> > > On Sat, 2020-08-08 at 13:47 -0400, Chuck Lever wrote:
[...]
> > > > The first priority (for me, anyway) therefore is getting the
> > > > ability to move IMA metadata between NFS clients and servers
> > > > shoveled into the NFS protocol, but that's been blocked for
> > > > various legal reasons.
> > > 
> > > Up to now, verifying remote filesystem file integrity has been
> > > out of scope for IMA.   With fs-verity file signatures I can at
> > > least grasp how remote file integrity could possibly work.  I
> > > don't understand how remote file integrity with existing IMA
> > > formats could be supported. You might want to consider writing a
> > > whitepaper, which could later be used as the basis for a patch
> > > set cover letter.
> > 
> > I think, before this, we can help with the basics (and perhaps we
> > should sort them out before we start documenting what we'll do).
> 
> Thanks for the help! I just want to emphasize that documentation
> (eg, a specification) will be critical for remote filesystems.
> 
> If any of this is to be supported by a remote filesystem, then we
> need an unencumbered description of the new metadata format rather
> than code. GPL-encumbered formats cannot be contributed to the NFS
> standard, and are probably difficult for other filesystems that are
> not Linux-native, like SMB, as well.

I don't understand what you mean by GPL encumbered formats.  The GPL is
a code licence not a data or document licence.  The way the spec
process works in Linux is that we implement or evolve a data format
under a GPL implementation, but that implementation doesn't implicate
the later standardisation of the data format and people are free to
reimplement under any licence they choose.

> > The first basic is that a merkle tree allows unit at a time
> > verification. First of all we should agree on the unit.  Since we
> > always fault a page at a time, I think our merkle tree unit should
> > be a page not a block.
> 
> Remote filesystems will need to agree that the size of that unit is
> the same everywhere, or the unit size could be stored in the per-file
> metadata.
> 
> 
> > Next, we should agree where the check gates for the per page
> > accesses should be ... definitely somewhere in readpage, I suspect
> > and finally we should agree how the merkle tree is presented at the
> > gate.  I think there are three ways:
> > 
> >   1. Ahead of time transfer:  The merkle tree is transferred and
> >      verified at some time before the accesses begin, so we already
> >      have a verified copy and can compare against the lower leaf.
> >   2. Async transfer:  We provide an async mechanism to transfer the
> >      necessary components, so when presented with a unit, we check
> >      the log n components required to get to the root
> >   3. The protocol actually provides the capability of 2 (like the
> >      SCSI DIF/DIX), so to IMA all the pieces get presented instead
> >      of IMA having to manage the tree
> 
> A Merkle tree is potentially large enough that it cannot be stored in
> an extended attribute. In addition, an extended attribute is not a
> byte stream that you can seek into or read small parts of, it is
> retrieved in a single shot.

Well you wouldn't store the tree would you, just the head hash.  The
rest of the tree can be derived from the data.  You need to distinguish
between what you *must* have to verify integrity (the head hash,
possibly signed) and what is nice to have to speed up the verification
process.  The choice for the latter is cache or reconstruct depending
on the resources available.  If the tree gets cached on the server,
that would be a server implementation detail invisible to the client.
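
A minimal sketch of "the rest of the tree can be derived from the data",
assuming SHA-256, a binary tree, and duplication of the last node on odd
levels.  None of these choices come from this thread, and this is not the
fs-verity on-disk layout; merkle_root() and unit_size are illustrative
names.  It also shows why the unit size has to be agreed on or stored:
the head hash changes when the unit changes.

```python
import hashlib


def merkle_root(data: bytes, unit_size: int = 4096) -> bytes:
    """Derive the head hash from file data alone (hypothetical layout)."""
    # Split the data into fixed-size units; an empty file gets one empty unit.
    units = [data[i:i + unit_size] for i in range(0, len(data), unit_size)] or [b""]
    level = [hashlib.sha256(u).digest() for u in units]
    while len(level) > 1:
        if len(level) % 2:
            # Assumption: pad an odd level by repeating its last node.
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


data = bytes(range(256)) * 64            # 16 KiB of sample content
root_4k = merkle_root(data, 4096)        # four units
root_1k = merkle_root(data, 1024)        # sixteen units, different head hash
tampered = b"\xff" + data[1:]            # flip the first byte
root_tampered = merkle_root(tampered, 4096)
```

Only root_4k (possibly signed) would need to be stored; everything below
it can be recomputed or cached as resources allow.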

> For this reason, the idea was to save only the signature of the
> tree's root on durable storage. The client would retrieve that
> signature possibly at open time, and reconstruct the tree at that
> time.

Right that's the integrity data you must have.

> Or the tree could be partially constructed on-demand at the time each
> unit is to be checked (say, as part of 2. above).

Whether it's reconstructed or cached can be an implementation detail. 
You clearly have to reconstruct once, but whether you have to do it
again depends on the memory available for caching and all the other
resource calls in the system.

> The client would have to reconstruct that tree again if memory
> pressure caused some or all of the tree to be evicted, so perhaps an
> on-demand mechanism is preferable.

Right, but I think that's implementation detail.  Probably what we need
is a way to get the log(N) verification hashes from the server and it's
up to the client whether it caches them or not.

> > There are also a load of minor things like how we get the head
> > hash, which must be presented and verified ahead of time for each
> > of the above 3.
> 
> Also, changes to a file's content and its tree signature are not
> atomic. If a file is mutable, then there is the period between when
> the file content has changed and when the signature is updated.
> Some discussion of how a client is to behave in those situations will
> be necessary.

For IMA, if you write to a checked file, it gets rechecked the next
time the gate (open/exec/mmap) is triggered.  This means you must
complete the update and have the new integrity data in-place before
triggering the check.  I think this could apply equally to a merkle
tree based system.  It's a sort of "Doctor, Doctor, it hurts when I do
this" situation.

James
Chuck Lever Aug. 11, 2020, 2:48 p.m. UTC | #23
> On Aug 11, 2020, at 1:43 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Mon, 2020-08-10 at 19:36 -0400, Chuck Lever wrote:
>>> On Aug 10, 2020, at 11:35 AM, James Bottomley
>>> <James.Bottomley@HansenPartnership.com> wrote:
>>> On Sun, 2020-08-09 at 13:16 -0400, Mimi Zohar wrote:
>>>> On Sat, 2020-08-08 at 13:47 -0400, Chuck Lever wrote:
> [...]
>>>>> The first priority (for me, anyway) therefore is getting the
>>>>> ability to move IMA metadata between NFS clients and servers
>>>>> shoveled into the NFS protocol, but that's been blocked for
>>>>> various legal reasons.
>>>> 
>>>> Up to now, verifying remote filesystem file integrity has been
>>>> out of scope for IMA.   With fs-verity file signatures I can at
>>>> least grasp how remote file integrity could possibly work.  I
>>>> don't understand how remote file integrity with existing IMA
>>>> formats could be supported. You might want to consider writing a
>>>> whitepaper, which could later be used as the basis for a patch
>>>> set cover letter.
>>> 
>>> I think, before this, we can help with the basics (and perhaps we
>>> should sort them out before we start documenting what we'll do).
>> 
>> Thanks for the help! I just want to emphasize that documentation
>> (eg, a specification) will be critical for remote filesystems.
>> 
>> If any of this is to be supported by a remote filesystem, then we
>> need an unencumbered description of the new metadata format rather
>> than code. GPL-encumbered formats cannot be contributed to the NFS
>> standard, and are probably difficult for other filesystems that are
>> not Linux-native, like SMB, as well.
> 
> I don't understand what you mean by GPL encumbered formats.  The GPL is
> a code licence not a data or document licence.

IETF contributions occur under a BSD-style license incompatible
with the GPL.

https://trustee.ietf.org/trust-legal-provisions.html

Non-Linux implementers (of OEM storage devices) rely on such standards
processes to indemnify them against licensing claims.

Today, there is no specification for existing IMA metadata formats,
there is only code. My lawyer tells me that because the code that
implements these formats is under GPL, the formats themselves cannot
be contributed to, say, the IETF without express permission from the
authors of that code. There are a lot of authors of the Linux IMA
code, so this is proving to be an impediment to contribution. That
blocks the ability to provide a fully-specified NFS protocol
extension to support IMA metadata formats.


> The way the spec
> process works in Linux is that we implement or evolve a data format
> under a GPL implementation, but that implementation doesn't implicate
> the later standardisation of the data format and people are free to
> reimplement under any licence they choose.

That technology transfer can happen only if all the authors of that
prototype agree to contribute to a standard. That's much easier if
that agreement comes before an implementation is done. The current
IMA code base is more than a decade old, and there are more than a
hundred authors who have contributed to that base.

Thus IMO we want an unencumbered description of any IMA metadata
format that is to be contributed to an open standards body (as it
would have to be to extend, say, the NFS protocol).

I'm happy to write that specification, as long as any contributions
here are unencumbered and acknowledged. In fact, I have been working
on a (so far, flawed) NFS extension:

https://datatracker.ietf.org/doc/draft-ietf-nfsv4-integrity-measurement/

Now, for example, the components of a putative Merkle-based IMA
metadata format are all already open:

- The unit size is just an integer

- A certificate fingerprint is a de facto standard, and the
fingerprint digest algorithms are all standardized

- Merkle trees are public domain, I believe, and we're not adding
any special sauce here

- Digital signing algorithms are all standardized

Wondering if we want to hash and save the file's mtime and size too.


>>> The first basic is that a merkle tree allows unit at a time
>>> verification. First of all we should agree on the unit.  Since we
>>> always fault a page at a time, I think our merkle tree unit should
>>> be a page not a block.
>> 
>> Remote filesystems will need to agree that the size of that unit is
>> the same everywhere, or the unit size could be stored in the per-file
>> metadata.
>> 
>> 
>>> Next, we should agree where the check gates for the per page
>>> accesses should be ... definitely somewhere in readpage, I suspect
>>> and finally we should agree how the merkle tree is presented at the
>>> gate.  I think there are three ways:
>>> 
>>>  1. Ahead of time transfer:  The merkle tree is transferred and
>>>     verified at some time before the accesses begin, so we already
>>>     have a verified copy and can compare against the lower leaf.
>>>  2. Async transfer:  We provide an async mechanism to transfer the
>>>     necessary components, so when presented with a unit, we check
>>>     the log n components required to get to the root
>>>  3. The protocol actually provides the capability of 2 (like the
>>>     SCSI DIF/DIX), so to IMA all the pieces get presented instead
>>>     of IMA having to manage the tree
>> 
>> A Merkle tree is potentially large enough that it cannot be stored in
>> an extended attribute. In addition, an extended attribute is not a
>> byte stream that you can seek into or read small parts of, it is
>> retrieved in a single shot.
> 
> Well you wouldn't store the tree would you, just the head hash.  The
> rest of the tree can be derived from the data.  You need to distinguish
> between what you *must* have to verify integrity (the head hash,
> possibly signed)

We're dealing with an untrusted storage device, and for a remote
filesystem, an untrusted network.

Mimi's earlier point is that any IMA metadata format that involves
unsigned digests is exposed to an alteration attack at rest or in
transit, thus will not provide a robust end-to-end integrity
guarantee.

Therefore, tree root digests must be cryptographically signed to be
properly protected in these environments. Verifying that signature
should be done infrequently relative to reading a file's content.


> and what is nice to have to speed up the verification
> process.  The choice for the latter is cache or reconstruct depending
> on the resources available.  If the tree gets cached on the server,
> that would be a server implementation detail invisible to the client.

We assume that storage targets (for block or file) are not trusted.
Therefore storage clients cannot rely on intermediate results (eg,
middle nodes in a Merkle tree) unless those results are generated
within the client's trust envelope.

So: if the storage target is considered inside the client's trust
envelope, it can cache or store durably any intermediate parts of
the verification process. If not, the network and file storage is
considered untrusted, and the client has to rely on nothing but the
signed digest of the tree root.

We could build a scheme around, say, fscache, that might save the
intermediate results durably and locally.


>> For this reason, the idea was to save only the signature of the
>> tree's root on durable storage. The client would retrieve that
>> signature possibly at open time, and reconstruct the tree at that
>> time.
> 
> Right that's the integrity data you must have.
> 
>> Or the tree could be partially constructed on-demand at the time each
>> unit is to be checked (say, as part of 2. above).
> 
> Whether it's reconstructed or cached can be an implementation detail.
> You clearly have to reconstruct once, but whether you have to do it
> again depends on the memory available for caching and all the other
> resource calls in the system.
> 
>> The client would have to reconstruct that tree again if memory
>> pressure caused some or all of the tree to be evicted, so perhaps an
>> on-demand mechanism is preferable.
> 
> Right, but I think that's implementation detail.  Probably what we need
> is a way to get the log(N) verification hashes from the server and it's
> up to the client whether it caches them or not.

Agreed, these are implementation details. But see above about the
trustworthiness of the intermediate hashes. If they are conveyed
on an untrusted network, then they can't be trusted either.


>>> There are also a load of minor things like how we get the head
>>> hash, which must be presented and verified ahead of time for each
>>> of the above 3.
>> 
>> Also, changes to a file's content and its tree signature are not
>> atomic. If a file is mutable, then there is the period between when
>> the file content has changed and when the signature is updated.
>> Some discussion of how a client is to behave in those situations will
>> be necessary.
> 
> For IMA, if you write to a checked file, it gets rechecked the next
> time the gate (open/exec/mmap) is triggered.  This means you must
> complete the update and have the new integrity data in-place before
> triggering the check.  I think this could apply equally to a merkle
> tree based system.  It's a sort of "Doctor, Doctor, it hurts when I do
> this" situation.

I imagine it's a common situation where a "yum update" process is
modifying executables while clients are running them. To prevent
a read from pulling refreshed content before the new tree root is
available, it would have to block temporarily until the verification
process succeeds with the updated tree root.


--
Chuck Lever
chucklever@gmail.com
James Bottomley Aug. 11, 2020, 3:32 p.m. UTC | #24
On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
> > On Aug 11, 2020, at 1:43 AM, James Bottomley
> > <James.Bottomley@HansenPartnership.com> wrote:
> > On Mon, 2020-08-10 at 19:36 -0400, Chuck Lever wrote:
[...]
> > > Thanks for the help! I just want to emphasize that documentation
> > > (eg, a specification) will be critical for remote filesystems.
> > > 
> > > If any of this is to be supported by a remote filesystem, then we
> > > need an unencumbered description of the new metadata format
> > > rather than code. GPL-encumbered formats cannot be contributed to
> > > the NFS standard, and are probably difficult for other
> > > filesystems that are not Linux-native, like SMB, as well.
> > 
> > I don't understand what you mean by GPL encumbered formats.  The
> > GPL is a code licence not a data or document licence.
> 
> IETF contributions occur under a BSD-style license incompatible
> with the GPL.
> 
> https://trustee.ietf.org/trust-legal-provisions.html
> 
> Non-Linux implementers (of OEM storage devices) rely on such
> standards processes to indemnify them against licensing claims.

Well, that simply means we won't be contributing the Linux
implementation, right? However, IETF doesn't require BSD for all
implementations, so that's OK.

> Today, there is no specification for existing IMA metadata formats,
> there is only code. My lawyer tells me that because the code that
> implements these formats is under GPL, the formats themselves cannot
> be contributed to, say, the IETF without express permission from the
> authors of that code. There are a lot of authors of the Linux IMA
> code, so this is proving to be an impediment to contribution. That
> blocks the ability to provide a fully-specified NFS protocol
> extension to support IMA metadata formats.

Well, let me put the counterpoint: I can write a book about how linux
device drivers work (which includes describing the data formats), for
instance, without having to get permission from all the authors ... or
is your lawyer taking the view we should be suing Jonathan Corbet,
Alessandro Rubini, and Greg Kroah-Hartman for licence infringement?  In
fact do they think we now have a huge class action possibility against
O'Reilly  and a host of other publishers ...

> > The way the spec process works in Linux is that we implement or
> > evolve a data format under a GPL implementation, but that
> > implementation doesn't implicate the later standardisation of the
> > data format and people are free to reimplement under any licence
> > they choose.
> 
> That technology transfer can happen only if all the authors of that
> prototype agree to contribute to a standard. That's much easier if
> that agreement comes before an implementation is done. The current
> IMA code base is more than a decade old, and there are more than a
> hundred authors who have contributed to that base.
> 
> Thus IMO we want an unencumbered description of any IMA metadata
> format that is to be contributed to an open standards body (as it
> would have to be to extend, say, the NFS protocol).

Fine, good grief: people who take a sensible view of this can write the
data format down and publish it under any licence you like, then you can
pick it up again safely.  Would CC0 be OK? ... neither GPL nor BSD are
document licences and we shouldn't perpetuate bad practice by licensing
documentation under them.

James
James Bottomley Aug. 11, 2020, 3:53 p.m. UTC | #25
On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
> > On Aug 11, 2020, at 1:43 AM, James Bottomley
> > <James.Bottomley@HansenPartnership.com> wrote:
> > 
> > On Mon, 2020-08-10 at 19:36 -0400, Chuck Lever wrote:
> > > > On Aug 10, 2020, at 11:35 AM, James Bottomley
> > > > <James.Bottomley@HansenPartnership.com> wrote:
[...]
> > > > The first basic is that a merkle tree allows unit at a time
> > > > verification. First of all we should agree on the unit.  Since
> > > > we always fault a page at a time, I think our merkle tree unit
> > > > should be a page not a block.
> > > 
> > > Remote filesystems will need to agree that the size of that unit
> > > is the same everywhere, or the unit size could be stored in the
> > > per-file metadata.
> > > 
> > > 
> > > > Next, we should agree where the check gates for the per page
> > > > accesses should be ... definitely somewhere in readpage, I
> > > > suspect and finally we should agree how the merkle tree is
> > > > presented at the gate.  I think there are three ways:
> > > > 
> > > >  1. Ahead of time transfer:  The merkle tree is transferred and
> > > >     verified at some time before the accesses begin, so we
> > > >     already have a verified copy and can compare against the
> > > >     lower leaf.
> > > >  2. Async transfer:  We provide an async mechanism to transfer
> > > >     the necessary components, so when presented with a unit, we
> > > >     check the log n components required to get to the root
> > > >  3. The protocol actually provides the capability of 2 (like
> > > >     the SCSI DIF/DIX), so to IMA all the pieces get presented
> > > >     instead of IMA having to manage the tree
> > > 
> > > A Merkle tree is potentially large enough that it cannot be
> > > stored in an extended attribute. In addition, an extended
> > > attribute is not a byte stream that you can seek into or read
> > > small parts of, it is retrieved in a single shot.
> > 
> > Well you wouldn't store the tree would you, just the head
> > hash.  The rest of the tree can be derived from the data.  You need
> > to distinguish between what you *must* have to verify integrity
> > (the head hash, possibly signed)
> 
> We're dealing with an untrusted storage device, and for a remote
> filesystem, an untrusted network.
> 
> Mimi's earlier point is that any IMA metadata format that involves
> unsigned digests is exposed to an alteration attack at rest or in
> transit, thus will not provide a robust end-to-end integrity
> guarantee.
> 
> Therefore, tree root digests must be cryptographically signed to be
> properly protected in these environments. Verifying that signature
> should be done infrequently relative to reading a file's content.

I'm not disagreeing there has to be a way for the relying party to
trust the root hash.

> > and what is nice to have to speed up the verification
> > process.  The choice for the latter is cache or reconstruct
> > depending on the resources available.  If the tree gets cached on
> > the server, that would be a server implementation detail invisible
> > to the client.
> 
> We assume that storage targets (for block or file) are not trusted.
> Therefore storage clients cannot rely on intermediate results (eg,
> middle nodes in a Merkle tree) unless those results are generated
> within the client's trust envelope.

Yes, they can ... because supplied nodes can be verified.  That's the
whole point of a merkle tree.  As long as I'm sure of the root hash I
can verify all the rest even if supplied by an untrusted source.  If
you consider a simple merkle tree covering 4 blocks:

       R
     /   \
  H11     H12
  / \     / \
H21 H22 H23 H24
 |    |   |   |
B1   B2  B3  B4

Assume I have the verified root hash R.  If you supply B3 you also
supply H24 and H11 as proof.  I verify by hashing B3 to produce H23
then hash H23 and H24 to produce H12 and if H12 and your supplied H11
hash to R the tree is correct and the B3 you supplied must likewise be
correct.
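
The four-block check above can be sketched as follows.  This assumes
SHA-256 and that a parent is the hash of its left child concatenated with
its right child; the thread does not fix either choice, and h(),
build_tree(), proof_for() and verify() are illustrative names, not an
existing kernel or IMA interface.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return all levels, leaf hashes first and the root level last.

    Assumes the number of blocks is a power of two, as in the example.
    """
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof_for(levels, index):
    """Collect the sibling hash at each level below the root (log n hashes)."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])   # sibling of the current node
        index //= 2
    return proof


def verify(block, index, proof, root):
    """Recompute the path from a supplied block up to the trusted root."""
    node = h(block)
    for sibling in proof:
        # Even positions are left children, odd positions are right children.
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


blocks = [b"B1", b"B2", b"B3", b"B4"]
levels = build_tree(blocks)
root = levels[-1][0]                         # R, assumed already verified
proof = proof_for(levels, 2)                 # for B3: [H24, H11]
ok = verify(b"B3", 2, proof, root)           # genuine block and proof
bad = verify(b"tampered", 2, proof, root)    # altered block is rejected
```

The supplier of the block and the proof needs no trust of its own: any
change to B3, H24 or H11 changes the recomputed value and the comparison
with R fails.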

> So: if the storage target is considered inside the client's trust
> envelope, it can cache or store durably any intermediate parts of
> the verification process. If not, the network and file storage is
> considered untrusted, and the client has to rely on nothing but the
> signed digest of the tree root.
> 
> We could build a scheme around, say, fscache, that might save the
> intermediate results durably and locally.

I agree we want caching on the client, but we can always page in from
the remote as long as we page enough to verify up to R, so we're always
sure the remote supplied genuine information.

> > > For this reason, the idea was to save only the signature of the
> > > tree's root on durable storage. The client would retrieve that
> > > signature possibly at open time, and reconstruct the tree at that
> > > time.
> > 
> > Right that's the integrity data you must have.
> > 
> > > Or the tree could be partially constructed on-demand at the time
> > > each unit is to be checked (say, as part of 2. above).
> > 
> > Whether it's reconstructed or cached can be an implementation
> > detail. You clearly have to reconstruct once, but whether you have
> > to do it again depends on the memory available for caching and all
> > the other resource calls in the system.
> > 
> > > The client would have to reconstruct that tree again if memory
> > > pressure caused some or all of the tree to be evicted, so perhaps
> > > an on-demand mechanism is preferable.
> > 
> > Right, but I think that's implementation detail.  Probably what we
> > need is a way to get the log(N) verification hashes from the server
> > and it's up to the client whether it caches them or not.
> 
> Agreed, these are implementation details. But see above about the
> trustworthiness of the intermediate hashes. If they are conveyed
> on an untrusted network, then they can't be trusted either.

Yes, they can, provided enough of them are asked for to verify.  If you
look at the simple example above, suppose I have cached H11 and H12,
but I've lost the entire H2X layer.  I want to verify B3 so I also ask
you for your copy of H24.  Then I generate H23 from B3 and hash H23 and
H24 together.  If the result doesn't match H12 I know either you supplied
me the wrong block or lied about H24.  However, if it all hashes
correctly I know you supplied me with both the correct B3 and the
correct H24.
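
As a rough illustration of that check, here is a minimal sketch. It
assumes SHA-256 and a plain binary tree in which a parent is the hash of
its two children concatenated; the helper names (h, parent, verify_b3)
are made up for this example and do not correspond to any kernel or NFS
interface.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash one block or one concatenated pair of child hashes (SHA-256 assumed)."""
    return hashlib.sha256(data).digest()


def parent(left: bytes, right: bytes) -> bytes:
    """Parent node = hash of the left child followed by the right child."""
    return h(left + right)


def verify_b3(block: bytes, supplied_h24: bytes, cached_h12: bytes) -> bool:
    """Check an untrusted B3 and an untrusted H24 against a trusted, cached H12."""
    h23 = h(block)                      # regenerate H23 locally from the block
    return parent(h23, supplied_h24) == cached_h12


# Reference tree over four blocks, built once by a trusted party.
blocks = [b"B1", b"B2", b"B3", b"B4"]
h21, h22, h23, h24 = (h(b) for b in blocks)
h11 = parent(h21, h22)
h12 = parent(h23, h24)

# The client has H12 cached but has lost the H2X layer.
honest = verify_b3(b"B3", h24, h12)                  # correct block, correct H24
tampered_block = verify_b3(b"B3-evil", h24, h12)     # wrong block
forged_sibling = verify_b3(b"B3", h(b"bogus"), h12)  # server lied about H24
```

Only the cached H12 is trusted here; the block and H24 both come from
the untrusted side and are accepted only if they hash to it.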

> > > > There are also a load of minor things like how we get the head
> > > > hash, which must be presented and verified ahead of time for
> > > > each of the above 3.
> > > 
> > > Also, changes to a file's content and its tree signature are not
> > > atomic. If a file is mutable, then there is the period between
> > > when the file content has changed and when the signature is
> > > updated. Some discussion of how a client is to behave in those
> > > situations will be necessary.
> > 
> > For IMA, if you write to a checked file, it gets rechecked the next
> > time the gate (open/exec/mmap) is triggered.  This means you must
> > complete the update and have the new integrity data in-place before
> > triggering the check.  I think this could apply equally to a Merkle
> > tree based system.  It's a sort of Doctor, Doctor it hurts when I
> > do this situation.
> 
> I imagine it's a common situation where a "yum update" process is
> modifying executables while clients are running them. To prevent
> a read from pulling refreshed content before the new tree root is
> available, it would have to block temporarily until the verification
> process succeeds with the updated tree root.

No ... it's not.  Yum specifically worries about that today because if
you update running binaries, it causes a crash.  Yum constructs the
entire new file then atomically links it into place and deletes the old
inode to prevent these crashes.  It never allows you to get into the
situation where you can execute something that will be modified. 
That's also why you have to restart stuff after a yum update because if
you didn't it would still be attached to the deleted inode.
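
The write-then-rename pattern can be sketched as below. This is only an
illustration of the general technique on a POSIX system, not the code
yum or rpm actually runs; atomic_replace and demo are hypothetical names.

```python
import os
import tempfile


def atomic_replace(path: str, new_content: bytes) -> None:
    """Build the complete new file next to `path`, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)  # same filesystem as target
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_content)
            tmp.flush()
            os.fsync(tmp.fileno())      # make sure the data is on disk first
        os.replace(tmp_path, path)      # atomic swap of the directory entry
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)         # clean up the half-built file
        raise


def demo():
    """Show that an already-open handle keeps seeing the old inode."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "binary")
    with open(path, "wb") as f:
        f.write(b"old version")
    with open(path, "rb") as running:   # stands in for a running process
        old_inode = os.fstat(running.fileno()).st_ino
        atomic_replace(path, b"new version")
        seen_by_running = running.read()
    with open(path, "rb") as fresh:
        seen_by_new_open = fresh.read()
    return seen_by_running, seen_by_new_open, old_inode != os.stat(path).st_ino
```

The handle opened before the update continues to read the old content
from the unlinked inode, which is why services need a restart to pick up
the new file.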

James
James Bottomley Aug. 11, 2020, 6:28 p.m. UTC | #26
On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
> Mimi's earlier point is that any IMA metadata format that involves
> unsigned digests is exposed to an alteration attack at rest or in
> transit, thus will not provide a robust end-to-end integrity
> guarantee.

I don't believe that is Mimi's point, because it's mostly not correct:
the xattr mechanism does provide this today.  The point is the
mechanism we use for storing IMA hashes and signatures today is xattrs
because they have robust security properties for local filesystems that
the kernel enforces.  This use goes beyond IMA, selinux labels for
instance use this property as well.

What I think you're saying is that NFS can't provide the robust
security for xattrs we've been relying on, so you need some other
mechanism for storing them.

I think Mimi's other point is actually that IMA uses a flat hash which
we derive by reading the entire file and then watching for mutations.
Since you cannot guarantee we get notice of mutation with NFS, the
entire IMA mechanism can't really be applied in its current form and we
have to resort to the chunk-at-a-time verification that a Merkle tree
would provide.  Doesn't this make moot any thinking about
standardisation in NFS for the current IMA flat hash mechanism, because
we simply can't use it?  If I were to construct a prototype I'd have
to work out and securely cache the hash of every chunk when verifying
the flat hash so I could recheck on every chunk read.  I think that's
infeasible for large files.

James
Pavel Machek Aug. 11, 2020, 7:30 p.m. UTC | #27
Hi!

> > > > (eg, a specification) will be critical for remote filesystems.
> > > > 
> > > > If any of this is to be supported by a remote filesystem, then we
> > > > need an unencumbered description of the new metadata format
> > > > rather than code. GPL-encumbered formats cannot be contributed to
> > > > the NFS standard, and are probably difficult for other
> > > > filesystems that are not Linux-native, like SMB, as well.
> > > 
> > > I don't understand what you mean by GPL encumbered formats.  The
> > > GPL is a code licence not a data or document licence.
> > 
> > IETF contributions occur under a BSD-style license incompatible
> > with the GPL.
> > 
> > https://trustee.ietf.org/trust-legal-provisions.html
> > 
> > Non-Linux implementers (of OEM storage devices) rely on such
> > standards processes to indemnify them against licensing claims.
> 
> Well, that simply means we won't be contributing the Linux
> implementation, right? However, IETF doesn't require BSD for all
> implementations, so that's OK.
> 
> > Today, there is no specification for existing IMA metadata formats,
> > there is only code. My lawyer tells me that because the code that
> > implements these formats is under GPL, the formats themselves cannot
> > be contributed to, say, the IETF without express permission from the
> > authors of that code. There are a lot of authors of the Linux IMA
> > code, so this is proving to be an impediment to contribution. That
> > blocks the ability to provide a fully-specified NFS protocol
> > extension to support IMA metadata formats.
> 
> Well, let me put the counterpoint: I can write a book about how
> linux

You should probably talk to your lawyer.

> device drivers work (which includes describing the data formats), for
> instance, without having to get permission from all the authors ... or
> is your lawyer taking the view we should be suing Jonathan Corbet,
> Alessandro Rubini, and Greg Kroah-Hartman for licence infringement?  In
> fact do they think we now have a huge class action possibility against
> O'Reilly  and a host of other publishers ...

Because yes, you can reverse engineer for compatibility reasons --
doing clean room re-implementation (BIOS binary -> BIOS documentation
-> BIOS sources under different license), but that was only tested in
the US, is expensive, and I understand people might be uncomfortable
doing that.

Best regards,
									Pavel
James Morris Aug. 11, 2020, 9:03 p.m. UTC | #28
On Sat, 8 Aug 2020, Chuck Lever wrote:

> My interest is in code integrity enforcement for executables stored
> in NFS files.
> 
> My struggle with IPE is that due to its dependence on dm-verity, it
> does not seem to be able to protect content that is stored separately
> from its execution environment and accessed via a file access
> protocol (FUSE, SMB, NFS, etc).

It's not dependent on DM-Verity, that's just one possible integrity 
verification mechanism, and one of two supported in this initial 
version. The other is 'boot_verified' for a verified or otherwise trusted 
rootfs. Future versions will support FS-Verity, at least.

IPE was designed to be extensible in this way, with a strong separation of 
mechanism and policy.

Whatever is implemented for NFS should be able to plug in to IPE pretty 
easily.
Chuck Lever Aug. 12, 2020, 1:56 p.m. UTC | #29
> On Aug 11, 2020, at 2:28 PM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
>> Mimi's earlier point is that any IMA metadata format that involves
>> unsigned digests is exposed to an alteration attack at rest or in
>> transit, thus will not provide a robust end-to-end integrity
>> guarantee.
> 
> I don't believe that is Mimi's point, because it's mostly not correct:
> the xattr mechanism does provide this today.  The point is the
> mechanism we use for storing IMA hashes and signatures today is xattrs
> because they have robust security properties for local filesystems that
> the kernel enforces.  This use goes beyond IMA, selinux labels for
> instance use this property as well.

I don't buy this for a second. If storing a security label in a
local xattr is so secure, we wouldn't have any need for EVM.


> What I think you're saying is that NFS can't provide the robust
> security for xattrs we've been relying on, so you need some other
> mechanism for storing them.

For NFS, there's a network traversal which is an attack surface.

A local xattr can be attacked as well: a device or bus malfunction
can corrupt the content of an xattr, or a privileged user can modify
it.

How does that metadata get from the software provider to the end
user? It's got to go over a network, stored in various ways, some
of which will not be trusted. To attain an unbroken chain of
provenance, that metadata has to be signed.

I don't think the question is the storage mechanism, but rather the
protection mechanism. Signing the metadata protects it in all of
these cases.


> I think Mimi's other point is actually that IMA uses a flat hash which
> we derive by reading the entire file and then watching for mutations. 
> Since you cannot guarantee we get notice of mutation with NFS, the
> entire IMA mechanism can't really be applied in its current form and we
> have to resort to chunk at a time verifications that a Merkle tree
> would provide.

I'm not sure what you mean by this. An NFS client relies on notification
of mutation to maintain the integrity of its cache of NFS file content,
and it's done that since the 1980s.

In addition to examining a file's mtime and ctime as maintained by
the NFS server, a client can rely on the file's NFSv4 change attribute
or an NFSv4 delegation.


> Doesn't this make moot any thinking about
> standardisation in NFS for the current IMA flat hash mechanism because
> we simply can't use it ... If I were to construct a prototype I'd have
> to work out and securely cache the hash of every chunk when verifying
> the flat hash so I could recheck on every chunk read.  I think that's
> infeasible for large files.
> 
> James
> 

--
Chuck Lever
chucklever@gmail.com
Chuck Lever Aug. 12, 2020, 2:15 p.m. UTC | #30
> On Aug 11, 2020, at 11:53 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
>>> On Aug 11, 2020, at 1:43 AM, James Bottomley
>>> <James.Bottomley@HansenPartnership.com> wrote:
>>> 
>>> On Mon, 2020-08-10 at 19:36 -0400, Chuck Lever wrote:
>>>>> On Aug 10, 2020, at 11:35 AM, James Bottomley
>>>>> <James.Bottomley@HansenPartnership.com> wrote:
> [...]
>>>>> The first basic is that a merkle tree allows unit at a time
>>>>> verification. First of all we should agree on the unit.  Since
>>>>> we always fault a page at a time, I think our merkle tree unit
>>>>> should be a page not a block.
>>>> 
>>>> Remote filesystems will need to agree that the size of that unit
>>>> is the same everywhere, or the unit size could be stored in the
>>>> per-file metadata.
>>>> 
>>>> 
>>>>> Next, we should agree where the check gates for the per page
>>>>> accesses should be ... definitely somewhere in readpage, I
>>>>> suspect and finally we should agree how the merkle tree is
>>>>> presented at the gate.  I think there are three ways:
>>>>> 
>>>>> 1. Ahead of time transfer:  The merkle tree is transferred and
>>>>>    verified at some time before the accesses begin, so we already
>>>>>    have a verified copy and can compare against the lower leaf.
>>>>> 2. Async transfer:  We provide an async mechanism to transfer the
>>>>>    necessary components, so when presented with a unit, we check
>>>>>    the log n components required to get to the root
>>>>> 3. The protocol actually provides the capability of 2 (like the
>>>>>    SCSI DIF/DIX), so to IMA all the pieces get presented instead
>>>>>    of IMA having to manage the tree
>>>> 
>>>> A Merkle tree is potentially large enough that it cannot be
>>>> stored in an extended attribute. In addition, an extended
>>>> attribute is not a byte stream that you can seek into or read
>>>> small parts of, it is retrieved in a single shot.
>>> 
>>> Well you wouldn't store the tree would you, just the head
>>> hash.  The rest of the tree can be derived from the data.  You need
>>> to distinguish between what you *must* have to verify integrity
>>> (the head hash, possibly signed)
>> 
>> We're dealing with an untrusted storage device, and for a remote
>> filesystem, an untrusted network.
>> 
>> Mimi's earlier point is that any IMA metadata format that involves
>> unsigned digests is exposed to an alteration attack at rest or in
>> transit, thus will not provide a robust end-to-end integrity
>> guarantee.
>> 
>> Therefore, tree root digests must be cryptographically signed to be
>> properly protected in these environments. Verifying that signature
>> should be done infrequently relative to reading a file's content.
> 
> I'm not disagreeing there has to be a way for the relying party to
> trust the root hash.
> 
>>> and what is nice to have to speed up the verification
>>> process.  The choice for the latter is cache or reconstruct
>>> depending on the resources available.  If the tree gets cached on
>>> the server, that would be a server implementation detail invisible
>>> to the client.
>> 
>> We assume that storage targets (for block or file) are not trusted.
>> Therefore storage clients cannot rely on intermediate results (eg,
>> middle nodes in a Merkle tree) unless those results are generated
>> within the client's trust envelope.
> 
> Yes, they can ... because supplied nodes can be verified.  That's the
> whole point of a merkle tree.  As long as I'm sure of the root hash I
> can verify all the rest even if supplied by an untrusted source.  If
> you consider a simple merkle tree covering 4 blocks:
> 
>       R
>     /   \
>  H11     H12
>  / \     / \
> H21 H22 H23 H24
> |    |   |   |
> B1   B2  B3  B4
> 
> Assume I have the verified root hash R.  If you supply B3 you also
> supply H24 and H11 as proof.  I verify by hashing B3 to produce H23
> then hash H23 and H24 to produce H12 and if H12 and your supplied H11
> hash to R the tree is correct and the B3 you supplied must likewise be
> correct.

I'm not sure what you are proving here. Obviously this has to work
in order for a client to reconstruct the file's Merkle tree given
only R and the file content.

It's the construction of the tree and verification of the hashes that
are potentially expensive. The point of caching intermediate hashes
is so that the client verifies them as few times as possible.  I
don't see value in caching those hashes on an untrusted server --
the client will have to reverify them anyway, and there will be no
savings.

Cache once, as close as you can to where the data will be used.


>> So: if the storage target is considered inside the client's trust
>> envelope, it can cache or store durably any intermediate parts of
>> the verification process. If not, the network and file storage is
>> considered untrusted, and the client has to rely on nothing but the
>> signed digest of the tree root.
>> 
>> We could build a scheme around, say, fscache, that might save the
>> intermediate results durably and locally.
> 
> I agree we want caching on the client, but we can always page in from
> the remote as long as we page enough to verify up to R, so we're always
> sure the remote supplied genuine information.

Agreed.


>>>> For this reason, the idea was to save only the signature of the
>>>> tree's root on durable storage. The client would retrieve that
>>>> signature possibly at open time, and reconstruct the tree at that
>>>> time.
>>> 
>>> Right that's the integrity data you must have.
>>> 
>>>> Or the tree could be partially constructed on-demand at the time
>>>> each unit is to be checked (say, as part of 2. above).
>>> 
>>> Whether it's reconstructed or cached can be an implementation
>>> detail. You clearly have to reconstruct once, but whether you have
>>> to do it again depends on the memory available for caching and all
>>> the other resource calls in the system.
>>> 
>>>> The client would have to reconstruct that tree again if memory
>>>> pressure caused some or all of the tree to be evicted, so perhaps
>>>> an on-demand mechanism is preferable.
>>> 
>>> Right, but I think that's implementation detail.  Probably what we
>>> need is a way to get the log(N) verification hashes from the server
>>> and it's up to the client whether it caches them or not.
>> 
>> Agreed, these are implementation details. But see above about the
>> trustworthiness of the intermediate hashes. If they are conveyed
>> on an untrusted network, then they can't be trusted either.
> 
> Yes, they can, provided enough of them are asked for to verify.  If you
> look at the simple example above, suppose I have cached H11 and H12,
> but I've lost the entire H2X layer.  I want to verify B3 so I also ask
> you for your copy of H24.  Then I generate H23 from B3 and Hash H23 and
> H24.  If this doesn't hash to H12 I know either you supplied me the
> wrong block or lied about H24.  However, if it all hashes correctly I
> know you supplied me with both the correct B3 and the correct H24.

My point is there is a difference between a trusted cache and an
untrusted cache. I argue there is not much value in a cache where
the hashes have to be verified again.


>>>>> There are also a load of minor things like how we get the head
>>>>> hash, which must be presented and verified ahead of time for
>>>>> each of the above 3.
>>>> 
>>>> Also, changes to a file's content and its tree signature are not
>>>> atomic. If a file is mutable, then there is the period between
>>>> when the file content has changed and when the signature is
>>>> updated. Some discussion of how a client is to behave in those
>>>> situations will be necessary.
>>> 
>>> For IMA, if you write to a checked file, it gets rechecked the next
>>> time the gate (open/exec/mmap) is triggered.  This means you must
>>> complete the update and have the new integrity data in-place before
>>> triggering the check.  I think this could apply equally to a Merkle
>>> tree based system.  It's a sort of Doctor, Doctor it hurts when I
>>> do this situation.
>> 
>> I imagine it's a common situation where a "yum update" process is
>> modifying executables while clients are running them. To prevent
>> a read from pulling refreshed content before the new tree root is
>> available, it would have to block temporarily until the verification
>> process succeeds with the updated tree root.
> 
> No ... it's not.  Yum specifically worries about that today because if
> you update running binaries, it causes a crash.  Yum constructs the
> entire new file then atomically links it into place and deletes the old
> inode to prevent these crashes.  It never allows you to get into the
> situation where you can execute something that will be modified. 
> That's also why you have to restart stuff after a yum update because if
> you didn't it would still be attached to the deleted inode.

Fair enough.

--
Chuck Lever
chucklever@gmail.com
Chuck Lever Aug. 12, 2020, 2:18 p.m. UTC | #31
> On Aug 11, 2020, at 5:03 PM, James Morris <jmorris@namei.org> wrote:
> 
> On Sat, 8 Aug 2020, Chuck Lever wrote:
> 
>> My interest is in code integrity enforcement for executables stored
>> in NFS files.
>> 
>> My struggle with IPE is that due to its dependence on dm-verity, it
>> does not seem to be able to protect content that is stored separately
>> from its execution environment and accessed via a file access
>> protocol (FUSE, SMB, NFS, etc).
> 
> It's not dependent on DM-Verity, that's just one possible integrity 
> verification mechanism, and one of two supported in this initial 
> version. The other is 'boot_verified' for a verified or otherwise trusted 
> rootfs. Future versions will support FS-Verity, at least.
> 
> IPE was designed to be extensible in this way, with a strong separation of 
> mechanism and policy.

I got that, but it looked to me like the whole system relied on having
access to the block device under the filesystem. That's not possible
for a remote filesystem like Ceph or NFS.

I'm happy to take a closer look if someone can point me the right way.


--
Chuck Lever
chucklever@gmail.com
Chuck Lever Aug. 12, 2020, 2:45 p.m. UTC | #32
> On Aug 11, 2020, at 11:32 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
>>> On Aug 11, 2020, at 1:43 AM, James Bottomley
>>> <James.Bottomley@HansenPartnership.com> wrote:
>>> On Mon, 2020-08-10 at 19:36 -0400, Chuck Lever wrote:
> [...]
>>>> Thanks for the help! I just want to emphasize that documentation
>>>> (eg, a specification) will be critical for remote filesystems.
>>>> 
>>>> If any of this is to be supported by a remote filesystem, then we
>>>> need an unencumbered description of the new metadata format
>>>> rather than code. GPL-encumbered formats cannot be contributed to
>>>> the NFS standard, and are probably difficult for other
>>>> filesystems that are not Linux-native, like SMB, as well.
>>> 
>>> I don't understand what you mean by GPL encumbered formats.  The
>>> GPL is a code licence not a data or document licence.
>> 
>> IETF contributions occur under a BSD-style license incompatible
>> with the GPL.
>> 
>> https://trustee.ietf.org/trust-legal-provisions.html
>> 
>> Non-Linux implementers (of OEM storage devices) rely on such
>> standards processes to indemnify them against licensing claims.
> 
> Well, that simply means we won't be contributing the Linux
> implementation, right?

At the present time, there is nothing but the Linux implementation.
There's no English description, there's no specification of the
formats, the format is described only by source code.

The only way to contribute current IMA metadata formats to an open
standards body like the IETF is to look at encumbered code first.
We would effectively be contributing an implementation in this case.

(I'm not saying the current formats should or should not be
contributed; merely that there is a legal stumbling block to doing
so that can be avoided for newly defined formats).


> Well, let me put the counterpoint: I can write a book about how linux
> device drivers work (which includes describing the data formats)


Our position is that someone who reads that book and implements those
formats under a non-GPL-compatible license would be in breach of the
GPL.

The point of the standards process is to indemnify implementing
and distributing under _any_ license what has been published by the
standards body. That legally enables everyone to use the published
protocol/format in their own code no matter how it happens to be
licensed.


> Fine, good grief, people who take a sensible view of this can write the
> data format down and publish it under any licence you like then you can
> pick it up again safely.


That's what I proposed. Write it down under the IETF Trust legal
provisions license. And I volunteered to do that.

All I'm saying is that description needs to come before code.


--
Chuck Lever
chucklever@gmail.com
James Bottomley Aug. 12, 2020, 3:42 p.m. UTC | #33
On Wed, 2020-08-12 at 09:56 -0400, Chuck Lever wrote:
> > On Aug 11, 2020, at 2:28 PM, James Bottomley
> > <James.Bottomley@HansenPartnership.com> wrote:
> > 
> > On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
> > > Mimi's earlier point is that any IMA metadata format that
> > > involves unsigned digests is exposed to an alteration attack at
> > > rest or in transit, thus will not provide a robust end-to-end
> > > integrity guarantee.
> > 
> > I don't believe that is Mimi's point, because it's mostly not
> > correct: the xattr mechanism does provide this today.  The point is
> > the mechanism we use for storing IMA hashes and signatures today is
> > xattrs because they have robust security properties for local
> > filesystems that the kernel enforces.  This use goes beyond IMA,
> > selinux labels for instance use this property as well.
> 
> I don't buy this for a second. If storing a security label in a
> local xattr is so secure, we wouldn't have any need for EVM.

What don't you buy?  Security xattrs can only be updated by local root.
If you trust local root, the xattr mechanism is fine ... it's the only
one a lot of LSMs use, for instance.  If you don't trust local root or
worry about offline backups, you use EVM.  A thing isn't secure or
insecure, it depends on the threat model.  However, if you don't trust
the NFS server it doesn't matter whether you do or don't trust local
root, you can't believe the contents of the xattr.

> > What I think you're saying is that NFS can't provide the robust
> > security for xattrs we've been relying on, so you need some other
> > mechanism for storing them.
> 
> For NFS, there's a network traversal which is an attack surface.
> 
> A local xattr can be attacked as well: a device or bus malfunction
> can corrupt the content of an xattr, or a privileged user can modify
> it.
> 
> How does that metadata get from the software provider to the end
> user? It's got to go over a network, stored in various ways, some
> of which will not be trusted. To attain an unbroken chain of
> provenance, that metadata has to be signed.
> 
> I don't think the question is the storage mechanism, but rather the
> protection mechanism. Signing the metadata protects it in all of
> these cases.

I think we're saying about the same thing.  For most people the
security mechanism of local xattrs is sufficient.  If you're paranoid,
you don't believe it is and you use EVM.

> > I think Mimi's other point is actually that IMA uses a flat hash
> > which we derive by reading the entire file and then watching for
> > mutations. Since you cannot guarantee we get notice of mutation
> > with NFS, the entire IMA mechanism can't really be applied in its
> > current form and we have to resort to chunk at a time verifications
> > that a Merkle tree would provide.
> 
> I'm not sure what you mean by this. An NFS client relies on
> notification of mutation to maintain the integrity of its cache of
> NFS file content, and it's done that since the 1980s.

Mutation detection is part of the current IMA security model.  If IMA
sees a file mutate it has to be rehashed the next time it passes the
gate.  If we can't trust the NFS server, we can't trust the NFS
mutation notification and we have to have a different mechanism to
check the file.

> In addition to examining a file's mtime and ctime as maintained by
> the NFS server, a client can rely on the file's NFSv4 change
> attribute or an NFSv4 delegation.

And that's secure in the face of a malicious or compromised server?

The bottom line is still, I think, that we can't use linear hashes with
an open/exec/mmap gate with NFS and we have to move to chunk-at-a-time
verification like that provided by a Merkle tree.

James
James Bottomley Aug. 12, 2020, 3:51 p.m. UTC | #34
On Wed, 2020-08-12 at 10:15 -0400, Chuck Lever wrote:
> > On Aug 11, 2020, at 11:53 AM, James Bottomley
> > <James.Bottomley@HansenPartnership.com> wrote:
> > 
> > On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
[...]
> > > > 
> > > > and what is nice to have to speed up the verification
> > > > process.  The choice for the latter is cache or reconstruct
> > > > depending on the resources available.  If the tree gets cached
> > > > on the server, that would be a server implementation detail
> > > > invisible to the client.
> > > 
> > > We assume that storage targets (for block or file) are not
> > > trusted. Therefore storage clients cannot rely on intermediate
> > > results (eg, middle nodes in a Merkle tree) unless those results
> > > are generated within the client's trust envelope.
> > 
> > Yes, they can ... because supplied nodes can be verified.  That's
> > the whole point of a merkle tree.  As long as I'm sure of the root
> > hash I can verify all the rest even if supplied by an untrusted
> > source.  If you consider a simple merkle tree covering 4 blocks:
> > 
> >       R
> >     /   \
> >  H11     H12
> >  / \     / \
> > H21 H22 H23 H24
> > >    |   |   |
> > 
> > B1   B2  B3  B4
> > 
> > Assume I have the verified root hash R.  If you supply B3 you also
> > supply H24 and H11 as proof.  I verify by hashing B3 to produce H23
> > then hash H23 and H24 to produce H12 and if H12 and your supplied
> > H11 hash to R the tree is correct and the B3 you supplied must
> > likewise be correct.
> 
> I'm not sure what you are proving here. Obviously this has to work
> in order for a client to reconstruct the file's Merkle tree given
> only R and the file content.

You implied the server can't be trusted to generate the Merkle tree.
I'm showing above that it can, because of the tree-path-based
verification.

> It's the construction of the tree and verification of the hashes that
> are potentially expensive. The point of caching intermediate hashes
> is so that the client verifies them as few times as possible.  I
> don't see value in caching those hashes on an untrusted server --
> the client will have to reverify them anyway, and there will be no
> savings.

I'm not making any claim about server caching, I'm just saying the
client can request pieces of the tree from the server without having to
reconstruct the whole thing itself because it can verify their
correctness.
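
A generic version of that path check might look like the sketch below.
It assumes SHA-256, a binary tree, and a parent computed as the hash of
left child followed by right child; this is not the on-disk format of
fs-verity or dm-verity, and root_from_proof is a hypothetical helper,
not an existing API.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 is an assumption; any collision-resistant hash would do."""
    return hashlib.sha256(data).digest()


def root_from_proof(block: bytes, index: int, siblings: list) -> bytes:
    """Recompute the root from one block plus its log2(N) sibling hashes.

    `index` is the 0-based position of the block among the leaves and
    `siblings` lists the sibling hash at each level, leaf level first.
    """
    node = h(block)
    for sibling in siblings:
        if index % 2 == 0:              # we are the left child at this level
            node = h(node + sibling)
        else:                           # we are the right child at this level
            node = h(sibling + node)
        index //= 2                     # move up to the parent's position
    return node


# Trusted side: build the four-block tree from the example and publish R.
blocks = [b"B1", b"B2", b"B3", b"B4"]
leaves = [h(b) for b in blocks]         # H21, H22, H23, H24
h11 = h(leaves[0] + leaves[1])
h12 = h(leaves[2] + leaves[3])
root = h(h11 + h12)                     # R, the only value the client trusts

# Untrusted server supplies B3 (index 2) together with H24 and H11.
proof_for_b3 = [leaves[3], h11]
genuine = root_from_proof(b"B3", 2, proof_for_b3) == root
forged = root_from_proof(b"not B3", 2, proof_for_b3) == root
```

The client never rebuilds the whole tree: it hashes the one block it was
handed, folds in the supplied siblings, and compares the result with R.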

> Cache once, as close as you can to where the data will be used.
> 
> 
> > > So: if the storage target is considered inside the client's trust
> > > envelope, it can cache or store durably any intermediate parts of
> > > the verification process. If not, the network and file storage is
> > > considered untrusted, and the client has to rely on nothing but
> > > the signed digest of the tree root.
> > > 
> > > We could build a scheme around, say, fscache, that might save the
> > > intermediate results durably and locally.
> > 
> > I agree we want caching on the client, but we can always page in
> > from the remote as long as we page enough to verify up to R, so
> > we're always sure the remote supplied genuine information.
> 
> Agreed.
> 
> 
> > > > > For this reason, the idea was to save only the signature of
> > > > > the tree's root on durable storage. The client would retrieve
> > > > > that signature possibly at open time, and reconstruct the
> > > > > tree at that time.
> > > > 
> > > > Right that's the integrity data you must have.
> > > > 
> > > > > Or the tree could be partially constructed on-demand at the
> > > > > time each unit is to be checked (say, as part of 2. above).
> > > > 
> > > > Whether it's reconstructed or cached can be an implementation
> > > > detail. You clearly have to reconstruct once, but whether you
> > > > have to do it again depends on the memory available for caching
> > > > and all the other resource calls in the system.
> > > > 
> > > > > The client would have to reconstruct that tree again if
> > > > > memory pressure caused some or all of the tree to be evicted,
> > > > > so perhaps an on-demand mechanism is preferable.
> > > > 
> > > > Right, but I think that's implementation detail.  Probably what
> > > > we need is a way to get the log(N) verification hashes from the
> > > > server and it's up to the client whether it caches them or not.
> > > 
> > > Agreed, these are implementation details. But see above about the
> > > trustworthiness of the intermediate hashes. If they are conveyed
> > > on an untrusted network, then they can't be trusted either.
> > 
> > Yes, they can, provided enough of them are asked for to verify.  If
> > you look at the simple example above, suppose I have cached H11 and
> > H12, but I've lost the entire H2X layer.  I want to verify B3 so I
> > also ask you for your copy of H24.  Then I generate H23 from B3 and
> > Hash H23 and H24.  If this doesn't hash to H12 I know either you
> > supplied me the wrong block or lied about H24.  However, if it all
> > hashes correctly I know you supplied me with both the correct B3
> > and the correct H24.
> 
> My point is there is a difference between a trusted cache and an
> untrusted cache. I argue there is not much value in a cache where
> the hashes have to be verified again.

And my point isn't about caching, it's about where the tree comes from.
I claim and you agree the client can get the tree from the server a
piece at a time (because it can path verify it) and doesn't have to
generate it itself.  How much of the tree the client has to store and
whether the server caches, reads it in from somewhere or reconstructs
it is an implementation detail.
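
To make the path verification concrete, here is a minimal sketch of the
4-block example. It assumes SHA-256, a binary tree and a power-of-two
block count; the helper names (build_tree, proof_for, verify) are
illustrative only and are not taken from IMA, fs-verity or NFS.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Return all levels, leaf hashes first, root last (power-of-two block count assumed)."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def proof_for(levels, index):
    """Sibling hashes needed to walk from leaf `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(block, proof, root):
    """Recompute the path from `block` to the root using the supplied siblings."""
    node = h(block)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Untrusted server side: holds B1..B4 and the whole tree.
blocks = [b"B1", b"B2", b"B3", b"B4"]
levels = build_tree(blocks)
root = levels[-1][0]              # R, the only value the client has to trust

# Client asks for B3 (index 2); the server supplies H24 and H11 as proof.
proof = proof_for(levels, 2)
ok = verify(blocks[2], proof, root)
wrong_block = verify(b"B3-modified", proof, root)
wrong_h24 = verify(blocks[2], [(h(b"bogus"), False), proof[1]], root)
```

Only `root` comes from a trusted source here. A wrong block or a wrong
H24 makes the recomputed root differ, so the client can reject the
server's data without having built the tree itself.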

James
Deven Bowers Aug. 12, 2020, 5:07 p.m. UTC | #35
On 8/12/2020 7:18 AM, Chuck Lever wrote:
> 
> 
>> On Aug 11, 2020, at 5:03 PM, James Morris <jmorris@namei.org> wrote:
>>
>> On Sat, 8 Aug 2020, Chuck Lever wrote:
>>
>>> My interest is in code integrity enforcement for executables stored
>>> in NFS files.
>>>
>>> My struggle with IPE is that due to its dependence on dm-verity, it
>>> does not seem to be able to protect content that is stored separately
>>> from its execution environment and accessed via a file access
>>> protocol (FUSE, SMB, NFS, etc).
>>
>> It's not dependent on DM-Verity, that's just one possible integrity
>> verification mechanism, and one of two supported in this initial
>> version. The other is 'boot_verified' for a verified or otherwise trusted
>> rootfs. Future versions will support FS-Verity, at least.
>>
>> IPE was designed to be extensible in this way, with a strong separation of
>> mechanism and policy.
> 
> I got that, but it looked to me like the whole system relied on having
> access to the block device under the filesystem. That's not possible
> for a remote filesystem like Ceph or NFS.

Block device structure, no (though that's what is currently used, to be
fair). It really has a hard dependency on the file structure,
specifically the ability to determine whether that file structure can be
used to navigate back to the integrity claim provided by the mechanism.

In the current world of IPE, the integrity claim is the root-hash or 
root-hash-signature on the block device, provided by dm-verity's 
setsecurity hooks (also introduced in this series).

> 
> I'm happy to take a closer look if someone can point me the right way.
> 

Sure. In the 2nd patch, look at the file
"security/ipe/ipe-property.h"; it defines the methods a mechanism must
implement to work with IPE. Those methods are passed the engine
context, which is defined as:

  struct ipe_engine_ctx {
  	enum ipe_op op;
  	enum ipe_hook hook;
  	const struct file *file;
  	const char *audit_pathname;
  	const struct ipe_bdev_blob *sec_bdev;
  };

Now, if the security blob existed for the block_device, it would be
in sec_bdev, but that may be NULL as well, to be fair.

If you want a more worked example of how integration works, patches 8
and 10 introduce the dm-verity properties mentioned in this patch.
Chuck Lever Aug. 13, 2020, 2:21 p.m. UTC | #36
> On Aug 12, 2020, at 11:42 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Wed, 2020-08-12 at 09:56 -0400, Chuck Lever wrote:
>>> On Aug 11, 2020, at 2:28 PM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>>> 
>>> On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
>>>> Mimi's earlier point is that any IMA metadata format that
>>>> involves unsigned digests is exposed to an alteration attack at
>>>> rest or in transit, thus will not provide a robust end-to-end
>>>> integrity guarantee.
>>> 
>>> I don't believe that is Mimi's point, because it's mostly not
>>> correct: the xattr mechanism does provide this today.  The point is
>>> the mechanism we use for storing IMA hashes and signatures today is
>>> xattrs because they have robust security properties for local
>>> filesystems that the kernel enforces.  This use goes beyond IMA,
>>> selinux labels for instance use this property as well.
>> 
>> I don't buy this for a second. If storing a security label in a
>> local xattr is so secure, we wouldn't have any need for EVM.
> 
> What don't you buy?  Security xattrs can only be updated by local root.
> If you trust local root, the xattr mechanism is fine ... it's the only
> one a lot of LSMs use, for instance.  If you don't trust local root or
> worry about offline backups, you use EVM.  A thing isn't secure or
> insecure, it depends on the threat model.  However, if you don't trust
> the NFS server it doesn't matter whether you do or don't trust local
> root, you can't believe the contents of the xattr.
> 
>>> What I think you're saying is that NFS can't provide the robust
>>> security for xattrs we've been relying on, so you need some other
>>> mechanism for storing them.
>> 
>> For NFS, there's a network traversal which is an attack surface.
>> 
>> A local xattr can be attacked as well: a device or bus malfunction
>> can corrupt the content of an xattr, or a privileged user can modify
>> it.
>> 
>> How does that metadata get from the software provider to the end
>> user? It's got to go over a network, stored in various ways, some
>> of which will not be trusted. To attain an unbroken chain of
>> provenance, that metadata has to be signed.
>> 
>> I don't think the question is the storage mechanism, but rather the
>> protection mechanism. Signing the metadata protects it in all of
>> these cases.
> 
> I think we're saying about the same thing.

Roughly.


> For most people the
> security mechanism of local xattrs is sufficient.  If you're paranoid,
> you don't believe it is and you use EVM.

When IMA metadata happens to be stored in local filesystems in
a trusted xattr, it's going to enjoy the protection you describe
without needing the addition of a cryptographic signature.

However, that metadata doesn't live its whole life there. It
can reside in a tar file, it can cross a network, it can live
on a back-up tape. I think we agree that any time that metadata
is in transit or at rest outside of a Linux local filesystem, it
is exposed.

Thus I'm interested in a metadata protection mechanism that does
not rely on the security characteristics of a particular storage
container. For me, a cryptographic signature fits that bill
nicely.


>>> I think Mimi's other point is actually that IMA uses a flat hash
>>> which we derive by reading the entire file and then watching for
>>> mutations. Since you cannot guarantee we get notice of mutation
>>> with NFS, the entire IMA mechanism can't really be applied in its
>>> current form and we have to resort to chunk at a time verifications
>>> that a Merkle tree would provide.
>> 
>> I'm not sure what you mean by this. An NFS client relies on
>> notification of mutation to maintain the integrity of its cache of
>> NFS file content, and it's done that since the 1980s.
> 
> Mutation detection is part of the current IMA security model.  If IMA
> sees a file mutate it has to be rehashed the next time it passes the
> gate.  If we can't trust the NFS server, we can't trust the NFS
> mutation notification and we have to have a different mechanism to
> check the file.

When an NFS server lies about mtime and ctime, then NFS is completely
broken. Untrusted NFS server doesn't mean "broken behavior" -- I
would think that local filesystems will have the same problem if
they can't trust a local block device to store filesystem metadata
like indirect blocks and timestamps.

It's not clear to me that IMA as currently implemented can protect
against broken storage devices or incorrect filesystem behavior.


>> In addition to examining a file's mtime and ctime as maintained by
>> the NFS server, a client can rely on the file's NFSv4 change
>> attribute or an NFSv4 delegation.
> 
> And that's secure in the face of a malicious or compromised server?
> 
> The bottom line is still, I think we can't use linear hashes with an
> open/exec/mmap gate with NFS and we have to move to chunk at a time
> verification like that provided by a Merkle tree.

That's fine until we claim that remote filesystems require one form of
metadata and local filesystems use some other form.

To guarantee an unbroken chain of provenance, everyone has to use the
same portable metadata format that is signed once by the content creator.
That's essentially why I believe the Merkle-based metadata format must
require that the tree root is signed.


--
Chuck Lever
chucklever@gmail.com
James Bottomley Aug. 13, 2020, 2:42 p.m. UTC | #37
On Thu, 2020-08-13 at 10:21 -0400, Chuck Lever wrote:
> > On Aug 12, 2020, at 11:42 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
[...]
> > For most people the security mechanism of local xattrs is
> > sufficient.  If you're paranoid, you don't believe it is and you
> > use EVM.
> 
> When IMA metadata happens to be stored in local filesystems in
> a trusted xattr, it's going to enjoy the protection you describe
> without needing the addition of a cryptographic signature.
> 
> However, that metadata doesn't live its whole life there. It
> can reside in a tar file, it can cross a network, it can live
> on a back-up tape. I think we agree that any time that metadata
> is in transit or at rest outside of a Linux local filesystem, it
> is exposed.
> 
> Thus I'm interested in a metadata protection mechanism that does
> not rely on the security characteristics of a particular storage
> container. For me, a cryptographic signature fits that bill
> nicely.

Sure, but one of the points about IMA is a separation of mechanism from
policy.  Signed hashes (called appraisal in IMA terms) is just one
policy you can decide to require or not or even make it conditional on
other things.

> > > > I think Mimi's other point is actually that IMA uses a flat
> > > > hash which we derive by reading the entire file and then
> > > > watching for mutations. Since you cannot guarantee we get
> > > > notice of mutation with NFS, the entire IMA mechanism can't
> > > > really be applied in its current form and we have to resort to
> > > > chunk at a time verifications that a Merkle tree would provide.
> > > 
> > > I'm not sure what you mean by this. An NFS client relies on
> > > notification of mutation to maintain the integrity of its cache
> > > of NFS file content, and it's done that since the 1980s.
> > 
> > Mutation detection is part of the current IMA security model.  If
> > IMA sees a file mutate it has to be rehashed the next time it
> > passes the gate.  If we can't trust the NFS server, we can't trust
> > the NFS mutation notification and we have to have a different
> > mechanism to check the file.
> 
> When an NFS server lies about mtime and ctime, then NFS is completely
> broken. Untrusted NFS server doesn't mean "broken behavior" -- I
> would think that local filesystems will have the same problem if
> they can't trust a local block device to store filesystem metadata
> like indirect blocks and timestamps.
> 
> It's not clear to me that IMA as currently implemented can protect
> against broken storage devices or incorrect filesystem behavior.

IMA doesn't really care about the storage.  The gate check will fail if
the storage corrupts the file because the hashes won't match.  The
mechanism for modification notification is the province of the
filesystem and there are definitely some which don't do it (or other fs
features) correctly and thus can't use IMA.
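
As a rough illustration of the flat-hash gate check (a sketch only, not
IMA's actual code; SHA-256 and the function names are assumptions made
for the example):

```python
import hashlib
import hmac

def flat_digest(content: bytes) -> bytes:
    """Hash the whole file content in one pass, as a flat (linear) hash does."""
    return hashlib.sha256(content).digest()

def gate_check(content: bytes, expected_digest: bytes) -> bool:
    """Allow access only if the content still hashes to the reference digest."""
    return hmac.compare_digest(flat_digest(content), expected_digest)

# Reference digest recorded while the file was known to be good.
reference = flat_digest(b"#!/bin/sh\necho hello\n")

intact = gate_check(b"#!/bin/sh\necho hello\n", reference)
corrupted = gate_check(b"#!/bin/sh\necho hellp\n", reference)
```

The check reads the entire content every time it runs, which is why it
depends on reliable mutation notification to know when a rehash is
needed.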

> > > In addition to examining a file's mtime and ctime as maintained
> > > by the NFS server, a client can rely on the file's NFSv4 change
> > > attribute or an NFSv4 delegation.
> > 
> > And that's secure in the face of a malicious or compromised server?
> > 
> > The bottom line is still, I think we can't use linear hashes with
> > an open/exec/mmap gate with NFS and we have to move to chunk at a
> > time verification like that provided by a Merkle tree.
> 
> That's fine until we claim that remote filesystems require one form
> of metadata and local filesystems use some other form.
> 
> To guarantee an unbroken chain of provenance, everyone has to use the
> same portable metadata format that is signed once by the content
> creator. That's essentially why I believe the Merkle-based metadata
> format must require that the tree root is signed.

Well, no, that would be optional policy.  We should certainly support
signed head hashes and require it if the policy said so, but we
shouldn't enforce it without the policy.

Suppose I'm a cloud service provider exporting files over NFS on the
control (private) network.  I use IMA to measure untrusted tenants to
get a feel for what they're doing, but since I control the NFS server,
the client and the private network, I wouldn't feel the requirement to
have signed hashes because I trust other mechanisms for the security.

James
Chuck Lever Aug. 13, 2020, 2:42 p.m. UTC | #38
> On Aug 12, 2020, at 11:51 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Wed, 2020-08-12 at 10:15 -0400, Chuck Lever wrote:
>>> On Aug 11, 2020, at 11:53 AM, James Bottomley
>>> <James.Bottomley@HansenPartnership.com> wrote:
>>> 
>>> On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
> [...]
>>>>> 
>>>>> and what is nice to have to speed up the verification
>>>>> process.  The choice for the latter is cache or reconstruct
>>>>> depending on the resources available.  If the tree gets cached
>>>>> on the server, that would be a server implementation detail
>>>>> invisible to the client.
>>>> 
>>>> We assume that storage targets (for block or file) are not
>>>> trusted. Therefore storage clients cannot rely on intermediate
>>>> results (eg, middle nodes in a Merkle tree) unless those results
>>>> are generated within the client's trust envelope.
>>> 
>>> Yes, they can ... because supplied nodes can be verified.  That's
>>> the whole point of a merkle tree.  As long as I'm sure of the root
>>> hash I can verify all the rest even if supplied by an untrusted
>>> source.  If you consider a simple merkle tree covering 4 blocks:
>>> 
>>>        R
>>>      /   \
>>>   H11     H12
>>>   / \     / \
>>> H21 H22 H23 H24
>>>  |   |   |   |
>>>  B1  B2  B3  B4
>>> 
>>> Assume I have the verified root hash R.  If you supply B3 you also
>>> supply H24 and H11 as proof.  I verify by hashing B3 to produce H23
>>> then hash H23 and H24 to produce H12 and if H12 and your supplied
>>> H11 hash to R the tree is correct and the B3 you supplied must
>>> likewise be correct.
>> 
>> I'm not sure what you are proving here. Obviously this has to work
>> in order for a client to reconstruct the file's Merkle tree given
>> only R and the file content.
> 
> You implied the server can't be trusted to generate the Merkle tree.
> I'm showing above it can because of the tree path based verification.

What I was implying is that clients can't trust intermediate Merkle
tree content that is not also signed. So far we are talking about
signing only the tree root.

The storage server can store the tree durably, but if the intermediate
parts of the tree are not signed, the client has to verify them anyway,
and that reduces the value of storing potentially large data structures.


>> It's the construction of the tree and verification of the hashes that
>> are potentially expensive. The point of caching intermediate hashes
>> is so that the client verifies them as few times as possible.  I
>> don't see value in caching those hashes on an untrusted server --
>> the client will have to reverify them anyway, and there will be no
>> savings.
> 
> I'm not making any claim about server caching, I'm just saying the
> client can request pieces of the tree from the server without having to
> reconstruct the whole thing itself because it can verify their
> correctness.

To be clear, my concern is about how much of the tree might be stored
in a Merkle-based metadata format. I just don't see that it has much
value to store more than the signed tree root, because the client will
have to reconstitute or verify some tree contents on most every read.

For sufficiently large files, the tree itself can be larger than what
can be stored in an xattr. This is the same problem that fs-verity
faces. And, as I stated earlier, xattr objects are read in their
entirety, they can't be seeked into or read piecemeal.
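
A back-of-the-envelope sketch of that size problem. It assumes 4096-byte
units, 32-byte (SHA-256) digests and an fs-verity-like layout where each
hash block holds block_size / digest_size digests; padding of hash
blocks is ignored. XATTR_SIZE_MAX is the Linux VFS ceiling for a single
xattr value; individual filesystems may allow less.

```python
XATTR_SIZE_MAX = 65536  # Linux VFS limit on one xattr value, in bytes

def merkle_tree_bytes(file_size, block_size=4096, digest_size=32):
    """Bytes of digests stored below the root (unpadded), summed over all levels."""
    fanout = block_size // digest_size       # digests per hash block
    nodes = -(-file_size // block_size)      # ceil: number of data blocks
    total = 0
    while nodes > 1:
        total += nodes * digest_size         # one digest per block at this level
        nodes = -(-nodes // fanout)          # hash blocks needed to hold them
    return total

one_mib = merkle_tree_bytes(1 << 20)
eight_mib = merkle_tree_bytes(8 << 20)
one_gib = merkle_tree_bytes(1 << 30)
```

Under these assumptions the tree for a 1 MiB file is 8256 bytes, but at
8 MiB it already exceeds XATTR_SIZE_MAX, and a 1 GiB file needs about
8 MiB of digests. The root digest stays at 32 bytes in every case.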

What it seemed to me that you were suggesting was an offloaded cache
of the Merkle tree. Either the whole tree is stored on the storage
server, or the storage server provides a service that reconstitutes
that tree on behalf of clients. (Please correct me if I misunderstood).
I just don't think that will be practicable or provide the kind of
benefit you might want.


>> Cache once, as close as you can to where the data will be used.
>> 
>> 
>>>> So: if the storage target is considered inside the client's trust
>>>> envelope, it can cache or store durably any intermediate parts of
>>>> the verification process. If not, the network and file storage is
>>>> considered untrusted, and the client has to rely on nothing but
>>>> the signed digest of the tree root.
>>>> 
>>>> We could build a scheme around, say, fscache, that might save the
>>>> intermediate results durably and locally.
>>> 
>>> I agree we want caching on the client, but we can always page in
>>> from the remote as long as we page enough to verify up to R, so
>>> we're always sure the remote supplied genuine information.
>> 
>> Agreed.
>> 
>> 
>>>>>> For this reason, the idea was to save only the signature of
>>>>>> the tree's root on durable storage. The client would retrieve
>>>>>> that signature possibly at open time, and reconstruct the
>>>>>> tree at that time.
>>>>> 
>>>>> Right that's the integrity data you must have.
>>>>> 
>>>>>> Or the tree could be partially constructed on-demand at the
>>>>>> time each unit is to be checked (say, as part of 2. above).
>>>>> 
>>>>> Whether it's reconstructed or cached can be an implementation
>>>>> detail. You clearly have to reconstruct once, but whether you
>>>>> have to do it again depends on the memory available for caching
>>>>> and all the other resource calls in the system.
>>>>> 
>>>>>> The client would have to reconstruct that tree again if
>>>>>> memory pressure caused some or all of the tree to be evicted,
>>>>>> so perhaps an on-demand mechanism is preferable.
>>>>> 
>>>>> Right, but I think that's implementation detail.  Probably what
>>>>> we need is a way to get the log(N) verification hashes from the
>>>>> server and it's up to the client whether it caches them or not.
>>>> 
>>>> Agreed, these are implementation details. But see above about the
>>>> trustworthiness of the intermediate hashes. If they are conveyed
>>>> on an untrusted network, then they can't be trusted either.
>>> 
>>> Yes, they can, provided enough of them are asked for to verify.  If
>>> you look at the simple example above, suppose I have cached H11 and
>>> H12, but I've lost the entire H2X layer.  I want to verify B3 so I
>>> also ask you for your copy of H24.  Then I generate H23 from B3 and
>>> Hash H23 and H24.  If this doesn't hash to H12 I know either you
>>> supplied me the wrong block or lied about H24.  However, if it all
>>> hashes correctly I know you supplied me with both the correct B3
>>> and the correct H24.
>> 
>> My point is there is a difference between a trusted cache and an
>> untrusted cache. I argue there is not much value in a cache where
>> the hashes have to be verified again.
> 
> And my point isn't about caching, it's about where the tree comes from.
> I claim and you agree the client can get the tree from the server a
> piece at a time (because it can path verify it) and doesn't have to
> generate it itself.

OK, let's focus on where the tree comes from. It is certainly
possible to build a protocol to exchange parts of a Merkle tree. The
question is how it might be stored on the server. There are some
underlying assumptions about the metadata storage mechanism that
should be stated up front.

Current forms of IMA metadata are limited in size and stored in a
container that is read and written in a single operation. If we stick
with that container format, I don't see a way to store a Merkle tree
in there for all file sizes.

Thus it seems to me that we cannot begin to consider the tree-on-the-
server model unless there is a proposed storage mechanism for that
whole tree. Otherwise, the client must have the primary role in
unpacking and verifying the tree.

Storing only the tree root in the metadata means the metadata format
is nicely bounded in size.


> How much of the tree the client has to store and
> whether the server caches, reads it in from somewhere or reconstructs
> it is an implementation detail.

Sure.


--
Chuck Lever
chucklever@gmail.com
Chuck Lever Aug. 13, 2020, 2:56 p.m. UTC | #39
> On Aug 13, 2020, at 10:42 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Thu, 2020-08-13 at 10:21 -0400, Chuck Lever wrote:
>>> On Aug 12, 2020, at 11:42 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> [...]
>>> For most people the security mechanism of local xattrs is
>>> sufficient.  If you're paranoid, you don't believe it is and you
>>> use EVM.
>> 
>> When IMA metadata happens to be stored in local filesystems in
>> a trusted xattr, it's going to enjoy the protection you describe
>> without needing the addition of a cryptographic signature.
>> 
>> However, that metadata doesn't live its whole life there. It
>> can reside in a tar file, it can cross a network, it can live
>> on a back-up tape. I think we agree that any time that metadata
>> is in transit or at rest outside of a Linux local filesystem, it
>> is exposed.
>> 
>> Thus I'm interested in a metadata protection mechanism that does
>> not rely on the security characteristics of a particular storage
>> container. For me, a cryptographic signature fits that bill
>> nicely.
> 
> Sure, but one of the points about IMA is a separation of mechanism from
> policy.  Signed hashes (called appraisal in IMA terms) is just one
> policy you can decide to require or not or even make it conditional on
> other things.

AFAICT, the current EVM_IMA_DIGSIG and EVM_PORTABLE_DIGSIG formats are
always signed. The policy choice is whether or not to verify the
signature, not whether or not the metadata format is signed.


--
Chuck Lever
chucklever@gmail.com
James Bottomley Aug. 13, 2020, 3:10 p.m. UTC | #40
On Thu, 2020-08-13 at 10:42 -0400, Chuck Lever wrote:
> > On Aug 12, 2020, at 11:51 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Wed, 2020-08-12 at 10:15 -0400, Chuck Lever wrote:
> > > > On Aug 11, 2020, at 11:53 AM, James Bottomley
> > > > <James.Bottomley@HansenPartnership.com> wrote:
> > > > On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
[...]
> > > > > > > The client would have to reconstruct that tree again if
> > > > > > > memory pressure caused some or all of the tree to be
> > > > > > > evicted, so perhaps an on-demand mechanism is preferable.
> > > > > > 
> > > > > > Right, but I think that's implementation detail.  Probably
> > > > > > what we need is a way to get the log(N) verification hashes
> > > > > > from the server and it's up to the client whether it caches
> > > > > > them or not.
> > > > > 
> > > > > Agreed, these are implementation details. But see above about
> > > > > the trustworthiness of the intermediate hashes. If they are
> > > > > conveyed on an untrusted network, then they can't be trusted
> > > > > either.
> > > > 
> > > > Yes, they can, provided enough of them are asked for to
> > > > verify.  If you look at the simple example above, suppose I
> > > > have cached H11 and H12, but I've lost the entire H2X layer.  I
> > > > want to verify B3 so I also ask you for your copy of H24.  Then
> > > > I generate H23 from B3 and Hash H23 and H24.  If this doesn't
> > > > hash to H12 I know either you supplied me the wrong block or
> > > > lied about H24.  However, if it all hashes correctly I know you
> > > > supplied me with both the correct B3 and the correct H24.
> > > 
> > > My point is there is a difference between a trusted cache and an
> > > untrusted cache. I argue there is not much value in a cache where
> > > the hashes have to be verified again.
> > 
> > And my point isn't about caching, it's about where the tree comes
> > from. I claim and you agree the client can get the tree from the
> > server a piece at a time (because it can path verify it) and
> > doesn't have to generate it itself.
> 
> OK, let's focus on where the tree comes from. It is certainly
> possible to build a protocol to exchange parts of a Merkle tree.

Which is what I think we need to extend IMA to do.

>  The question is how it might be stored on the server.

I think the only thing the server has to guarantee to store is the head
hash, possibly signed.

>  There are some underlying assumptions about the metadata storage
> mechanism that should be stated up front.
> 
> Current forms of IMA metadata are limited in size and stored in a
> container that is read and written in a single operation. If we stick
> with that container format, I don't see a way to store a Merkle tree
> in there for all file sizes.

Well, I don't think you need to.  The only thing that needs to be
stored is the head hash.  Everything else can be reconstructed.  If you
asked me to implement it locally, I'd probably put the head hash in an
xattr but use a CAM based cache for the Merkle trees and construct the
tree on first access if it weren't already in the cache.
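
Roughly, as a sketch (an in-memory dict stands in for the
content-addressed cache, SHA-256 and a power-of-two block count are
assumed, and TreeCache and build_levels are made-up names):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(blocks):
    """Leaf hashes first, root last (power-of-two block count assumed)."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

class TreeCache:
    """Content-addressed cache: trees are looked up by their head hash."""

    def __init__(self):
        self._trees = {}
        self.builds = 0

    def get(self, head_hash, read_blocks):
        levels = self._trees.get(head_hash)
        if levels is None:
            # First access: construct the tree from the file content.
            levels = build_levels(read_blocks())
            self.builds += 1
            if levels[-1][0] != head_hash:
                raise ValueError("content does not match the stored head hash")
            self._trees[head_hash] = levels
        return levels

blocks = [b"B1", b"B2", b"B3", b"B4"]
head = build_levels(blocks)[-1][0]      # the value that would live in the xattr

cache = TreeCache()
first = cache.get(head, lambda: blocks)
second = cache.get(head, lambda: blocks)   # served from the cache, no rebuild
```

Because the key is the head hash itself, an evicted or missing entry
only costs a rebuild; it never changes what the client trusts.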

However, the above isn't what fs-verity does: it stores the tree in a
hidden section of the file.  That's why I don't think we'd mandate
anything about tree storage.  Just describe the partial retrieval
properties we'd like and leave the rest as an implementation detail.

> Thus it seems to me that we cannot begin to consider the tree-on-the-
> server model unless there is a proposed storage mechanism for that
> whole tree. Otherwise, the client must have the primary role in
> unpacking and verifying the tree.

Well, as I said,  I don't think you need to store the tree.  You
certainly could decide to store the entire tree (as fs-verity does) if
it fitted your use case, but it's not required.  Perhaps even in my
case I'd make the CAM based cache persistent, like Android's Dalvik
cache.

James


> Storing only the tree root in the metadata means the metadata format
> is nicely bounded in size.
Chuck Lever Aug. 14, 2020, 2:21 p.m. UTC | #41
> On Aug 13, 2020, at 11:10 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Thu, 2020-08-13 at 10:42 -0400, Chuck Lever wrote:
>>> On Aug 12, 2020, at 11:51 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>>> On Wed, 2020-08-12 at 10:15 -0400, Chuck Lever wrote:
>>>>> On Aug 11, 2020, at 11:53 AM, James Bottomley
>>>>> <James.Bottomley@HansenPartnership.com> wrote:
>>>>> On Tue, 2020-08-11 at 10:48 -0400, Chuck Lever wrote:
> [...]
>>>>>>>> The client would have to reconstruct that tree again if
>>>>>>>> memory pressure caused some or all of the tree to be
>>>>>>>> evicted, so perhaps an on-demand mechanism is preferable.
>>>>>>> 
>>>>>>> Right, but I think that's implementation detail.  Probably
>>>>>>> what we need is a way to get the log(N) verification hashes
>>>>>>> from the server and it's up to the client whether it caches
>>>>>>> them or not.
>>>>>> 
>>>>>> Agreed, these are implementation details. But see above about
>>>>>> the trustworthiness of the intermediate hashes. If they are
>>>>>> conveyed on an untrusted network, then they can't be trusted
>>>>>> either.
>>>>> 
>>>>> Yes, they can, provided enough of them are asked for to
>>>>> verify.  If you look at the simple example above, suppose I
>>>>> have cached H11 and H12, but I've lost the entire H2X layer.  I
>>>>> want to verify B3 so I also ask you for your copy of H24.  Then
>>>>> I generate H23 from B3 and Hash H23 and H24.  If this doesn't
>>>>> hash to H12 I know either you supplied me the wrong block or
>>>>> lied about H24.  However, if it all hashes correctly I know you
>>>>> supplied me with both the correct B3 and the correct H24.
>>>> 
>>>> My point is there is a difference between a trusted cache and an
>>>> untrusted cache. I argue there is not much value in a cache where
>>>> the hashes have to be verified again.
>>> 
>>> And my point isn't about caching, it's about where the tree comes
>>> from. I claim and you agree the client can get the tree from the
>>> server a piece at a time (because it can path verify it) and
>>> doesn't have to generate it itself.
>> 
>> OK, let's focus on where the tree comes from. It is certainly
>> possible to build a protocol to exchange parts of a Merkle tree.
> 
> Which is what I think we need to extend IMA to do.
> 
>> The question is how it might be stored on the server.
> 
> I think the only thing the server has to guarantee to store is the head
> hash, possibly signed.
> 
>> There are some underlying assumptions about the metadata storage
>> mechanism that should be stated up front.
>> 
>> Current forms of IMA metadata are limited in size and stored in a
>> container that is read and written in a single operation. If we stick
>> with that container format, I don't see a way to store a Merkle tree
>> in there for all file sizes.
> 
> Well, I don't think you need to.  The only thing that needs to be
> stored is the head hash.  Everything else can be reconstructed.  If you
> asked me to implement it locally, I'd probably put the head hash in an
> xattr but use a CAM based cache for the Merkle trees and construct the
> tree on first access if it weren't already in the cache.

The contents of the security.ima xattr might be modeled after
EVM_IMA_DIGSIG:

- a format enumerator (used by all IMA metadata formats)
- the tree's unit size
- a fingerprint of the signer's certificate
  - digest algorithm name and full digest
- the root hash, always signed
  - signing algorithm name and signature

The rest of the hash tree is always stored somewhere else or
constructed on-demand.
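
To show how bounded that would be, here is a purely hypothetical
encoding of the fields listed above. The field order, the
length-prefixed layout, the 0xFF format value and the algorithm names
are all assumptions for illustration; none of this is an existing IMA
format, and the zero-filled digest and signature are placeholders.

```python
import struct
from dataclasses import dataclass

@dataclass
class MerkleRootXattr:
    fmt: int                # format enumerator (placeholder value below)
    unit_size: int          # tree unit size in bytes
    cert_digest_algo: str   # digest algorithm of the certificate fingerprint
    cert_digest: bytes      # fingerprint of the signer's certificate
    root_hash: bytes        # Merkle tree root
    sig_algo: str           # signing algorithm name
    signature: bytes        # signature over the root hash

    def pack(self) -> bytes:
        """1-byte format, 4-byte unit size, then 2-byte length-prefixed fields."""
        out = struct.pack(">BI", self.fmt, self.unit_size)
        for field in (self.cert_digest_algo.encode(), self.cert_digest,
                      self.root_hash, self.sig_algo.encode(), self.signature):
            out += struct.pack(">H", len(field)) + field
        return out

blob = MerkleRootXattr(
    fmt=0xFF, unit_size=4096,
    cert_digest_algo="sha256", cert_digest=bytes(32),
    root_hash=bytes(32),
    sig_algo="rsa-sha256", signature=bytes(256),   # RSA-2048-sized placeholder
).pack()
```

With these sizes the blob is a few hundred bytes and does not grow with
the file, since nothing below the root is stored in it.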

My experience of security communities both within and outside the
IETF is that they would insist on always having a signature.

If one doesn't care about signing, a self-signed certificate can be
automatically provisioned when ima-evm-utils is installed that can
be used for those cases. That would make the signature process
invisible to any administrator who doesn't care about signed
metadata.

Because storage in NFS would cross trust boundaries, it would have
to require the use of a signed root hash. I don't want to be in the
position where copying a file with an unsigned root hash into NFS
makes it unreadable because of a change in policy.


> However, the above isn't what fs-verity does: it stores the tree in a
> hidden section of the file.  That's why I don't think we'd mandate
> anything about tree storage.  Just describe the partial retrieval
> properties we'd like and leave the rest as an implementation detail.

I'm starting to consider how much compatibility with fs-verity is
required. There are several forms of hash-tree, and a specification
of the IMA metadata format would need to describe exactly how to
form the tree root. If we want compatibility with fs-verity, then
it is reasonable to assume that this IMA metadata format might be
required to use the same hash tree construction algorithm that
fs-verity uses.

The original Merkle tree concept was patented 40 years ago. I'm not
clear yet on whether the patent encumbers the use of Merkle trees
in any way, but since their usage seems pretty widespread in P2P
and Bitcoin applications, I'm guessing the answer to that is
favorable. More research needed.

There is an implementation used by several GNU utilities that is
available as a piece of GPL code. It could be a potential blocker
if that was the tree algorithm that fs-verity uses -- as discussed
in the other thread.

Apparently there are some known weaknesses in older hash tree
algorithms, including at least one CVE. We could choose a recent
algorithm, but perhaps there needs to be a degree of extensibility
in case that algorithm needs to be updated due to a subsequent
security issue.

Tree construction could include a few items besides file content to
help secure the hash further. For instance the file's size and mtime,
as well as the depth of the tree, could be included in the signature.
But that depends on whether it can be done while maintaining
compatibility with fs-verity.

I would feel better if someone with more domain expertise chimed in.


>> Thus it seems to me that we cannot begin to consider the tree-on-the-
>> server model unless there is a proposed storage mechanism for that
>> whole tree. Otherwise, the client must have the primary role in
>> unpacking and verifying the tree.
> 
> Well, as I said,  I don't think you need to store the tree.

We basically agree there.


> You certainly could decide to store the entire tree (as fs-verity does) if
> it fitted your use case, but it's not required.  Perhaps even in my
> case I'd make the CAM based cache persistent, like Android's Dalvik
> cache.
> 
> James
> 
> 
>> Storing only the tree root in the metadata means the metadata format
>> is nicely bounded in size.

--
Chuck Lever
chucklever@gmail.com