[RESEND,08/12] ima: added parser for RPM data type

Message ID 20170801102036.15371-1-roberto.sassu@huawei.com (mailing list archive)
State New, archived

Commit Message

Roberto Sassu Aug. 1, 2017, 10:20 a.m. UTC
This patch introduces a parser for RPM packages. It extracts the digests
from the RPMTAG_FILEDIGESTS header section and converts them to binary data
before adding them to the hash table.

The advantage of this data type is that verifiers can determine who
produced that data, as headers are signed by Linux distribution vendors.
RPM header signatures can be provided as digest list metadata.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
 security/integrity/ima/ima_digest_list.c | 86 +++++++++++++++++++++++++++++++-
 1 file changed, 85 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig Aug. 1, 2017, 10:27 a.m. UTC | #1
On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
> This patch introduces a parser for RPM packages. It extracts the digests
> from the RPMTAG_FILEDIGESTS header section and converts them to binary data
> before adding them to the hash table.
> 
> The advantage of this data type is that verifiers can determine who
> produced that data, as headers are signed by Linux distributions vendors.
> RPM headers signatures can be provided as digest list metadata.

Err, parsing arbitrary file formats has no business in the kernel.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roberto Sassu Aug. 1, 2017, 10:58 a.m. UTC | #2
On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
>> This patch introduces a parser for RPM packages. It extracts the digests
>> from the RPMTAG_FILEDIGESTS header section and converts them to binary data
>> before adding them to the hash table.
>>
>> The advantage of this data type is that verifiers can determine who
>> produced that data, as headers are signed by Linux distributions vendors.
>> RPM headers signatures can be provided as digest list metadata.
>
> Err, parsing arbitrary file formats has no business in the kernel.

The benefit of this choice is that no actions are required for
Linux distribution vendors to support the solution I'm proposing,
because they already provide signed digest lists (RPM headers).

Since the proof of loading a digest list is the digest of the
digest list (included in the list metadata), if RPM headers are
converted to a different format, remote attestation verifiers
cannot check the signature.

If the concern is security, it would be possible to prevent unsigned
RPM headers from being parsed, if the PGP key type is upstreamed
(adding in CC keyrings@vger.kernel.org).

Roberto
James Morris Aug. 2, 2017, 7:22 a.m. UTC | #3
On Tue, 1 Aug 2017, Roberto Sassu wrote:

> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
> > On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
> > > This patch introduces a parser for RPM packages. It extracts the digests
> > > from the RPMTAG_FILEDIGESTS header section and converts them to binary
> > > data
> > > before adding them to the hash table.
> > >
> > > The advantage of this data type is that verifiers can determine who
> > > produced that data, as headers are signed by Linux distributions vendors.
> > > RPM headers signatures can be provided as digest list metadata.
> >
> > Err, parsing arbitrary file formats has no business in the kernel.
> 
> The benefit of this choice is that no actions are required for
> Linux distribution vendors to support the solution I'm proposing,
> because they already provide signed digest lists (RPM headers).
> 
> Since the proof of loading a digest list is the digest of the
> digest list (included in the list metadata), if RPM headers are
> converted to a different format, remote attestation verifiers
> cannot check the signature.
> 
> If the concern is security, it would be possible to prevent unsigned
> RPM headers from being parsed, if the PGP key type is upstreamed
> (adding in CC keyrings@vger.kernel.org).

It's a security concern and also a layering violation; there should be no
need to parse package file formats in the kernel.

I'm not really clear on exactly how this patch series works.  Can you 
provide a more concrete explanation of what steps would occur during boot 
and attestation?
Roberto Sassu Aug. 2, 2017, 11:22 a.m. UTC | #4
On 8/2/2017 9:22 AM, James Morris wrote:
> On Tue, 1 Aug 2017, Roberto Sassu wrote:
>
>> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
>>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
>>>> This patch introduces a parser for RPM packages. It extracts the digests
>>>> from the RPMTAG_FILEDIGESTS header section and converts them to binary
>>>> data
>>>> before adding them to the hash table.
>>>>
>>>> The advantage of this data type is that verifiers can determine who
>>>> produced that data, as headers are signed by Linux distributions vendors.
>>>> RPM headers signatures can be provided as digest list metadata.
>>>
>>> Err, parsing arbitrary file formats has no business in the kernel.
>>
>> The benefit of this choice is that no actions are required for
>> Linux distribution vendors to support the solution I'm proposing,
>> because they already provide signed digest lists (RPM headers).
>>
>> Since the proof of loading a digest list is the digest of the
>> digest list (included in the list metadata), if RPM headers are
>> converted to a different format, remote attestation verifiers
>> cannot check the signature.
>>
>> If the concern is security, it would be possible to prevent unsigned
>> RPM headers from being parsed, if the PGP key type is upstreamed
>> (adding in CC keyrings@vger.kernel.org).
>
> It's a security concern and also a layering violation, there should be no
> need to parse package file formats in the kernel.
>
> I'm not really clear on exactly how this patch series works.  Can you
> provide a more concrete explanation of what steps would occur during boot
> and attestation?

The main idea of this patch set is that, if a system executes
or reads good files (e.g. those from a Linux distribution),
the difference between the assertion 'a file could have possibly
been accessed' and 'a file has been accessed' is not relevant
for verifiers that only check the provenance of software.

Then, for those verifiers, a measurement representing the list of
good files which could have possibly been accessed gives the same
guarantees as individual file measurements.

The patch set introduces two data types:

- digest list: contains the digests of good files
- list metadata: contains the digest, the signature and the path
                  of each digest list to load (why many lists are
                  loaded instead of one will become clear after I
                  explain the remote attestation verification process)

Steps at boot:

1) systemd reads the path of list metadata from /etc/ima/digest-lists
    and writes it to a new securityfs file created by IMA
2) IMA reads and parses the list metadata (same mechanism for
    loading a policy, already upstreamed)
3) for each list metadata, IMA reads and parses the digest list
    and adds the file digests to a hash table
4) when a file is accessed, IMA calculates the digest as before
    and looks up the file digest in the new hash table; if the
    digest is found, IMA sets the IMA_MEASURED flag in the inode
    metadata and clears the IMA_MEASURE action

Notes:

- list metadata and digest lists are measured before IMA reads them
- the digest of digest lists is also added to the hash table, otherwise
   there would be a measurement for each digest list
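
The lookup in step 4 can be sketched in userspace terms as follows. This is
only an illustration, not the kernel code: known_digests, load_digest_list
and on_file_access are hypothetical names, SHA-256 is an assumed hash
algorithm, and a Python set stands in for the kernel hash table. Skipping the
append models the case where the digest is found and no new measurement
entry is produced.

```python
import hashlib

# Hypothetical stand-in for the kernel hash table of known-good digests.
known_digests = set()

def load_digest_list(digests):
    """Step 3: add every digest from a parsed digest list to the table."""
    known_digests.update(digests)

def on_file_access(path, data, measurement_list):
    """Step 4: measure the file, then record it only if its digest is unknown.

    Returns True when the digest was found in the table, in which case
    no entry is appended to measurement_list.
    """
    digest = hashlib.sha256(data).digest()  # SHA-256 is an assumption
    if digest in known_digests:
        return True
    measurement_list.append((path, digest.hex()))
    return False
```

With this model, only files whose digest is missing from every loaded digest
list end up in the measurement list.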

The measurement list looks like:

10 <template digest> ima-ng <digest> boot_aggregate
10 <template digest> ima-ng <digest> systemd (exe + libs)
10 <template digest> ima-ng <digest> /etc/ima/digest-lists
10 <template digest> ima-ng <digest> <list metadata>
10 <template digest> ima-ng <digest> <unknown files>
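
A verifier could split such entries into fields along these lines. This is a
sketch that assumes the five fields are separated by single spaces, as in the
listing above; Entry and parse_entry are hypothetical names, and the sample
values in the last line are made-up placeholders, not real measurements.

```python
from typing import NamedTuple

class Entry(NamedTuple):
    pcr: int               # PCR index, 10 in the listing above
    template_digest: str
    template: str          # e.g. "ima-ng"
    file_digest: str
    path: str

def parse_entry(line: str) -> Entry:
    # Split on at most four spaces so a path containing spaces stays intact.
    fields = line.rstrip("\n").split(" ", 4)
    pcr, template_digest, template, file_digest, path = fields
    return Entry(int(pcr), template_digest, template, file_digest, path)

# Placeholder values, for illustration only.
entry = parse_entry("10 aaaa ima-ng sha256:bbbb boot_aggregate")
```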


Steps during the verification:

Case 1: list metadata and digest lists are provided to verifiers

This is necessary when:
- digest lists are not signed
- verifiers do not trust the signer
- verifiers want to perform more checks on digest lists
   (digest lists may contain digests of outdated software)

Verifiers:

1) parse the list metadata received
2) for each digest list received, calculate the digest and
    compare it with the digest included in the list metadata
3) calculate the digest of list metadata and compare it with
    the digest in the measurement list
4) calculate the digest of the path of list metadata and compare
    it with the digest of /etc/ima/digest-lists in the measurement list
5) check boot_aggregate, systemd exe and libs, and unknown files
6) check the digest lists
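
Steps 2 and 3 of Case 1 amount to recomputing digests and comparing them. A
minimal sketch, assuming SHA-256 and assuming the list metadata has already
been parsed into a mapping from digest list path to expected hex digest (the
actual metadata format is not described here; verify_case1 and its parameters
are hypothetical):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_case1(expected, digest_lists, metadata_blob, measured_metadata_digest):
    """expected: {path: hex digest} taken from the parsed list metadata.
    digest_lists: {path: raw bytes} as received from the attesting system.
    metadata_blob: raw bytes of the list metadata.
    measured_metadata_digest: hex digest found in the measurement list.
    """
    # Step 2: every digest list must match the digest in the list metadata.
    for path, want in expected.items():
        blob = digest_lists.get(path)
        if blob is None or sha256_hex(blob) != want:
            return False
    # Step 3: the list metadata itself must match the measurement list.
    return sha256_hex(metadata_blob) == measured_metadata_digest
```

Steps 4 to 6 (path digest, boot_aggregate and the content of the digest
lists) are left out of the sketch.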


Case 2: only list metadata is provided to verifiers

Verifiers:

1) parse the list metadata received
2) for each digest list, verify the signature
3) calculate the digest of the path of list metadata and compare
    it with the digest of /etc/ima/digest-lists in the measurement list
4) check boot_aggregate, systemd exe and libs, and unknown files

In Case 2, the verification process is simplified, because if
the signature of digest lists is valid, this means that possibly
accessed files are provided by the signer.

The problem here is that verifiers know the digest of possibly
accessed files from the measurement done by IMA at the time
digest lists are read. If IMA cannot parse the original (signed)
digest list, it would measure something that cannot be verified
with the signature.

RPM-based distributions already provide signed digest lists
(the RPM headers) for each package. To avoid the performance
penalty due to extending a PCR for each digest list, only
an entry for the list metadata is added to the measurement list.

For RPM-based Linux distributions, the full lifecycle for configuring
IMA can be implemented with very low effort. The tasks are:

1) systemd patch to load list metadata: just extend the existing
    patch to load the policy
2) userspace tools to parse the RPM database: done
3) dracut module to generate and include the digest lists and
    metadata into the initial ram disk: at the moment this is done
    from the dracut command line
4) plugin for the software management that executes the tool
    to generate the digest list for updated packages: to be implemented

Roberto
Roberto Sassu Aug. 9, 2017, 9:15 a.m. UTC | #5
On 8/2/2017 9:22 AM, James Morris wrote:
> On Tue, 1 Aug 2017, Roberto Sassu wrote:
>
>> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
>>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
>>>> This patch introduces a parser for RPM packages. It extracts the digests
>>>> from the RPMTAG_FILEDIGESTS header section and converts them to binary
>>>> data
>>>> before adding them to the hash table.
>>>>
>>>> The advantage of this data type is that verifiers can determine who
>>>> produced that data, as headers are signed by Linux distributions vendors.
>>>> RPM headers signatures can be provided as digest list metadata.
>>>
>>> Err, parsing arbitrary file formats has no business in the kernel.
>>
>> The benefit of this choice is that no actions are required for
>> Linux distribution vendors to support the solution I'm proposing,
>> because they already provide signed digest lists (RPM headers).
>>
>> Since the proof of loading a digest list is the digest of the
>> digest list (included in the list metadata), if RPM headers are
>> converted to a different format, remote attestation verifiers
>> cannot check the signature.
>>
>> If the concern is security, it would be possible to prevent unsigned
>> RPM headers from being parsed, if the PGP key type is upstreamed
>> (adding in CC keyrings@vger.kernel.org).
>
> It's a security concern and also a layering violation, there should be no
> need to parse package file formats in the kernel.

Parsing RPMs is not strictly necessary. Digests from the headers
can be extracted and written to a new file using the compact data
format (introduced with patch 7/12).

At boot time, IMA measures this file before digests are uploaded to the
kernel. At this point, only files with an unknown digest will be added
to the measurement list. At verification time, verifiers recreate the
measurement list by merging together the digests uploaded to the
kernel with the unknown digests. Then, they verify the obtained list.

There are two ways to verify the digests: looking them up in a reference
database, or checking a signature. With the 'ima-sig' measurement list
template, it is possible to verify signatures for each accessed file.
With this patch set, it is possible to verify the signature of
the file containing the digests uploaded to the kernel. If the data
format changes, the signature cannot be verified.

To avoid this limitation, the parsers could be moved to a userspace
tool which then uploads the parsed digests to the kernel. IMA would
measure the original files. But, if the tool is compromised, it could
load digests not included in the parsed files. With the current solution
this problem does not arise because no changes can be made by userspace
applications to the uploaded data while digests are parsed by IMA.

I could remove the RPM parser from the patch set for now.

Is the remaining part of the patch set ok, and is the explanation of
what it does clear?

Thanks

Roberto


> I'm not really clear on exactly how this patch series works.  Can you
> provide a more concrete explanation of what steps would occur during boot
> and attestation?
>
Mimi Zohar Aug. 9, 2017, 2:30 p.m. UTC | #6
On Wed, 2017-08-09 at 11:15 +0200, Roberto Sassu wrote:
> On 8/2/2017 9:22 AM, James Morris wrote:
> > On Tue, 1 Aug 2017, Roberto Sassu wrote:
> >
> >> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
> >>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
> >>>> This patch introduces a parser for RPM packages. It extracts the digests
> >>>> from the RPMTAG_FILEDIGESTS header section and converts them to binary
> >>>> data
> >>>> before adding them to the hash table.
> >>>>
> >>>> The advantage of this data type is that verifiers can determine who
> >>>> produced that data, as headers are signed by Linux distributions vendors.
> >>>> RPM headers signatures can be provided as digest list metadata.
> >>>
> >>> Err, parsing arbitrary file formats has no business in the kernel.
> >>
> >> The benefit of this choice is that no actions are required for
> >> Linux distribution vendors to support the solution I'm proposing,
> >> because they already provide signed digest lists (RPM headers).
> >>
> >> Since the proof of loading a digest list is the digest of the
> >> digest list (included in the list metadata), if RPM headers are
> >> converted to a different format, remote attestation verifiers
> >> cannot check the signature.
> >>
> >> If the concern is security, it would be possible to prevent unsigned
> >> RPM headers from being parsed, if the PGP key type is upstreamed
> >> (adding in CC keyrings@vger.kernel.org).
> >
> > It's a security concern and also a layering violation, there should be no
> > need to parse package file formats in the kernel.
> 
> Parsing RPMs is not strictly necessary. Digests from the headers
> can be extracted and written to a new file using the compact data
> format (introduced with patch 7/12).
> 
> At boot time, IMA measures this file before digests are uploaded to the
> kernel. At this point, only files with unknown digest will be added
> to the measurement list. At verification time, verifiers recreate the
> measurement list by merging together the digests uploaded to the
> kernel with the unknown digests. Then, they verify the obtained list.
> 
> There are two ways to verify the digests: searching them in a reference
> database, or checking a signature. With the 'ima-sig' measurement list
> template, it is possible to verify signatures for each accessed file.
> With this patch set, it is possible to verify the signature of
> the file containing the digests uploaded to the kernel. If the data
> format changes, the signature cannot be verified.
> 
> To avoid this limitation, the parsers could be moved to a userspace
> tool which then uploads the parsed digests to the kernel. IMA would
> measure the original files. But, if the tool is compromised, it could
> load digests not included in the parsed files. With the current solution
> this problem does not arise because no changes can be done by userspace
> applications to the uploaded data while digests are parsed by IMA.
> 
> I could remove the RPM parser from the patch set for now.
> 
> Is the remaining part of the patch set ok, and is the explanation of
> what it does clear?

From a trusted boot perspective, file measurements are added to the
measurement list, before access to the file is given.  The measurement
list contains ALL measurements, as defined by policy.  This patch set
changes that meaning to be all measurements, as defined by policy,
with the exception of those in a white list.

Changing the fundamental meaning of the measurement list is not
acceptable.  You could define a new securityfs file to differentiate
between the full measurement list and this abbreviated one.  But
before making this sort of change, I would prefer to address the
underlying problem - TPM performance.

There are a couple of things that could be done to improve the TPM
driver performance, itself.  Once all of these options have been
pursued, we could then consider batching the measurements to the TPM,
meaning that the measurement list would still contain all the file
measurements, but instead of extending the TPM for each measurement, a
batched hash - a hash of a group of file measurements - would be
extended into the TPM.
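
The difference between per-measurement extends and a batched extend can be
sketched as follows. An extend replaces the PCR value with the hash of the
old value concatenated with the new digest; everything else here is an
assumption for illustration: SHA-256, an all-zero initial PCR, and a batched
hash computed over the concatenated digests. pcr_extend and batched_hash are
hypothetical names.

```python
import hashlib

def pcr_extend(pcr: bytes, digest: bytes) -> bytes:
    """Simulate an extend: new PCR = H(old PCR || digest)."""
    return hashlib.sha256(pcr + digest).digest()

def batched_hash(digests) -> bytes:
    """Assumed batching scheme: hash of the concatenated measurement digests."""
    return hashlib.sha256(b"".join(digests)).digest()

measurements = [hashlib.sha256(name).digest() for name in (b"a", b"b", b"c")]

# One extend per measurement: three TPM operations.
pcr_each = bytes(32)
for d in measurements:
    pcr_each = pcr_extend(pcr_each, d)

# One extend for the whole group: a single TPM operation.
pcr_batched = pcr_extend(bytes(32), batched_hash(measurements))
```

In both cases the measurement list still holds every file measurement; only
the number of TPM operations changes.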

Mimi

> > I'm not really clear on exactly how this patch series works.  Can you
> > provide a more concrete explanation of what steps would occur during boot
> > and attestation?
> >

Roberto Sassu Aug. 9, 2017, 5:18 p.m. UTC | #7
On 8/9/2017 4:30 PM, Mimi Zohar wrote:
> On Wed, 2017-08-09 at 11:15 +0200, Roberto Sassu wrote:
>> On 8/2/2017 9:22 AM, James Morris wrote:
>>> On Tue, 1 Aug 2017, Roberto Sassu wrote:
>>>
>>>> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
>>>>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
>>>>>> This patch introduces a parser for RPM packages. It extracts the digests
>>>>>> from the RPMTAG_FILEDIGESTS header section and converts them to binary
>>>>>> data
>>>>>> before adding them to the hash table.
>>>>>>
>>>>>> The advantage of this data type is that verifiers can determine who
>>>>>> produced that data, as headers are signed by Linux distributions vendors.
>>>>>> RPM headers signatures can be provided as digest list metadata.
>>>>>
>>>>> Err, parsing arbitrary file formats has no business in the kernel.
>>>>
>>>> The benefit of this choice is that no actions are required for
>>>> Linux distribution vendors to support the solution I'm proposing,
>>>> because they already provide signed digest lists (RPM headers).
>>>>
>>>> Since the proof of loading a digest list is the digest of the
>>>> digest list (included in the list metadata), if RPM headers are
>>>> converted to a different format, remote attestation verifiers
>>>> cannot check the signature.
>>>>
>>>> If the concern is security, it would be possible to prevent unsigned
>>>> RPM headers from being parsed, if the PGP key type is upstreamed
>>>> (adding in CC keyrings@vger.kernel.org).
>>>
>>> It's a security concern and also a layering violation, there should be no
>>> need to parse package file formats in the kernel.
>>
>> Parsing RPMs is not strictly necessary. Digests from the headers
>> can be extracted and written to a new file using the compact data
>> format (introduced with patch 7/12).
>>
>> At boot time, IMA measures this file before digests are uploaded to the
>> kernel. At this point, only files with unknown digest will be added
>> to the measurement list. At verification time, verifiers recreate the
>> measurement list by merging together the digests uploaded to the
>> kernel with the unknown digests. Then, they verify the obtained list.
>>
>> There are two ways to verify the digests: searching them in a reference
>> database, or checking a signature. With the 'ima-sig' measurement list
>> template, it is possible to verify signatures for each accessed file.
>> With this patch set, it is possible to verify the signature of
>> the file containing the digests uploaded to the kernel. If the data
>> format changes, the signature cannot be verified.
>>
>> To avoid this limitation, the parsers could be moved to a userspace
>> tool which then uploads the parsed digests to the kernel. IMA would
>> measure the original files. But, if the tool is compromised, it could
>> load digests not included in the parsed files. With the current solution
>> this problem does not arise because no changes can be done by userspace
>> applications to the uploaded data while digests are parsed by IMA.
>>
>> I could remove the RPM parser from the patch set for now.
>>
>> Is the remaining part of the patch set ok, and is the explanation of
>> what it does clear?
>
> From a trusted boot perspective, file measurements are added to the
> measurement list, before access to the file is given.  The measurement
> list contains ALL measurements, as defined by policy.  This patch set
> changes that meaning to be all measurements, as defined by policy,
> with the exception of those in a white list.

The digest list is also measured, so the measurement list is complete.
Verifiers have to check the digest of digest lists. Otherwise, they
would get an unknown digest and conclude that the system being verified
has been compromised.

If you prefer, I could add a new policy rule option to avoid file
measurements if the digest is in the digest list.


> Changing the fundamental meaning of the measurement list is not
> acceptable.  You could define a new securityfs file to differentiate
> between the full measurement list and this abbreviated one.  But

There cannot be two measurement lists at the same time. Providing the
full measurement list (containing the digest of files being accessed)
implies that its integrity must be protected with PCR extends, making
the optimization done by this patch set useless.


> before making this sort of change, I would prefer to address the
> underlying problem - TPM performance.

Even if the TPM driver performance improves significantly (17 seconds
for 1000 extends), the boot time delay would still be noticeable
(8.5 seconds for normal boot + 24 seconds for 1400 PCR extends).
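
The figures above follow from simple arithmetic, reproduced here as a check.
The variable names are illustrative; the inputs are the numbers quoted in
this message.

```python
seconds_per_1000_extends = 17.0   # improved TPM driver figure quoted above
extends_at_boot = 1400
normal_boot_seconds = 8.5

# Time spent on PCR extends alone, then the total boot time.
extend_seconds = extends_at_boot * seconds_per_1000_extends / 1000
total_seconds = normal_boot_seconds + extend_seconds

print(f"extends: {extend_seconds:.1f} s, total: {total_seconds:.1f} s")
```

The extend time comes to about 23.8 seconds, which is the roughly 24 seconds
mentioned above, on top of the 8.5 seconds of a normal boot.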

In my opinion, this patch set is useful without considering the
performance improvement: reduced size of measurement lists and
verification of digest list signatures, instead of file signatures,
where signatures are already provided by Linux distributions.


> There are a couple of things that could be done to improve the TPM
> driver performance, itself.  Once all of these options have been
> pursued, we could then consider batching the measurements to the TPM,
> meaning that the measurement list would still contain all the file
> measurements, but instead of extending the TPM for each measurement, a
> batched hash - a hash of a group of file measurements - would be
> extended into the TPM.

Probably, I didn't explain clearly that this patch set does not decrease
the security of IMA.

Extending the PCR for a group of file measurements means that the system
can be compromised between two PCR extends without detection because
a malicious binary could alter IMA before the next extend.

This patch set extends the PCR with the digest of digest lists, before
files are accessed. No actions happen before either the digest lists
have been measured or the file measurement is added to the measurement
list, if the file digest is not included in the digest list.

Roberto


> Mimi
>
>>> I'm not really clear on exactly how this patch series works.  Can you
>>> provide a more concrete explanation of what steps would occur during boot
>>> and attestation?
>>>
>
Mimi Zohar Aug. 10, 2017, 1:12 p.m. UTC | #8
On Wed, 2017-08-09 at 19:18 +0200, Roberto Sassu wrote:
> On 8/9/2017 4:30 PM, Mimi Zohar wrote:
> > On Wed, 2017-08-09 at 11:15 +0200, Roberto Sassu wrote:
> >> On 8/2/2017 9:22 AM, James Morris wrote:
> >>> On Tue, 1 Aug 2017, Roberto Sassu wrote:
> >>>
> >>>> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
> >>>>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
> >>>>>> This patch introduces a parser for RPM packages. It extracts the digests
> >>>>>> from the RPMTAG_FILEDIGESTS header section and converts them to binary
> >>>>>> data
> >>>>>> before adding them to the hash table.
> >>>>>>
> >>>>>> The advantage of this data type is that verifiers can determine who
> >>>>>> produced that data, as headers are signed by Linux distributions vendors.
> >>>>>> RPM headers signatures can be provided as digest list metadata.
> >>>>>
> >>>>> Err, parsing arbitrary file formats has no business in the kernel.
> >>>>
> >>>> The benefit of this choice is that no actions are required for
> >>>> Linux distribution vendors to support the solution I'm proposing,
> >>>> because they already provide signed digest lists (RPM headers).
> >>>>
> >>>> Since the proof of loading a digest list is the digest of the
> >>>> digest list (included in the list metadata), if RPM headers are
> >>>> converted to a different format, remote attestation verifiers
> >>>> cannot check the signature.
> >>>>
> >>>> If the concern is security, it would be possible to prevent unsigned
> >>>> RPM headers from being parsed, if the PGP key type is upstreamed
> >>>> (adding in CC keyrings@vger.kernel.org).
> >>>
> >>> It's a security concern and also a layering violation, there should be no
> >>> need to parse package file formats in the kernel.
> >>
> >> Parsing RPMs is not strictly necessary. Digests from the headers
> >> can be extracted and written to a new file using the compact data
> >> format (introduced with patch 7/12).
> >>
> >> At boot time, IMA measures this file before digests are uploaded to the
> >> kernel. At this point, only files with unknown digest will be added
> >> to the measurement list. At verification time, verifiers recreate the
> >> measurement list by merging together the digests uploaded to the
> >> kernel with the unknown digests. Then, they verify the obtained list.
> >>
> >> There are two ways to verify the digests: searching them in a reference
> >> database, or checking a signature. With the 'ima-sig' measurement list
> >> template, it is possible to verify signatures for each accessed file.
> >> With this patch set, it is possible to verify the signature of
> >> the file containing the digests uploaded to the kernel. If the data
> >> format changes, the signature cannot be verified.
> >>
> >> To avoid this limitation, the parsers could be moved to a userspace
> >> tool which then uploads the parsed digests to the kernel. IMA would
> >> measure the original files. But, if the tool is compromised, it could
> >> load digests not included in the parsed files. With the current solution
> >> this problem does not arise because no changes can be done by userspace
> >> applications to the uploaded data while digests are parsed by IMA.
> >>
> >> I could remove the RPM parser from the patch set for now.
> >>
> >> Is the remaining part of the patch set ok, and is the explanation of
> >> what it does clear?
> >
> > From a trusted boot perspective, file measurements are added to the
> > measurement list, before access to the file is given.  The measurement
> > list contains ALL measurements, as defined by policy.  This patch set
> > changes that meaning to be all measurements, as defined by policy,
> > with the exception of those in a white list.
> 
> The digest list is also measured, so the measurement list is complete.
> Verifiers have to check the digest of digest lists. Otherwise, they
> would get an unknown digest and conclude that the system being verified
> has been compromised.

Your proposal is basically a pre-approved "batched" measurement, of a
set of known good measurements, without the corresponding list of
measurements that this "batched" measurement represents.  Right?

This pre-approved "batched" measurement represents not what has been
accessed/executed on the system, but what potentially could be
accessed/executed.  That's a major difference.

> If you prefer, I could add a new policy rule option to avoid file
> measurements if the digest is in the digest list.

Huh?  Patch "ima: don't report measurements if digests are included in
the loaded lists" is already doing this.

> 
> > Changing the fundamental meaning of the measurement list is not
> > acceptable.  You could define a new securityfs file to differentiate
> > between the full measurement list and this abbreviated one.  But
> 
> There cannot be two measurement lists at the same time. Providing the
> full measurement list (containing the digest of files being accessed)
> implies that its integrity must be protected with PCR extends, making
> the optimization done by this patch set useless.

True, so you would be able to configure the system with one or the
other type of list, not both.  At least there would be a clear
understanding of what that list represents.

> 
> > before making this sort of change, I would prefer to address the
> > underlying problem - TPM performance.
> 
> Even if the TPM driver performance improves significantly (17 seconds
> for 1000 extends), the boot time delay would be still noticeable
> (8.5 seconds for normal boot + 24 seconds for 1400 PCR extends).

Agreed, there is still room for more TPM improvements.  Just Nayna's
one patch, without any other changes, brought the timing down from 53s
for 1000 extends to just 11s.  (The initial patch was Nack'ed, but
we're working with the tpmdd and the TCG's device driver work group
(DDWG).)

> In my opinion, this patch set is useful without considering the
> performance improvement: reduced size of measurement lists and
> verification of digest list signatures, instead of file signatures,
> where signatures are already provided by Linux distributions.

Right, there's always a trade off.  My suggestion, assuming we go with
this approach, would be to make that trade off clear by using
different lists.

> 
> > There are a couple of things that could be done to improve the TPM
> > driver performance, itself.  Once all of these options have been
> > pursued, we could then consider batching the measurements to the TPM,
> > meaning that the measurement list would still contain all the file
> > measurements, but instead of extending the TPM for each measurement, a
> > batched hash - a hash of a group of file measurements - would be
> > extended into the TPM.
> 
> Probably, I didn't explain clearly that this patch set does not decrease
> the security of IMA.
> 
> Extending the PCR for a group of file measurements means that the system
> can be compromised between two PCR extends without detection because
> a malicious binary could alter IMA before the next extend.

Currently, a measurement is added to the measurement list and then is
used to extend the TPM, before returning to the caller.

A performance improvement would still first add the measurement to the
measurement list, but would then queue and wait for the measurement to
extend the TPM, before returning to the caller.  In a multi-threaded
environment, the queued measurements could be "batched" - a hash of a
set of hashes - to extend the TPM.

The delay would be at most two times it takes to extend the TPM - one
to complete an existing current "batched" extend and another new
"batched" extend.

The difficulty with this approach is identifying which measurements
are included in which "batched" measurement.  This approach provides
the same guarantees as previously.
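
A minimal sketch of the batching idea described above, written as plain
userspace C rather than IMA code. Everything in it is an assumption made
for illustration: toy_hash() is a 64-bit FNV-1a stand-in for a real
cryptographic hash (it is NOT secure), digests are modeled as uint64_t
values, and pcr_extend(), batch_digest() and replay() are hypothetical
names. It shows why a verifier needs the batch boundaries in order to
recompute the PCR value from the full measurement list.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET 0xcbf29ce484222325ULL
#define FNV_PRIME  0x100000001b3ULL

/* Toy stand-in for a cryptographic hash (64-bit FNV-1a); NOT secure. */
static uint64_t toy_hash(uint64_t seed, const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t h = seed;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= FNV_PRIME;
	}
	return h;
}

/* Models a PCR extend: new = H(old || digest). */
static uint64_t pcr_extend(uint64_t pcr, uint64_t digest)
{
	uint64_t h = toy_hash(FNV_OFFSET, &pcr, sizeof(pcr));

	return toy_hash(h, &digest, sizeof(digest));
}

/* "Batched" digest: a hash over a set of queued measurement digests. */
static uint64_t batch_digest(const uint64_t *digests, size_t n)
{
	return toy_hash(FNV_OFFSET, digests, n * sizeof(*digests));
}

/*
 * Verifier side: replay the full measurement list. The list alone is not
 * enough; batch_sizes[] says which measurements went into which extend.
 */
static uint64_t replay(const uint64_t *list, const size_t *batch_sizes,
		       size_t nbatches)
{
	uint64_t pcr = 0;
	size_t i, off = 0;

	for (i = 0; i < nbatches; i++) {
		pcr = pcr_extend(pcr, batch_digest(list + off, batch_sizes[i]));
		off += batch_sizes[i];
	}
	return pcr;
}
```

In this model the measurement list still holds every file measurement;
only the number of extends changes. How the batch boundaries would be
recorded is left open here, as it is in the discussion.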

Before making the TPM performance problem an IMA issue and "fixing" it
in IMA, I would prefer that the TPM performance be addressed directly.

Mimi

> 
> This patch set extends the PCR with the digest of digest lists, before
> files are accessed. No actions happen before either the digest lists
> have been measured or the file measurement is added to the measurement
> list, if the file digest is not included in the digest list.


Roberto Sassu Aug. 17, 2017, 9:15 a.m. UTC | #9
On 8/10/2017 3:12 PM, Mimi Zohar wrote:
> On Wed, 2017-08-09 at 19:18 +0200, Roberto Sassu wrote:
>> On 8/9/2017 4:30 PM, Mimi Zohar wrote:
>>> On Wed, 2017-08-09 at 11:15 +0200, Roberto Sassu wrote:
>>>> On 8/2/2017 9:22 AM, James Morris wrote:
>>>>> On Tue, 1 Aug 2017, Roberto Sassu wrote:
>>>>>
>>>>>> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
>>>>>>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
>>>>>>>> This patch introduces a parser for RPM packages. It extracts the digests
>>>>>>>> from the RPMTAG_FILEDIGESTS header section and converts them to binary
>>>>>>>> data
>>>>>>>> before adding them to the hash table.
>>>>>>>>
>>>>>>>> The advantage of this data type is that verifiers can determine who
>>>>>>>> produced that data, as headers are signed by Linux distributions vendors.
>>>>>>>> RPM headers signatures can be provided as digest list metadata.
>>>>>>>
>>>>>>> Err, parsing arbitrary file formats has no business in the kernel.
>>>>>>
>>>>>> The benefit of this choice is that no actions are required for
>>>>>> Linux distribution vendors to support the solution I'm proposing,
>>>>>> because they already provide signed digest lists (RPM headers).
>>>>>>
>>>>>> Since the proof of loading a digest list is the digest of the
>>>>>> digest list (included in the list metadata), if RPM headers are
>>>>>> converted to a different format, remote attestation verifiers
>>>>>> cannot check the signature.
>>>>>>
>>>>>> If the concern is security, it would be possible to prevent unsigned
>>>>>> RPM headers from being parsed, if the PGP key type is upstreamed
>>>>>> (adding in CC keyrings@vger.kernel.org).
>>>>>
>>>>> It's a security concern and also a layering violation, there should be no
>>>>> need to parse package file formats in the kernel.
>>>>
>>>> Parsing RPMs is not strictly necessary. Digests from the headers
>>>> can be extracted and written to a new file using the compact data
>>>> format (introduced with patch 7/12).
>>>>
>>>> At boot time, IMA measures this file before digests are uploaded to the
>>>> kernel. At this point, only files with unknown digest will be added
>>>> to the measurement list. At verification time, verifiers recreate the
>>>> measurement list by merging together the digests uploaded to the
>>>> kernel with the unknown digests. Then, they verify the obtained list.
>>>>
>>>> There are two ways to verify the digests: searching them in a reference
>>>> database, or checking a signature. With the 'ima-sig' measurement list
>>>> template, it is possible to verify signatures for each accessed file.
>>>> With this patch set, it is possible to verify the signature of
>>>> the file containing the digests uploaded to the kernel. If the data
>>>> format changes, the signature cannot be verified.
>>>>
>>>> To avoid this limitation, the parsers could be moved to a userspace
>>>> tool which then uploads the parsed digests to the kernel. IMA would
>>>> measure the original files. But, if the tool is compromised, it could
>>>> load digests not included in the parsed files. With the current solution
>>>> this problem does not arise because no changes can be done by userspace
>>>> applications to the uploaded data while digests are parsed by IMA.
>>>>
>>>> I could remove the RPM parser from the patch set for now.
>>>>
>>>> Is the remaining part of the patch set ok, and is the explanation of
>>>> what it does clear?
>>>
>>> From a trusted boot perspective, file measurements are added to the
>>> measurement list, before access to the file is given.  The measurement
>>> list contains ALL measurements, as defined by policy.  This patch set
>>> changes that meaning to be all measurements, as defined by policy,
>>> with the exception of those in a white list.
>>
>> The digest list is also measured, so the measurement list is complete.
>> Verifiers have to check the digest of digest lists. Otherwise, they
>> would get an unknown digest and conclude that the system being verified
>> has been compromised.
>
> Your proposal is basically a pre-approved "batched" measurement, of a
> set of known good measurements, without the corresponding list of
> measurements that this "batched" measurement represents.  Right?

Yes, correct.


> This pre-approved "batched" measurement represents not what has been
> accessed/executed on the system, but what potentially could be
> accessed/executed.  That's a major difference.
>
>> If you prefer, I could add a new policy rule option to avoid file
>> measurements if the digest is in the digest list.
>
> Huh?  Patch "ima: don't report measurements if digests are included in
> the loaded lists" is already doing this.

Since the content of the measurement list depends on the policy,
adding a new option would give a better understanding of what the
measurement list represents. But, I agree that this is redundant.


>>> Changing the fundamental meaning of the measurement list is not
>>> acceptable.  You could define a new securityfs file to differentiate
>>> between the full measurement list and this abbreviated one.  But
>>
>> There cannot be two measurement lists at the same time. Providing the
>> full measurement list (containing the digest of files being accessed)
>> implies that its integrity must be protected with PCR extends, making
>> the optimization done by this patch set useless.
>
> True, so you would be able to configure the system with one or the
> other type of list, not both.  At least there would be a clear
> understanding of what that list represents.
>
>>
>>> before making this sort of change, I would prefer to address the
>>> underlying problem - TPM performance.
>>
>> Even if the TPM driver performance improves significantly (17 seconds
>> for 1000 extends), the boot time delay would be still noticeable
>> (8.5 seconds for normal boot + 24 seconds for 1400 PCR extends).
>
> Agreed, there is still room for more TPM improvements.  Just Nayna's
> one patch, without any other changes, brought the timing down from 53s
> for 1000 extends to just 11s.  (The initial patch was Nack'ed, but
> we're working with the tpmdd and the TCG's device driver work group
> (DDWG).)
>
>> In my opinion, this patch set is useful without considering the
>> performance improvement: reduced size of measurement lists and
>> verification of digest list signatures, instead of file signatures,
>> where signatures are already provided by Linux distributions.
>
> Right, there's always a trade off.  My suggestion, assuming we go with
> this approach, would be to make that trade off clear by using
> different lists.

You mean to add a new kernel command line option to create new
securityfs files instead of the existing ones
(ascii_runtime_measurements, binary_runtime_measurements)? I would
prefer a solution that does not change the interfaces, otherwise
remote attestation agents have to be modified. They can handle
the new list type, as the data format didn't change.

Thanks

Roberto


>>> There are a couple of things that could be done to improve the TPM
>>> driver performance, itself.  Once all of these options have been
>>> pursued, we could then consider batching the measurements to the TPM,
>>> meaning that the measurement list would still contain all the file
>>> measurements, but instead of extending the TPM for each measurement, a
>>> batched hash - a hash of a group of file measurements - would be
>>> extended into the TPM.
>>
>> Probably, I didn't explain clearly that this patch set does not decrease
>> the security of IMA.
>>
>> Extending the PCR for a group of file measurements means that the system
>> can be compromised between two PCR extends without detection because
>> a malicious binary could alter IMA before the next extend.
>
> Currently, a measurement is added to the measurement list and then is
> used to extend the TPM, before returning to the caller.
>
> A performance improvement would still first add the measurement to the
> measurement list, but would then queue and wait for the measurement to
> extend the TPM, before returning to the caller.  In a multi threaded
> environment, the queued measurements could be "batched" - a hash of a
> set of hashes - to extend the TPM.
>
> The delay would be at most two times it takes to extend the TPM - one
> to complete an existing current "batched" extend and another new
> "batched" extend.
>
> The difficulty with this approach is identifying which measurements
> are included in which "batched" measurement.  This approach provides
> the same guarantees as previously.
>
> Before making the TPM performance problem an IMA issue and "fixing" it
> in IMA, I would prefer that the TPM performance be addressed directly.
>
> Mimi
>
>>
>> This patch set extends the PCR with the digest of digest lists, before
>> files are accessed. No actions happen before either the digest lists
>> have been measured or the file measurement is added to the measurement
>> list, if the file digest is not included in the digest list.
>
>
Patch

diff --git a/security/integrity/ima/ima_digest_list.c b/security/integrity/ima/ima_digest_list.c
index c1ef79a..0b5916d 100644
--- a/security/integrity/ima/ima_digest_list.c
+++ b/security/integrity/ima/ima_digest_list.c
@@ -19,11 +19,13 @@ 
 #include "ima.h"
 #include "ima_template_lib.h"
 
+#define RPMTAG_FILEDIGESTS 1035
+
 enum digest_metadata_fields {DATA_ALGO, DATA_DIGEST, DATA_SIGNATURE,
 			     DATA_FILE_PATH, DATA_REF_ID, DATA_TYPE,
 			     DATA__LAST};
 
-enum digest_data_types {DATA_TYPE_COMPACT_LIST};
+enum digest_data_types {DATA_TYPE_COMPACT_LIST, DATA_TYPE_RPM};
 
 enum compact_list_entry_ids {COMPACT_LIST_ID_DIGEST};
 
@@ -33,6 +35,20 @@  struct compact_list_hdr {
 	u32 datalen;
 } __packed;
 
+struct rpm_hdr {
+	u32 magic;
+	u32 reserved;
+	u32 tags;
+	u32 datasize;
+} __packed;
+
+struct rpm_entryinfo {
+	int32_t tag;
+	u32 type;
+	int32_t offset;
+	u32 count;
+} __packed;
+
 static int ima_parse_compact_list(loff_t size, void *buf)
 {
 	void *bufp = buf, *bufendp = buf + size;
@@ -80,6 +96,71 @@  static int ima_parse_compact_list(loff_t size, void *buf)
 	return 0;
 }
 
+static int ima_parse_rpm(loff_t size, void *buf)
+{
+	void *bufp = buf, *bufendp = buf + size;
+	struct rpm_hdr *hdr = bufp;
+	u32 tags = be32_to_cpu(hdr->tags);
+	struct rpm_entryinfo *entry;
+	void *datap = bufp + sizeof(*hdr) + tags * sizeof(struct rpm_entryinfo);
+	int digest_len = hash_digest_size[ima_hash_algo];
+	u8 digest[digest_len];
+	int ret, i, j;
+
+	const unsigned char rpm_header_magic[8] = {
+		0x8e, 0xad, 0xe8, 0x01, 0x00, 0x00, 0x00, 0x00
+	};
+
+	if (size < sizeof(*hdr)) {
+		pr_err("Missing RPM header\n");
+		return -EINVAL;
+	}
+
+	if (memcmp(bufp, rpm_header_magic, sizeof(rpm_header_magic))) {
+		pr_err("Invalid RPM header\n");
+		return -EINVAL;
+	}
+
+	bufp += sizeof(*hdr);
+
+	for (i = 0; i < tags && (bufp + sizeof(*entry)) <= bufendp;
+	     i++, bufp += sizeof(*entry)) {
+		entry = bufp;
+
+		if (be32_to_cpu(entry->tag) != RPMTAG_FILEDIGESTS)
+			continue;
+
+		datap += be32_to_cpu(entry->offset);
+
+		for (j = 0; j < be32_to_cpu(entry->count) &&
+		     datap < bufendp; j++) {
+			if (strlen(datap) == 0) {
+				datap++;
+				continue;
+			}
+
+			if (datap + digest_len * 2 + 1 > bufendp) {
+				pr_err("RPM header read at invalid offset\n");
+				return -EINVAL;
+			}
+
+			ret = hex2bin(digest, datap, digest_len);
+			if (ret < 0)
+				return -EINVAL;
+
+			ret = ima_add_digest_data_entry(digest);
+			if (ret < 0 && ret != -EEXIST)
+				return ret;
+
+			datap += digest_len * 2 + 1;
+		}
+
+		break;
+	}
+
+	return 0;
+}
+
 static int ima_parse_digest_list_data(struct ima_field_data *data)
 {
 	void *digest_list;
@@ -107,6 +188,9 @@  static int ima_parse_digest_list_data(struct ima_field_data *data)
 	case DATA_TYPE_COMPACT_LIST:
 		ret = ima_parse_compact_list(digest_list_size, digest_list);
 		break;
+	case DATA_TYPE_RPM:
+		ret = ima_parse_rpm(digest_list_size, digest_list);
+		break;
 	default:
 		pr_err("Parser for data type %d not implemented\n", data_type);
 		ret = -EINVAL;
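
For readers who want to experiment with the header layout outside the
kernel, the following is a minimal userspace sketch of the same
extraction, not the patch's code. It assumes the layout used above (an
8-byte magic, big-endian tag count and data size, 16-byte index entries,
then the data store) and that RPMTAG_FILEDIGESTS holds NUL-terminated hex
strings, with empty strings skipped as the patch does. The names
rpm_extract_digests(), get_be32() and hex_nibble() are hypothetical. The
sketch validates every length before using it and, unlike the patch,
fails if the data store ends before all counted strings are consumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RPM_HDR_LEN        16u	/* magic(8) + tags(4) + datasize(4) */
#define RPM_ENTRY_LEN      16u	/* tag, type, offset, count */
#define RPMTAG_FILEDIGESTS 1035u

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static int hex_nibble(uint8_t c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Writes up to out_cap binary digests of digest_len bytes each to out.
 * Returns the number of digests written, or -1 on malformed input.
 */
static long rpm_extract_digests(const uint8_t *buf, size_t size,
				size_t digest_len, uint8_t *out,
				size_t out_cap)
{
	static const uint8_t magic[8] = { 0x8e, 0xad, 0xe8, 0x01, 0, 0, 0, 0 };
	size_t tags, data_off, pos, i, j, k, n = 0;

	if (size < RPM_HDR_LEN || memcmp(buf, magic, sizeof(magic)))
		return -1;

	tags = get_be32(buf + 8);
	/* Reject an index that cannot fit before computing data_off. */
	if (tags > (size - RPM_HDR_LEN) / RPM_ENTRY_LEN)
		return -1;
	data_off = RPM_HDR_LEN + tags * RPM_ENTRY_LEN;

	for (i = 0; i < tags; i++) {
		const uint8_t *e = buf + RPM_HDR_LEN + i * RPM_ENTRY_LEN;
		uint32_t offset, count;

		if (get_be32(e) != RPMTAG_FILEDIGESTS)
			continue;

		offset = get_be32(e + 8);
		count = get_be32(e + 12);
		if (offset > size - data_off)
			return -1;
		pos = data_off + offset;

		for (j = 0; j < count; j++) {
			if (pos >= size)
				return -1;
			if (buf[pos] == '\0') {	/* empty string: no digest */
				pos++;
				continue;
			}
			/* Need 2 * digest_len hex characters plus the NUL. */
			if (size - pos < 2 * digest_len + 1 || n >= out_cap)
				return -1;
			for (k = 0; k < digest_len; k++) {
				int hi = hex_nibble(buf[pos + 2 * k]);
				int lo = hex_nibble(buf[pos + 2 * k + 1]);

				if (hi < 0 || lo < 0)
					return -1;
				out[n * digest_len + k] = (uint8_t)((hi << 4) | lo);
			}
			if (buf[pos + 2 * digest_len] != '\0')
				return -1;
			pos += 2 * digest_len + 1;
			n++;
		}
		break;
	}
	return (long)n;
}
```

The caller supplies digest_len (for example 32 for SHA-256) and an output
buffer of out_cap * digest_len bytes; a header without the tag yields 0.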