mbox series

[v2,0/3] initramfs: add support for xattrs in the initial ram disk

Message ID 20190509112420.15671-1-roberto.sassu@huawei.com (mailing list archive)
Headers show
Series initramfs: add support for xattrs in the initial ram disk | expand

Message

Roberto Sassu May 9, 2019, 11:24 a.m. UTC
This patch set aims at solving the following use case: appraise files from
the initial ram disk. To do that, IMA checks the signature/hash from the
security.ima xattr. Unfortunately, this use case cannot be implemented
currently, as the CPIO format does not support xattrs.

This proposal consists in marshaling pathnames and xattrs in a file called
.xattr-list. They are unmarshaled by the CPIO parser after all files have
been extracted.

The difference from v1 (https://lkml.org/lkml/2018/11/22/1182) is that all
xattrs are stored in a single file and not per file (solves the file name
limitation issue, as it is not necessary to add a suffix to files
containing xattrs).

The difference with another proposal
(https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
included in an image without changing the image format, as opposed to
defining a new one. As seen from the discussion, if a new format has to be
defined, it should fix the issues of the existing format, which requires
more time.

To fulfill both requirements, adding support for xattrs in a short time and
defining a new image format properly, this patch set takes an incremental
approach: it introduces a parser of xattrs that can be used either if
xattrs are in a regular file or directly added to the image (this patch set
reuses patch 9/15 of the existing proposal); in addition, it introduces a
wrapper of the xattr parser, to read xattrs from a file.

The changes introduced by this patch set don't cause any compatibility
issue: kernels without the xattr parser simply extracts .xattr-list and
don't unmarshal xattrs; kernels with the xattr parser don't unmarshal
xattrs if .xattr-list is not found in the image.

From the kernel space perspective, backporting this functionality to older
kernels should be very easy. It is sufficient to add a call to the new
function do_readxattrs(). From the user space perspective, no change is
required for the use case. A new dracut module (module-setup.sh) will
execute:

getfattr --absolute-names -d -P -R -e hex -m security.ima \
    <file list> | xattr.awk -b > ${initdir}/.xattr-list

where xattr.awk is the script that marshals xattrs (see patch 3/3). The
same can be done with the initramfs-tools ram disk generator.

Changelog

v1:

- move xattr unmarshaling to CPIO parser


Mimi Zohar (1):
  initramfs: set extended attributes

Roberto Sassu (2):
  fs: add ksys_lsetxattr() wrapper
  initramfs: introduce do_readxattrs()

 fs/xattr.c               |   9 ++-
 include/linux/syscalls.h |   3 +
 init/initramfs.c         | 152 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 161 insertions(+), 3 deletions(-)

Comments

Rob Landley May 9, 2019, 6:34 p.m. UTC | #1
On 5/9/19 6:24 AM, Roberto Sassu wrote:
> This patch set aims at solving the following use case: appraise files from
> the initial ram disk. To do that, IMA checks the signature/hash from the
> security.ima xattr. Unfortunately, this use case cannot be implemented
> currently, as the CPIO format does not support xattrs.
> 
> This proposal consists in marshaling pathnames and xattrs in a file called
> .xattr-list. They are unmarshaled by the CPIO parser after all files have
> been extracted.

So it's in-band signalling that has a higher peak memory requirement.

> The difference with another proposal
> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
> included in an image without changing the image format, as opposed to
> defining a new one. As seen from the discussion, if a new format has to be
> defined, it should fix the issues of the existing format, which requires
> more time.

So you've explicitly chosen _not_ to address Y2038 while you're there.

Rob
Roberto Sassu May 10, 2019, 6:56 a.m. UTC | #2
On 5/9/2019 8:34 PM, Rob Landley wrote:
> On 5/9/19 6:24 AM, Roberto Sassu wrote:
>> This patch set aims at solving the following use case: appraise files from
>> the initial ram disk. To do that, IMA checks the signature/hash from the
>> security.ima xattr. Unfortunately, this use case cannot be implemented
>> currently, as the CPIO format does not support xattrs.
>>
>> This proposal consists in marshaling pathnames and xattrs in a file called
>> .xattr-list. They are unmarshaled by the CPIO parser after all files have
>> been extracted.
> 
> So it's in-band signalling that has a higher peak memory requirement.

This can be modified. Now I allocate the memory necessary for the path
and all xattrs of a file (max: .xattr-list size - 10 bytes). I could
process each xattr individually (max: 255 + 1 + 65536 bytes).


>> The difference with another proposal
>> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
>> included in an image without changing the image format, as opposed to
>> defining a new one. As seen from the discussion, if a new format has to be
>> defined, it should fix the issues of the existing format, which requires
>> more time.
> 
> So you've explicitly chosen _not_ to address Y2038 while you're there.

Can you be more specific?

Thanks

Roberto


> Rob
>
Mimi Zohar May 10, 2019, 11:49 a.m. UTC | #3
On Fri, 2019-05-10 at 08:56 +0200, Roberto Sassu wrote:
> On 5/9/2019 8:34 PM, Rob Landley wrote:
> > On 5/9/19 6:24 AM, Roberto Sassu wrote:

> >> The difference with another proposal
> >> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
> >> included in an image without changing the image format, as opposed to
> >> defining a new one. As seen from the discussion, if a new format has to be
> >> defined, it should fix the issues of the existing format, which requires
> >> more time.
> > 
> > So you've explicitly chosen _not_ to address Y2038 while you're there.
> 
> Can you be more specific?

Right, this patch set avoids incrementing the CPIO magic number and
the resulting changes required (eg. increasing the timestamp field
size), by including a file with the security xattrs in the CPIO.  In
either case, including the security xattrs in the initramfs header or
as a separate file, the initramfs, itself, needs to be signed.

Mimi
Rob Landley May 10, 2019, 8:46 p.m. UTC | #4
On 5/10/19 6:49 AM, Mimi Zohar wrote:
> On Fri, 2019-05-10 at 08:56 +0200, Roberto Sassu wrote:
>> On 5/9/2019 8:34 PM, Rob Landley wrote:
>>> On 5/9/19 6:24 AM, Roberto Sassu wrote:
> 
>>>> The difference with another proposal
>>>> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
>>>> included in an image without changing the image format, as opposed to
>>>> defining a new one. As seen from the discussion, if a new format has to be
>>>> defined, it should fix the issues of the existing format, which requires
>>>> more time.
>>>
>>> So you've explicitly chosen _not_ to address Y2038 while you're there.
>>
>> Can you be more specific?
> 
> Right, this patch set avoids incrementing the CPIO magic number and
> the resulting changes required (eg. increasing the timestamp field
> size), by including a file with the security xattrs in the CPIO.  In
> either case, including the security xattrs in the initramfs header or
> as a separate file, the initramfs, itself, needs to be signed.

The /init binary in the initramfs runs as root and launches all other processes
on the system. Presumably it can write any xattrs it wants to, and doesn't need
any extra permissions granted to it to do so. But as soon as you start putting
xattrs on _other_ files within the initramfs that are _not_ necessarily running
as PID 1, _that's_ when the need to sign the initramfs comes in?

Presumably the signing occurs on the gzipped file. How does that affect the cpio
parsing _after_ it's decompressed? Why would that be part of _this_ patch?

Rob
Mimi Zohar May 10, 2019, 10:38 p.m. UTC | #5
On Fri, 2019-05-10 at 15:46 -0500, Rob Landley wrote:
> On 5/10/19 6:49 AM, Mimi Zohar wrote:
> > On Fri, 2019-05-10 at 08:56 +0200, Roberto Sassu wrote:
> >> On 5/9/2019 8:34 PM, Rob Landley wrote:
> >>> On 5/9/19 6:24 AM, Roberto Sassu wrote:
> > 
> >>>> The difference with another proposal
> >>>> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
> >>>> included in an image without changing the image format, as opposed to
> >>>> defining a new one. As seen from the discussion, if a new format has to be
> >>>> defined, it should fix the issues of the existing format, which requires
> >>>> more time.
> >>>
> >>> So you've explicitly chosen _not_ to address Y2038 while you're there.
> >>
> >> Can you be more specific?
> > 
> > Right, this patch set avoids incrementing the CPIO magic number and
> > the resulting changes required (eg. increasing the timestamp field
> > size), by including a file with the security xattrs in the CPIO.  In
> > either case, including the security xattrs in the initramfs header or
> > as a separate file, the initramfs, itself, needs to be signed.
> 
> The /init binary in the initramfs runs as root and launches all other processes
> on the system. Presumably it can write any xattrs it wants to, and doesn't need
> any extra permissions granted to it to do so. But as soon as you start putting
> xattrs on _other_ files within the initramfs that are _not_ necessarily running
> as PID 1, _that's_ when the need to sign the initramfs comes in?
> 
> Presumably the signing occurs on the gzipped file. How does that affect the cpio
> parsing _after_ it's decompressed? Why would that be part of _this_ patch?

The signing and verification of the initramfs is a separate issue, not
part of this patch set.  The only reason for mentioning it here was to
say that both methods of including the security xattrs require the
initramfs be signed.  Just as the kernel image needs to be signed and
verified, the initramfs should be too.

Mimi
Andy Lutomirski May 11, 2019, 10:44 p.m. UTC | #6
On Thu, May 9, 2019 at 4:27 AM Roberto Sassu <roberto.sassu@huawei.com> wrote:
>
> This patch set aims at solving the following use case: appraise files from
> the initial ram disk. To do that, IMA checks the signature/hash from the
> security.ima xattr. Unfortunately, this use case cannot be implemented
> currently, as the CPIO format does not support xattrs.
>
> This proposal consists in marshaling pathnames and xattrs in a file called
> .xattr-list. They are unmarshaled by the CPIO parser after all files have
> been extracted.
>
> The difference from v1 (https://lkml.org/lkml/2018/11/22/1182) is that all
> xattrs are stored in a single file and not per file (solves the file name
> limitation issue, as it is not necessary to add a suffix to files
> containing xattrs).
>
> The difference with another proposal
> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
> included in an image without changing the image format, as opposed to
> defining a new one. As seen from the discussion, if a new format has to be
> defined, it should fix the issues of the existing format, which requires
> more time.

I read some of those emails.  ISTM that adding TAR support should be
seriously considered.  Sure, it's baroque, but it's very, very well
supported, and it does exactly what we need.

--Andy
Rob Landley May 12, 2019, 4:04 a.m. UTC | #7
On 5/11/19 5:44 PM, Andy Lutomirski wrote:
> I read some of those emails.  ISTM that adding TAR support should be
> seriously considered.  Sure, it's baroque, but it's very, very well
> supported, and it does exactly what we need.

Which means you now have two parsers supported in parallel forevermore, and are
reversing the design decision initially made when this went in without new info.

Also, I just did a tar implementation for toybox: It took me a month to debug it
(_not_ starting from scratch but from a submission), I only just added sparse
file support (because something in the android build was generating a sparse
file), there are historical tarballs I know it won't extract (I'm just testing
against what the current one produces with the default flags), and I haven't
even started on xattr support yet.

Instead I was experimenting with corner cases like "S records replace the
prefix[] field starting at byte 386 with an offset/length pair array, but
prefix[] starts at 345, do those first 41 bytes still function as a prefix and
is there any circumstance under which existing tar binaries will populate them?
Also, why does every instance of an S record generated by gnu/tar end with a
gratuitous length zero segment?"

"cpio -H newc" is a _much_ simpler format. And posix no longer specifies
_either_ format usefully, hasn't for years. From toybox tar's header comment:

 * For the command, see
 *   http://pubs.opengroup.org/onlinepubs/007908799/xcu/tar.html
 * For the modern file format, see
 *
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06
 *   https://en.wikipedia.org/wiki/Tar_(computing)#File_format
 *   https://www.gnu.org/software/tar/manual/html_node/Tar-Internals.html

And no, that isn't _enough_ information, you still have to "tar | hd" a lot and
squint. (There's no current spec, it's pieced together from multiple sources
because posix abdicated responsibility for this to Jorg Schilling.)

Rob

P.S. Yes that gnu/dammit page starts with a "this will be deleted" comment which
according to archive.org has been there for over a dozen years.

P.P.S. Sadly, if you want an actually standardized standard format where
implementations adhere to the standard: IETF RFC 1991 was published in 1996 and
remains compatible with files an archivers in service. Or we could stick with
cpio and make minor changes to it, since we have to remain backwards compatible
with it _anyway_....
Rob Landley May 12, 2019, 4:12 a.m. UTC | #8
On 5/11/19 11:04 PM, Rob Landley wrote:
> P.P.S. Sadly, if you want an actually standardized standard format where
> implementations adhere to the standard: IETF RFC 1991 was published in 1996 and

Nope, darn it, checked my notes and that wasn't it. I thought zip had an RFC,
it's just zlib, deflate, and gzip, and that's not the number of any of them.

I still think sticking with a lightly modified cpio makes the most sense,
just... in band signalling that _doesn't_ solve the y2038 problem, the file size
limit, or address sparse files seems kinda silly.

Rob
Dominik Brodowski May 12, 2019, 9:17 a.m. UTC | #9
On Thu, May 09, 2019 at 01:24:17PM +0200, Roberto Sassu wrote:
> This proposal consists in marshaling pathnames and xattrs in a file called
> .xattr-list. They are unmarshaled by the CPIO parser after all files have
> been extracted.

Couldn't this parsing of the .xattr-list file and the setting of the xattrs
be done equivalently by the initramfs' /init? Why is kernel involvement
actually required here?

Thanks,
	Dominik
H. Peter Anvin May 12, 2019, 10:18 a.m. UTC | #10
On May 12, 2019 2:17:48 AM PDT, Dominik Brodowski <linux@dominikbrodowski.net> wrote:
>On Thu, May 09, 2019 at 01:24:17PM +0200, Roberto Sassu wrote:
>> This proposal consists in marshaling pathnames and xattrs in a file
>called
>> .xattr-list. They are unmarshaled by the CPIO parser after all files
>have
>> been extracted.
>
>Couldn't this parsing of the .xattr-list file and the setting of the
>xattrs
>be done equivalently by the initramfs' /init? Why is kernel involvement
>actually required here?
>
>Thanks,
>	Dominik

There are a lot of things that could/should be done that way...
Mimi Zohar May 12, 2019, 12:52 p.m. UTC | #11
On Sun, 2019-05-12 at 11:17 +0200, Dominik Brodowski wrote:
> On Thu, May 09, 2019 at 01:24:17PM +0200, Roberto Sassu wrote:
> > This proposal consists in marshaling pathnames and xattrs in a file called
> > .xattr-list. They are unmarshaled by the CPIO parser after all files have
> > been extracted.
> 
> Couldn't this parsing of the .xattr-list file and the setting of the xattrs
> be done equivalently by the initramfs' /init? Why is kernel involvement
> actually required here?

It's too late.  The /init itself should be signed and verified.

Mimi
Rob Landley May 12, 2019, 5:05 p.m. UTC | #12
On 5/12/19 7:52 AM, Mimi Zohar wrote:
> On Sun, 2019-05-12 at 11:17 +0200, Dominik Brodowski wrote:
>> On Thu, May 09, 2019 at 01:24:17PM +0200, Roberto Sassu wrote:
>>> This proposal consists in marshaling pathnames and xattrs in a file called
>>> .xattr-list. They are unmarshaled by the CPIO parser after all files have
>>> been extracted.
>>
>> Couldn't this parsing of the .xattr-list file and the setting of the xattrs
>> be done equivalently by the initramfs' /init? Why is kernel involvement
>> actually required here?
> 
> It's too late.  The /init itself should be signed and verified.

If the initramfs cpio.gz image was signed and verified by the extractor, how is
the init in it _not_ verified?

Rob