Message ID | 20181004203007.217320-2-mjg59@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/3] VFS: Add a call to obtain a file's hash |
On Thu, 2018-10-04 at 13:30 -0700, Matthew Garrett wrote:
> IMA wants to know what the hash of a file is, and currently does so by
> reading the entire file and generating the hash. Some filesystems may
> have the ability to store the hash in a secure manner resistant to
> offline attacks (eg, filesystem-level file signing), and in that case
> it's a performance win for IMA to be able to use that rather than having
> to re-hash everything. This patch simply adds VFS-level support for
> calling down to filesystems.

This patch description starts out saying that IMA needs the file hash
without explaining why. Without that explanation, simply extracting
the file hash included in the file signature might sound plausible,
but kind of defeats the purpose of IMA.

Mimi

>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> ---
>  fs/read_write.c    | 24 ++++++++++++++++++++++++
>  include/linux/fs.h |  6 +++++-
>  2 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 39b4a21dd933..9ba3ce4bb838 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -2081,3 +2081,27 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
>  	return ret;
>  }
>  EXPORT_SYMBOL(vfs_dedupe_file_range);
> +
> +/**
> + * vfs_gethash - obtain a file's hash
> + * @file: file structure in question
> + * @hash_algo: the hash algorithm requested
> + * @buf: buffer to return the hash in
> + * @size: size allocated for the buffer by the caller
> + *
> + * This function allows filesystems that support securely storing the hash
> + * of a file to return it rather than forcing the kernel to recalculate it.
> + * Filesystems that cannot provide guarantees about the hash being resistant
> + * to offline attack should not implement this functionality.
> + *
> + * Returns 0 on success, -EOPNOTSUPP if the filesystem doesn't support it.
> + */
> +int vfs_get_hash(struct file *file, enum hash_algo hash, uint8_t *buf,
> +		 size_t size)
> +{
> +	if (!file->f_op->get_hash)
> +		return -EOPNOTSUPP;
> +
> +	return file->f_op->get_hash(file, hash, buf, size);
> +}
> +EXPORT_SYMBOL(vfs_get_hash);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 6c0b4a1c22ff..540316cfd461 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -40,6 +40,7 @@
>
>  #include <asm/byteorder.h>
>  #include <uapi/linux/fs.h>
> +#include <uapi/linux/hash_info.h>
>
>  struct backing_dev_info;
>  struct bdi_writeback;
> @@ -1764,6 +1765,8 @@ struct file_operations {
>  	int (*dedupe_file_range)(struct file *, loff_t, struct file *, loff_t,
>  			u64);
>  	int (*fadvise)(struct file *, loff_t, loff_t, int);
> +	int (*get_hash)(struct file *, enum hash_algo hash, uint8_t *buf,
> +			size_t size);
>  } __randomize_layout;
>
>  struct inode_operations {
> @@ -1838,7 +1841,8 @@ extern int vfs_dedupe_file_range(struct file *file,
>  extern int vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
>  				     struct file *dst_file, loff_t dst_pos,
>  				     u64 len);
> -
> +extern int vfs_get_hash(struct file *file, enum hash_algo hash, uint8_t *buf,
> +			size_t size);
>
>  struct super_operations {
>  	struct inode *(*alloc_inode)(struct super_block *sb);

On Thu, Oct 11, 2018 at 8:22 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
>
> On Thu, 2018-10-04 at 13:30 -0700, Matthew Garrett wrote:
> > IMA wants to know what the hash of a file is, and currently does so by
> > reading the entire file and generating the hash. Some filesystems may
> > have the ability to store the hash in a secure manner resistant to
> > offline attacks (eg, filesystem-level file signing), and in that case
> > it's a performance win for IMA to be able to use that rather than having
> > to re-hash everything. This patch simply adds VFS-level support for
> > calling down to filesystems.
>
> This patch description starts out saying that IMA needs the file hash
> without explaining why. Without that explanation, simply extracting
> the file hash included in the file signature might sound plausible,
> but kind of defeats the purpose of IMA.

I'm not sure how it defeats the purpose - IMA wants to know the hash
of a file so it can either log it or compare it against a signature,
and it currently obtains this hash by reading the entire file at
measurement time. If the filesystem later returns different data then
IMA won't notice, which allows a malicious filesystem to bypass the
measurements - there's no guarantee that we won't evict large parts of
the copy of an executable that IMA read, and the filesystem can give
us back a modified page when we page it back in. So IMA fundamentally
relies on the filesystem to be trustworthy, and if we rely on the
filesystem to be trustworthy then we should be able to rely on it to
accurately store and provide the hash of a file.

On Thu, Oct 11, 2018 at 11:21 AM Matthew Garrett <mjg59@google.com> wrote:
>
> On Thu, Oct 11, 2018 at 8:22 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> >
> > This patch description starts out saying that IMA needs the file hash
> > without explaining why. Without that explanation, simply extracting
> > the file hash included in the file signature might sound plausible,
> > but kind of defeats the purpose of IMA.
>
> I'm not sure how it defeats the purpose - IMA wants to know the hash
> of a file so it can either log it or compare it against a signature,
> and it currently obtains this hash by reading the entire file at
> measurement time. If the filesystem later returns different data then
> IMA won't notice, which allows a malicious filesystem to bypass the
> measurements - there's no guarantee that we won't evict large parts of
> the copy of an executable that IMA read, and the filesystem can give
> us back a modified page when we page it back in. So IMA fundamentally
> relies on the filesystem to be trustworthy, and if we rely on the
> filesystem to be trustworthy then we should be able to rely on it to
> accurately store and provide the hash of a file.

Oh, to clarify on the signature part of things - it would obviously be
inappropriate to, say, just read the hash out of security.ima and hand
that back. But for a hypothetical case where the filesystem itself
verifies the signature, then the filesystem would abort the
transaction if the signature didn't match and it seems reasonable to
avoid doing the validation twice (once up front and then again on
every read)

On Thu, 2018-10-11 at 11:24 -0700, Matthew Garrett wrote:
> On Thu, Oct 11, 2018 at 11:21 AM Matthew Garrett <mjg59@google.com> wrote:
> >
> > On Thu, Oct 11, 2018 at 8:22 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> > >
> > > This patch description starts out saying that IMA needs the file hash
> > > without explaining why. Without that explanation, simply extracting
> > > the file hash included in the file signature might sound plausible,
> > > but kind of defeats the purpose of IMA.
> >
> > I'm not sure how it defeats the purpose - IMA wants to know the hash
> > of a file so it can either log it or compare it against a signature,
> > and it currently obtains this hash by reading the entire file at
> > measurement time. If the filesystem later returns different data then
> > IMA won't notice, which allows a malicious filesystem to bypass the
> > measurements - there's no guarantee that we won't evict large parts of
> > the copy of an executable that IMA read, and the filesystem can give
> > us back a modified page when we page it back in. So IMA fundamentally
> > relies on the filesystem to be trustworthy, and if we rely on the
> > filesystem to be trustworthy then we should be able to rely on it to
> > accurately store and provide the hash of a file.
>
> Oh, to clarify on the signature part of things - it would obviously be
> inappropriate to, say, just read the hash out of security.ima and hand
> that back.

Right, reading it either directly or extracted from the file signature
stored in security.ima.

> But for a hypothetical case where the filesystem itself
> verifies the signature, then the filesystem would abort the
> transaction if the signature didn't match and it seems reasonable to
> avoid doing the validation twice (once up front and then again on
> every read)

Right, this is a hypothetical scenario as far as I'm aware, since none
of the filesystems are currently calculating and storing the file
hash. The default should be for IMA to re-calculate the file hash.

Mimi

On Thu, Oct 11, 2018 at 11:37 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> On Thu, 2018-10-11 at 11:24 -0700, Matthew Garrett wrote:
> > But for a hypothetical case where the filesystem itself
> > verifies the signature, then the filesystem would abort the
> > transaction if the signature didn't match and it seems reasonable to
> > avoid doing the validation twice (once up front and then again on
> > every read)
>
> Right, this is a hypothetical scenario as far as I'm aware, since none
> of the filesystems are currently calculating and storing the file
> hash. The default should be for IMA to re-calculate the file hash.

There are FUSE filesystems that do.
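
The two positions in this exchange (use a filesystem-provided hash when one exists, but re-calculate by default) can be pictured with a small userspace model. This is only a sketch under stated assumptions, not IMA code: `struct file`, `struct file_operations` and `enum hash_algo` are simplified stand-ins for the kernel types, `demofs_get_hash`, `demo_hash_contents`, `demo_collect_hash` and the `trust_fs_hash` knob are hypothetical names invented here, and FNV-1a is swapped in for a real cryptographic digest purely so the example has no dependencies.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins for the kernel types named in the patch. */
enum hash_algo { HASH_ALGO_SHA256 };

struct file;

struct file_operations {
	int (*get_hash)(struct file *, enum hash_algo hash, uint8_t *buf,
			size_t size);
};

struct file {
	const struct file_operations *f_op;
	const uint8_t *data;	/* toy in-memory file contents */
	size_t len;
};

#define DEMO_DIGEST_SIZE 8	/* size of the toy digest used below */

/* Same dispatch logic as the patch's vfs_get_hash(). */
static int vfs_get_hash(struct file *file, enum hash_algo hash, uint8_t *buf,
			size_t size)
{
	if (!file->f_op->get_hash)
		return -EOPNOTSUPP;

	return file->f_op->get_hash(file, hash, buf, size);
}

/* Hypothetical filesystem hook: hands back a fixed, recognisable digest. */
static int demofs_get_hash(struct file *file, enum hash_algo hash,
			   uint8_t *buf, size_t size)
{
	(void)file;
	(void)hash;
	if (size < DEMO_DIGEST_SIZE)
		return -ERANGE;
	memset(buf, 0x5a, DEMO_DIGEST_SIZE);
	return 0;
}

/* One filesystem that implements the hook and one that does not. */
static const struct file_operations demofs_fops = { .get_hash = demofs_get_hash };
static const struct file_operations plainfs_fops = { .get_hash = NULL };

/* Stand-in for "read the entire file and hash it". FNV-1a, NOT cryptographic. */
static void demo_hash_contents(const struct file *file, uint8_t *buf)
{
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < file->len; i++) {
		h ^= file->data[i];
		h *= 0x100000001b3ULL;
	}
	memcpy(buf, &h, DEMO_DIGEST_SIZE);
}

/*
 * Hypothetical caller: ask the filesystem only when policy allows it,
 * and fall back to hashing the contents if the hook is absent.
 * *from_fs is set to 1 only when the filesystem supplied the digest.
 */
int demo_collect_hash(struct file *file, int trust_fs_hash, uint8_t *buf,
		      size_t size, int *from_fs)
{
	int ret;

	*from_fs = 0;
	if (size < DEMO_DIGEST_SIZE)
		return -ERANGE;

	if (trust_fs_hash) {
		ret = vfs_get_hash(file, HASH_ALGO_SHA256, buf, size);
		if (ret == 0) {
			*from_fs = 1;
			return 0;
		}
		if (ret != -EOPNOTSUPP)
			return ret;	/* real error: do not mask it */
	}

	demo_hash_contents(file, buf);
	return 0;
}
```

With `trust_fs_hash` set to 0 the caller always re-hashes, which corresponds to the default Mimi asks for. Setting it to 1 takes the shortcut Matthew describes, but only on filesystems that provide the hook.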

diff --git a/fs/read_write.c b/fs/read_write.c
index 39b4a21dd933..9ba3ce4bb838 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -2081,3 +2081,27 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
 	return ret;
 }
 EXPORT_SYMBOL(vfs_dedupe_file_range);
+
+/**
+ * vfs_gethash - obtain a file's hash
+ * @file: file structure in question
+ * @hash_algo: the hash algorithm requested
+ * @buf: buffer to return the hash in
+ * @size: size allocated for the buffer by the caller
+ *
+ * This function allows filesystems that support securely storing the hash
+ * of a file to return it rather than forcing the kernel to recalculate it.
+ * Filesystems that cannot provide guarantees about the hash being resistant
+ * to offline attack should not implement this functionality.
+ *
+ * Returns 0 on success, -EOPNOTSUPP if the filesystem doesn't support it.
+ */
+int vfs_get_hash(struct file *file, enum hash_algo hash, uint8_t *buf,
+		 size_t size)
+{
+	if (!file->f_op->get_hash)
+		return -EOPNOTSUPP;
+
+	return file->f_op->get_hash(file, hash, buf, size);
+}
+EXPORT_SYMBOL(vfs_get_hash);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6c0b4a1c22ff..540316cfd461 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -40,6 +40,7 @@
 
 #include <asm/byteorder.h>
 #include <uapi/linux/fs.h>
+#include <uapi/linux/hash_info.h>
 
 struct backing_dev_info;
 struct bdi_writeback;
@@ -1764,6 +1765,8 @@ struct file_operations {
 	int (*dedupe_file_range)(struct file *, loff_t, struct file *, loff_t,
 			u64);
 	int (*fadvise)(struct file *, loff_t, loff_t, int);
+	int (*get_hash)(struct file *, enum hash_algo hash, uint8_t *buf,
+			size_t size);
 } __randomize_layout;
 
 struct inode_operations {
@@ -1838,7 +1841,8 @@ extern int vfs_dedupe_file_range(struct file *file,
 extern int vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
 				     struct file *dst_file, loff_t dst_pos,
 				     u64 len);
-
+extern int vfs_get_hash(struct file *file, enum hash_algo hash, uint8_t *buf,
+			size_t size);
 
 struct super_operations {
 	struct inode *(*alloc_inode)(struct super_block *sb);
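
To make the new hook more concrete, here is a rough userspace model of what a filesystem-side `->get_hash` implementation might look like and how `vfs_get_hash()` dispatches to it. Everything beyond the dispatch function is an assumption made for illustration: `demofs_get_hash`, `struct demo_stored_hash` and the `private_data` layout are hypothetical, the types are simplified stand-ins for the kernel ones, and the `-ERANGE` return for an undersized buffer is a guess, since the patch does not say what should happen in that case.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins for kernel types; names mirror the patch. */
enum hash_algo { HASH_ALGO_SHA1, HASH_ALGO_SHA256 };

struct file;

struct file_operations {
	int (*get_hash)(struct file *, enum hash_algo hash, uint8_t *buf,
			size_t size);
};

struct file {
	const struct file_operations *f_op;
	void *private_data;
};

/* Hypothetical per-file record a filesystem might keep alongside the data. */
struct demo_stored_hash {
	enum hash_algo algo;
	uint8_t digest[32];
};

/* Hypothetical filesystem implementation of ->get_hash. */
static int demofs_get_hash(struct file *file, enum hash_algo hash,
			   uint8_t *buf, size_t size)
{
	const struct demo_stored_hash *stored = file->private_data;

	/* Assumption: report "unsupported" if no hash exists for this algorithm. */
	if (!stored || stored->algo != hash)
		return -EOPNOTSUPP;
	/* Assumption: the patch leaves the short-buffer case undefined. */
	if (size < sizeof(stored->digest))
		return -ERANGE;

	memcpy(buf, stored->digest, sizeof(stored->digest));
	return 0;
}

static const struct file_operations demofs_fops = {
	.get_hash = demofs_get_hash,
};

/* Same dispatch logic as the patch's vfs_get_hash(). */
int vfs_get_hash(struct file *file, enum hash_algo hash, uint8_t *buf,
		 size_t size)
{
	if (!file->f_op->get_hash)
		return -EOPNOTSUPP;

	return file->f_op->get_hash(file, hash, buf, size);
}
```

A caller cannot tell from `-EOPNOTSUPP` alone whether the hook is missing or the requested algorithm is simply not the one stored. In this sketch both cases are treated the same way, on the assumption that the caller would fall back to hashing the file itself.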

IMA wants to know what the hash of a file is, and currently does so by
reading the entire file and generating the hash. Some filesystems may
have the ability to store the hash in a secure manner resistant to
offline attacks (eg, filesystem-level file signing), and in that case
it's a performance win for IMA to be able to use that rather than having
to re-hash everything. This patch simply adds VFS-level support for
calling down to filesystems.

Signed-off-by: Matthew Garrett <mjg59@google.com>
---
 fs/read_write.c    | 24 ++++++++++++++++++++++++
 include/linux/fs.h |  6 +++++-
 2 files changed, 29 insertions(+), 1 deletion(-)