Message ID | 20200514092415.5389-1-jth@kernel.org (mailing list archive) |
---|---|
Series | Add file-system authentication to BTRFS |
On Thu, May 14, 2020 at 11:24:12AM +0200, Johannes Thumshirn wrote:
> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> This series adds file-system authentication to BTRFS.
>
> Unlike other verified file-system techniques like fs-verity the
> authenticated version of BTRFS does not need extra meta-data on disk.
>
> This works because in BTRFS every on-disk block has a checksum, for meta-data
> the checksum is in the header of each meta-data item. For data blocks, a
> separate checksum tree exists, which holds the checksums for each block.
>
> Currently BTRFS supports CRC32C, XXHASH64, SHA256 and Blake2b for checksumming
> these blocks. This series adds a new checksum algorithm, HMAC(SHA-256), which
> does need an authentication key. When no, or an incorrect authentication key
> is supplied no valid checksum can be generated and a read, fsck or scrub
> operation would detect invalid or tampered blocks once the file-system is
> mounted again with the correct key.

As mentioned in the discussion under the LWN article,
https://lwn.net/Articles/818842/, ZFS implements a split hash where one half
is a (partial) authenticated hash and the other half is a checksum. This
allows at least some sort of verification when the auth key is not
available. This applies to the fixed-size checksum area of metadata blocks;
for data we can afford to store both hashes in full.

I like this idea, however it brings interesting design decisions, "what
if" questions and corner cases:

- what hashes to use for the plain checksum, and thus what's the split
- what if one hash matches and the other does not
- increased checksum calculation time due to doubled block read
- whether to store the same partial hash+checksum for data too

As the authenticated hash is the main use case, I'd reserve most of the
32-byte buffer for it and use a weak hash for the checksum: 24 bytes for
HMAC and 8 bytes for the checksum. As an example: sha256+xxhash or
blake2b+xxhash.

I'd outright skip crc32c for the checksum so we have only a small number
of authenticated checksums and avoid too many options, e.g.
hmac-sha256-crc32c etc. The result will still be 2 authenticated hashes
with the added checksum hardcoded to xxhash.
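The 24+8 byte layout proposed above can be sketched in a few lines. This is only an illustration under stated assumptions, not code from the series: the function names are hypothetical, an unkeyed 8-byte BLAKE2b digest stands in for xxhash64 (which is not in the Python standard library), and the policy for "one hash matches and the other does not" is one possible answer to the open question, not a decided one.

```python
import hashlib
import hmac
from typing import Optional

CSUM_SIZE = 32   # fixed-size checksum area discussed in the thread
AUTH_SIZE = 24   # bytes kept from the truncated HMAC
PLAIN_SIZE = 8   # bytes of unkeyed checksum


def plain_checksum(block: bytes) -> bytes:
    # Stand-in for xxhash64 (assumption): an unkeyed 8-byte BLAKE2b digest.
    return hashlib.blake2b(block, digest_size=PLAIN_SIZE).digest()


def split_csum(key: bytes, block: bytes) -> bytes:
    # Compute the full 32-byte HMAC, keep 24 bytes, append the plain checksum.
    full = hmac.new(key, block, hashlib.sha256).digest()
    return full[:AUTH_SIZE] + plain_checksum(block)


def verify(csum: bytes, block: bytes, key: Optional[bytes] = None) -> str:
    plain_ok = hmac.compare_digest(csum[AUTH_SIZE:], plain_checksum(block))
    if not plain_ok:
        return "corrupt"                  # hypothetical policy: plain mismatch always fails
    if key is None:
        return "intact-unauthenticated"   # integrity only, no key available
    auth = hmac.new(key, block, hashlib.sha256).digest()[:AUTH_SIZE]
    if hmac.compare_digest(csum[:AUTH_SIZE], auth):
        return "authentic"
    return "auth-mismatch"                # checksum matches, HMAC does not
```

Without the key, a reader can still tell accidental corruption from an intact block, which is the use case the split is meant to enable; only a holder of the key can tell an intact block from a forged one.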
On 25/05/2020 15:11, David Sterba wrote:
> On Thu, May 14, 2020 at 11:24:12AM +0200, Johannes Thumshirn wrote:
>> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>
>> This series adds file-system authentication to BTRFS.
>>
>> Unlike other verified file-system techniques like fs-verity the
>> authenticated version of BTRFS does not need extra meta-data on disk.
>>
>> This works because in BTRFS every on-disk block has a checksum, for meta-data
>> the checksum is in the header of each meta-data item. For data blocks, a
>> separate checksum tree exists, which holds the checksums for each block.
>>
>> Currently BTRFS supports CRC32C, XXHASH64, SHA256 and Blake2b for checksumming
>> these blocks. This series adds a new checksum algorithm, HMAC(SHA-256), which
>> does need an authentication key. When no, or an incorrect authentication key
>> is supplied no valid checksum can be generated and a read, fsck or scrub
>> operation would detect invalid or tampered blocks once the file-system is
>> mounted again with the correct key.
>
> As mentioned in the discussion under LWN article, https://lwn.net/Articles/818842/
> ZFS implements split hash where one half is (partial) authenticated hash
> and the other half is a checksum. This allows to have at least some sort
> of verification when the auth key is not available. This applies to the
> fixed size checksum area of metadata blocks, for data we can afford to
> store both hashes in full.
>
> I like this idea, however it brings interesting design decisions, "what
> if" and corner cases:
>
> - what hashes to use for the plain checksum, and thus what's the split
> - what if one hash matches and the other not
> - increased checksum calculation time due to doubled block read
> - whether to store the same partial hash+checksum for data too
>
> As the authenticated hash is the main usecase, I'd reserve most of the
> 32 byte buffer to it and use a weak hash for checksum: 24 bytes for HMAC
> and 8 bytes for checksum. As an example: sha256+xxhash or
> blake2b+xxhash.
>
> I'd outright skip crc32c for the checksum so we have only small number
> of authenticated checksums and avoid too many options, eg.
> hmac-sha256-crc32c etc. The result will be still 2 authenticated hashes
> with the added checksum hardcoded to xxhash.
>

Hmm, I'm really not a fan of this. We would have to use something like
sha2-224 to get the room for the 2nd checksum. So we're using a weaker
hash just so we can add a second checksum. On the other hand, you've asked
me to add the known pieces of information into the hashes as a salt to
"make attacks harder at a small cost".
On Tue, May 26, 2020 at 07:50:53AM +0000, Johannes Thumshirn wrote:
> On 25/05/2020 15:11, David Sterba wrote:
> > On Thu, May 14, 2020 at 11:24:12AM +0200, Johannes Thumshirn wrote:
> > As mentioned in the discussion under LWN article, https://lwn.net/Articles/818842/
> > ZFS implements split hash where one half is (partial) authenticated hash
> > and the other half is a checksum. This allows to have at least some sort
> > of verification when the auth key is not available. This applies to the
> > fixed size checksum area of metadata blocks, for data we can afford to
> > store both hashes in full.
> >
> > I like this idea, however it brings interesting design decisions, "what
> > if" and corner cases:
> >
> > - what hashes to use for the plain checksum, and thus what's the split
> > - what if one hash matches and the other not
> > - increased checksum calculation time due to doubled block read
> > - whether to store the same partial hash+checksum for data too
> >
> > As the authenticated hash is the main usecase, I'd reserve most of the
> > 32 byte buffer to it and use a weak hash for checksum: 24 bytes for HMAC
> > and 8 bytes for checksum. As an example: sha256+xxhash or
> > blake2b+xxhash.
> >
> > I'd outright skip crc32c for the checksum so we have only small number
> > of authenticated checksums and avoid too many options, eg.
> > hmac-sha256-crc32c etc. The result will be still 2 authenticated hashes
> > with the added checksum hardcoded to xxhash.
>
> Hmm I'm really not a fan of this. We would have to use something like
> sha2-224 to get the room for the 2nd checksum. So we're using a weaker
> hash just so we can add a second checksum.

The idea is to calculate the full hash (32 bytes) and store only a part of
it (24 bytes). Yes, this means there's some information loss and weakening,
but it enables a use case.

> On the other hand you've asked
> me to add the known pieces of information into the hashes as a salt to
> "make attacks harder at a small cost".

Yes, and this makes it harder to attack the hash; it should be there
regardless of the additional checksums.
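The salting idea mentioned above can be sketched as follows. The thread does not say which "known pieces of information" are meant, so the choice of the filesystem UUID and the block's logical address is purely an assumption for illustration, as are the function name and the byte layout.

```python
import hashlib
import hmac
import struct


def salted_hmac(key: bytes, fsid: bytes, bytenr: int, block: bytes) -> bytes:
    # Feed known per-block context into the HMAC before the block contents.
    # Assumed salt: 16-byte filesystem UUID plus little-endian logical address.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(fsid)
    mac.update(struct.pack("<Q", bytenr))
    mac.update(block)
    return mac.digest()
```

With a salt like this, a block that is valid at one location no longer verifies if it is copied to another location or another filesystem, even though its contents and the key are unchanged. Storing only the first 24 bytes of the result (`salted_hmac(...)[:24]`) corresponds to the truncation described above.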
On 26/05/2020 13:54, David Sterba wrote:
> On Tue, May 26, 2020 at 07:50:53AM +0000, Johannes Thumshirn wrote:
>> On 25/05/2020 15:11, David Sterba wrote:
>>> On Thu, May 14, 2020 at 11:24:12AM +0200, Johannes Thumshirn wrote:
>>> As mentioned in the discussion under LWN article, https://lwn.net/Articles/818842/
>>> ZFS implements split hash where one half is (partial) authenticated hash
>>> and the other half is a checksum. This allows to have at least some sort
>>> of verification when the auth key is not available. This applies to the
>>> fixed size checksum area of metadata blocks, for data we can afford to
>>> store both hashes in full.
>>>
>>> I like this idea, however it brings interesting design decisions, "what
>>> if" and corner cases:
>>>
>>> - what hashes to use for the plain checksum, and thus what's the split
>>> - what if one hash matches and the other not
>>> - increased checksum calculation time due to doubled block read
>>> - whether to store the same partial hash+checksum for data too
>>>
>>> As the authenticated hash is the main usecase, I'd reserve most of the
>>> 32 byte buffer to it and use a weak hash for checksum: 24 bytes for HMAC
>>> and 8 bytes for checksum. As an example: sha256+xxhash or
>>> blake2b+xxhash.
>>>
>>> I'd outright skip crc32c for the checksum so we have only small number
>>> of authenticated checksums and avoid too many options, eg.
>>> hmac-sha256-crc32c etc. The result will be still 2 authenticated hashes
>>> with the added checksum hardcoded to xxhash.
>>
>> Hmm I'm really not a fan of this. We would have to use something like
>> sha2-224 to get the room for the 2nd checksum. So we're using a weaker
>> hash just so we can add a second checksum.
>
> The idea is to calculate full hash (32 bytes) and store only the part
> (24 bytes). Yes this means there's some information loss and weakening,
> but enables a usecase.

I'm not enough of a security expert to be able to judge this. Eric, can I
hear your opinion on this?

Thanks,
Johannes
On 2020/5/14 5:24 PM, Johannes Thumshirn wrote:
> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> This series adds file-system authentication to BTRFS.
>
> Unlike other verified file-system techniques like fs-verity the
> authenticated version of BTRFS does not need extra meta-data on disk.
>
> This works because in BTRFS every on-disk block has a checksum, for meta-data
> the checksum is in the header of each meta-data item. For data blocks, a
> separate checksum tree exists, which holds the checksums for each block.
>
> Currently BTRFS supports CRC32C, XXHASH64, SHA256 and Blake2b for checksumming
> these blocks. This series adds a new checksum algorithm, HMAC(SHA-256), which
> does need an authentication key. When no, or an incorrect authentication key
> is supplied no valid checksum can be generated and a read, fsck or scrub
> operation would detect invalid or tampered blocks once the file-system is
> mounted again with the correct key.
>
> Getting the key inside the kernel is out of scope of this implementation, the
> file-system driver assumes the key is already in the kernel's keyring at mount
> time.
>
> There was interest in also using keyed Blake2b from the community, but this
> support is not yet included.
>
> I have CCed Eric Biggers and Richard Weinberger in the submission, as they
> previously have worked on filesystem authentication and I hope we can get
> input from them as well.
>
> Example usage:
> Create a file-system with authentication key 0123456
> mkfs.btrfs --csum "hmac(sha256)" --auth-key 0123456 /dev/disk
>
> Add the key to the kernel's keyring as keyid 'btrfs:foo'
> keyctl add logon btrfs:foo 0123456 @u
>
> Mount the fs using the 'btrfs:foo' key
> mount -t btrfs -o auth_key=btrfs:foo,auth_hash_name="hmac(sha256)" /dev/disk /mnt/point
>
> Note, this is a re-base of the work I did when I was still at SUSE, hence the
> S-o-b being my SUSE address, while the Author being with my WDC address (to
> not generate bouncing mails).
>
> Changes since v2:
> - Select CONFIG_CRYPTO_HMAC and CONFIG_KEYS (kbuild robot)
> - Fix double free in error path
> - Fix memory leak in error path
> - Disallow nodatasum and nodatacow when authentication is in use (Eric)

Since we're disabling NODATACOW usages, can we also disable the
following features?

- v1 space cache
  The v1 space cache uses a NODATACOW file to store the space cache.
  Although it has an inline csum, that csum is fixed to crc32c.
  So an attacker can easily use this hole to corrupt the space cache
  and mount a DoS attack.

- fallocate
  I'm not 100% sure about this, but since nodatacow is already a second
  class citizen in btrfs, maybe not supporting fallocate is not a
  strange move.

Thanks,
Qu

> - Pass in authentication algorithm as mount option (Eric)
> - Don't use the word "replay" in the documentation, as it is wrong and
>   harmful in this context (Eric)
> - Force key name to begin with 'btrfs:' (Eric)
> - Use '4' as on-disk checksum type for HMAC(SHA256) to not have holes in the
>   checksum types array.
>
> Changes since v1:
> - None, only rebased the series
>
> Johannes Thumshirn (3):
>   btrfs: rename btrfs_parse_device_options back to
>     btrfs_parse_early_options
>   btrfs: add authentication support
>   btrfs: document btrfs authentication
>
>  .../filesystems/btrfs-authentication.rst | 168 ++++++++++++++++++
>  fs/btrfs/Kconfig                         |   2 +
>  fs/btrfs/ctree.c                         |  22 ++-
>  fs/btrfs/ctree.h                         |   5 +-
>  fs/btrfs/disk-io.c                       |  71 +++++++-
>  fs/btrfs/ioctl.c                         |   7 +-
>  fs/btrfs/super.c                         |  65 ++++++-
>  include/uapi/linux/btrfs_tree.h          |   1 +
>  8 files changed, 326 insertions(+), 15 deletions(-)
>  create mode 100644 Documentation/filesystems/btrfs-authentication.rst
>
On Wed, May 27, 2020 at 10:08:06AM +0800, Qu Wenruo wrote:
> > Changes since v2:
> > - Select CONFIG_CRYPTO_HMAC and CONFIG_KEYS (kbuild robot)
> > - Fix double free in error path
> > - Fix memory leak in error path
> > - Disallow nodatasum and nodatacow when authentication is in use (Eric)
>
> Since we're disabling NODATACOW usages, can we also disable the
> following features?
> - v1 space cache
>   V1 space cache uses NODATACOW file to store space cache, although it
>   has inline csum, but it's fixed to crc32c. So attacker can easily
>   utilize this hole to mess space cache, and do some DoS attack.

That's a good point.

The v1 space cache will be phased out, but not within the timeframe in
which authentication gets merged. At this point we don't even have a way
to select v2 at mkfs time (it's work in progress though), so it would be
required to switch to v2 on the first mount.

> - fallocate
>   I'm not 100% sure about this, but since nodatacow is already a second
>   class citizen in btrfs, maybe not supporting fallocate is not a
>   strange move.

Fallocate is a standard file operation; not supporting it would be quite
strange. What's the problem with fallocate and authentication?
On 2020/5/27 7:27 PM, David Sterba wrote:
> On Wed, May 27, 2020 at 10:08:06AM +0800, Qu Wenruo wrote:
>>> Changes since v2:
>>> - Select CONFIG_CRYPTO_HMAC and CONFIG_KEYS (kbuild robot)
>>> - Fix double free in error path
>>> - Fix memory leak in error path
>>> - Disallow nodatasum and nodatacow when authentication is in use (Eric)
>>
>> Since we're disabling NODATACOW usages, can we also disable the
>> following features?
>> - v1 space cache
>>   V1 space cache uses NODATACOW file to store space cache, although it
>>   has inline csum, but it's fixed to crc32c. So attacker can easily
>>   utilize this hole to mess space cache, and do some DoS attack.
>
> That's a good point.
>
> The v1 space cache will be phased out but it won't be in a timeframe
> we'll get in the authentication. At this point we don't even have a way
> to select v2 at mkfs time (it's work in progress though), so it would be
> required to switch to v2 on the first mount.
>
>> - fallocate
>>   I'm not 100% sure about this, but since nodatacow is already a second
>>   class citizen in btrfs, maybe not supporting fallocate is not a
>>   strange move.
>
> Fallocate is a standard file operation, not supporting would be quite
> strange. What's the problem with fallocate and authentication?
>

As said, I'm not that sure about preallocation, but it's the remaining
user of nodatacow.

Although it's a pretty common interface, in btrfs it doesn't really make
much sense. In a case like fallocate followed by a snapshot, there is
really no benefit from writing into the fallocated range. Not to mention
the extra cross-ref check involved when writing into a possibly
preallocated range.

Thanks,
Qu
On Wed, May 27, 2020 at 10:08:06AM +0800, Qu Wenruo wrote:
> > Changes since v2:
> > - Select CONFIG_CRYPTO_HMAC and CONFIG_KEYS (kbuild robot)
> > - Fix double free in error path
> > - Fix memory leak in error path
> > - Disallow nodatasum and nodatacow when authentication is in use (Eric)
>
> Since we're disabling NODATACOW usages, can we also disable the
> following features?
> - v1 space cache
>   V1 space cache uses NODATACOW file to store space cache, although it
>   has inline csum, but it's fixed to crc32c. So attacker can easily
>   utilize this hole to mess space cache, and do some DoS attack.
>
> - fallocate
>   I'm not 100% sure about this, but since nodatacow is already a second
>   class citizen in btrfs, maybe not supporting fallocate is not a
>   strange move.

- swapfile
  NODATACOW is required for swapfiles, so authentication and swapfiles
  are mutually exclusive.
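The restrictions discussed so far can be summarized as a small validation table. This is a hedged sketch only: the feature names and the helper are hypothetical, `nodatasum` and `nodatacow` are the two options the v3 changelog says are disallowed, and the v1 space cache and swapfile entries are proposals from this thread rather than behaviour of the series. Fallocate is left out because its status is disputed above.

```python
from typing import Iterable, List

# Features that depend on NODATACOW or NODATASUM (names are illustrative).
AUTH_INCOMPATIBLE = {
    "nodatasum": "data would carry no checksum at all",
    "nodatacow": "NODATACOW data is not checksummed",
    "space_cache_v1": "cache file is NODATACOW with a fixed crc32c checksum",
    "swapfile": "swapfiles require NODATACOW",
}


def auth_conflicts(auth_enabled: bool, requested: Iterable[str]) -> List[str]:
    """Return the requested features that cannot coexist with authentication."""
    if not auth_enabled:
        return []
    return sorted(f for f in requested if f in AUTH_INCOMPATIBLE)
```

For example, `auth_conflicts(True, ["swapfile", "compress", "nodatacow"])` reports the two NODATACOW-dependent entries and lets the unrelated one through.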
On Tue, May 26, 2020 at 12:44:28PM +0000, Johannes Thumshirn wrote:
> On 26/05/2020 13:54, David Sterba wrote:
> > On Tue, May 26, 2020 at 07:50:53AM +0000, Johannes Thumshirn wrote:
> >> On 25/05/2020 15:11, David Sterba wrote:
> >>> I'd outright skip crc32c for the checksum so we have only small number
> >>> of authenticated checksums and avoid too many options, eg.
> >>> hmac-sha256-crc32c etc. The result will be still 2 authenticated hashes
> >>> with the added checksum hardcoded to xxhash.
> >>
> >> Hmm I'm really not a fan of this. We would have to use something like
> >> sha2-224 to get the room for the 2nd checksum. So we're using a weaker
> >> hash just so we can add a second checksum.
> >
> > The idea is to calculate full hash (32 bytes) and store only the part
> > (24 bytes). Yes this means there's some information loss and weakening,
> > but enables a usecase.
>
> I'm not enough a security expert to be able to judge this. Eric can I hear
> your opinion on this?

Given that this has implications for strength and the use cases, I'd
rather let the filesystem provide the options and let the user choose,
and not make the decision on their behalf. This would increase the number
of authenticated hashes to 4 in the end:

1. authenticated with 32byte/256bit hash (sha256, blake2b)
   + full strength
   - no way to verify checksums without the key

2. authenticated with 24bytes/192bit hash (sha256, blake2b) where the last
   8 bytes are xxhash64
   ~ weakened strength but should be still sufficient
   + possibility to verify checksums without the key
   - slight perf cost for the 2nd hash

As option 2 needs some evaluation and reasoning about whether it
compromises the security, I don't insist on having it implemented in the
first phase. I have prototype code for that, so it might live in
linux-next for some time before we'd merge it.

Regarding backward compatibility, the checksums are easy compared to
other features. The supported status can be determined directly from the
superblock, so adding new types of checksum does not require compat bits
and the code for that.
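The point about compat bits can be illustrated with a lookup keyed on the checksum type stored in the superblock. This is a sketch under assumptions, not kernel code: the table and helper names are hypothetical, values 0 to 3 are the existing on-disk checksum types as I understand them, and 4 is the value the v3 changelog assigns to HMAC(SHA256). The proposed 24+8 byte variants are not listed because the thread assigns them no numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CsumType:
    name: str
    size: int        # bytes of checksum stored per block
    needs_key: bool  # True for authenticated (keyed) checksums


CSUM_TYPES = {
    0: CsumType("crc32c", 4, False),
    1: CsumType("xxhash64", 8, False),
    2: CsumType("sha256", 32, False),
    3: CsumType("blake2b", 32, False),
    4: CsumType("hmac(sha256)", 32, True),  # added by this series
}


def lookup_csum_type(sb_csum_type: int) -> CsumType:
    # An unknown value is enough to refuse the mount; no separate
    # compat/incompat feature bit is needed to detect it.
    try:
        return CSUM_TYPES[sb_csum_type]
    except KeyError:
        raise ValueError(f"unsupported checksum type {sb_csum_type}") from None
```

An older implementation that only knows types 0 to 3 would reject type 4 through the same path, which is why a new checksum type needs no extra compatibility machinery.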
From: Johannes Thumshirn <johannes.thumshirn@wdc.com>

This series adds file-system authentication to BTRFS.

Unlike other verified file-system techniques like fs-verity the
authenticated version of BTRFS does not need extra meta-data on disk.

This works because in BTRFS every on-disk block has a checksum; for meta-data
the checksum is in the header of each meta-data item. For data blocks, a
separate checksum tree exists, which holds the checksums for each block.

Currently BTRFS supports CRC32C, XXHASH64, SHA256 and Blake2b for checksumming
these blocks. This series adds a new checksum algorithm, HMAC(SHA-256), which
does need an authentication key. When no key, or an incorrect authentication
key, is supplied, no valid checksum can be generated, and a read, fsck or scrub
operation would detect invalid or tampered blocks once the file-system is
mounted again with the correct key.

Getting the key inside the kernel is out of scope of this implementation; the
file-system driver assumes the key is already in the kernel's keyring at mount
time.

There was interest in also using keyed Blake2b from the community, but this
support is not yet included.

I have CCed Eric Biggers and Richard Weinberger in the submission, as they
previously have worked on filesystem authentication and I hope we can get
input from them as well.

Example usage:
Create a file-system with authentication key 0123456
mkfs.btrfs --csum "hmac(sha256)" --auth-key 0123456 /dev/disk

Add the key to the kernel's keyring as keyid 'btrfs:foo'
keyctl add logon btrfs:foo 0123456 @u

Mount the fs using the 'btrfs:foo' key
mount -t btrfs -o auth_key=btrfs:foo,auth_hash_name="hmac(sha256)" /dev/disk /mnt/point

Note, this is a re-base of the work I did when I was still at SUSE, hence the
S-o-b being my SUSE address, while the Author being with my WDC address (to
not generate bouncing mails).

Changes since v2:
- Select CONFIG_CRYPTO_HMAC and CONFIG_KEYS (kbuild robot)
- Fix double free in error path
- Fix memory leak in error path
- Disallow nodatasum and nodatacow when authentication is in use (Eric)
- Pass in authentication algorithm as mount option (Eric)
- Don't use the word "replay" in the documentation, as it is wrong and
  harmful in this context (Eric)
- Force key name to begin with 'btrfs:' (Eric)
- Use '4' as on-disk checksum type for HMAC(SHA256) to not have holes in the
  checksum types array.

Changes since v1:
- None, only rebased the series

Johannes Thumshirn (3):
  btrfs: rename btrfs_parse_device_options back to
    btrfs_parse_early_options
  btrfs: add authentication support
  btrfs: document btrfs authentication

 .../filesystems/btrfs-authentication.rst | 168 ++++++++++++++++++
 fs/btrfs/Kconfig                         |   2 +
 fs/btrfs/ctree.c                         |  22 ++-
 fs/btrfs/ctree.h                         |   5 +-
 fs/btrfs/disk-io.c                       |  71 +++++++-
 fs/btrfs/ioctl.c                         |   7 +-
 fs/btrfs/super.c                         |  65 ++++++-
 include/uapi/linux/btrfs_tree.h          |   1 +
 8 files changed, 326 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/filesystems/btrfs-authentication.rst
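The detection property described in the cover letter can be demonstrated in user space with Python's standard `hmac` module. This is only a sketch of the principle, not the kernel implementation: the helper names are hypothetical, the key is the example value from the usage section, and a single 4 KiB buffer stands in for an on-disk block.

```python
import hashlib
import hmac


def block_csum(key: bytes, block: bytes) -> bytes:
    # HMAC(SHA-256) over the block contents; 32 bytes, like plain SHA256.
    return hmac.new(key, block, hashlib.sha256).digest()


def block_is_valid(key: bytes, block: bytes, stored: bytes) -> bool:
    # Constant-time comparison of the recomputed and the stored checksum.
    return hmac.compare_digest(block_csum(key, block), stored)


key = b"0123456"                 # example key from the usage section
block = b"\x00" * 4096
stored = block_csum(key, block)  # what would be written alongside the block

tampered = b"\x01" + block[1:]
print(block_is_valid(key, block, stored))           # True
print(block_is_valid(key, tampered, stored))        # False
print(block_is_valid(b"wrong-key", block, stored))  # False
```

With an unkeyed checksum such as SHA256, anyone who can modify a block can also recompute a matching checksum; with the HMAC, producing a matching value requires the key, so a modified block fails verification once the file-system is mounted with the correct key.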