[RFC,0/5] Btrfs: add interface for writing compressed extent directly

Message ID: cover.1565900769.git.osandov@fb.com

Message

Omar Sandoval Aug. 15, 2019, 9:04 p.m. UTC
From: Omar Sandoval <osandov@fb.com>

Hello,

This series adds a way to write compressed data directly to Btrfs. The
intended use case is making send/receive on compressed file systems more
efficient; however, the interface is general enough that it could be
used in other scenarios. Patch 5 is the main change; see that for more
details.

Patches 1-3 are small fixes/cleanups that I ran into while implementing
this; they should go in regardless of the remainder of the series. Patch
4 exports a required VFS interface.

An example program and test case are available at [1].

To preemptively address a few concerns:

- Writing arbitrary, untrusted data which we feed to the decompression
  algorithm can be a security risk. For that reason, the ioctl is
  restricted to CAP_SYS_ADMIN. The Btrfs code is properly hardened
  against invalid compressed data/incorrect lengths, and the compression
  libraries are mature, but better safe than sorry for now.
- If the user is writing their own compressed data rather than just
  blindly feeding in something from btrfs send, they need to know some
  implementation details about the compression format. For zlib, there
  are no special requirements. For zstd, a non-default compression
  parameter must be used. For lzo, we have our own wrapper format since
  lzo doesn't have a standard wrapper format. It feels a little wrong to
  expose these details, but they are part of the on-disk format, so they
  must be stable regardless.
- The permissions checks duplicated from the VFS code are fairly
  minimal.
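
As an illustration of the first two points, here is a minimal userspace sketch in Python of what a caller might do before issuing the ioctl: check that CAP_SYS_ADMIN is in the effective capability set, and produce a plain zlib stream (the case described above as having no special requirements). The helper names are hypothetical. The ioctl itself, its request number, and its argument structure are defined in patch 5 and are not reproduced in this cover letter, so the sketch stops before the call. zstd and lzo are omitted because their required parameter and wrapper format are not spelled out here.

```python
import zlib

# Bit index of CAP_SYS_ADMIN in the Linux capability sets.
CAP_SYS_ADMIN = 21


def has_capability(cap_eff_hex: str, cap: int = CAP_SYS_ADMIN) -> bool:
    """Return True if bit `cap` is set in a CapEff-style hex mask."""
    return bool((int(cap_eff_hex, 16) >> cap) & 1)


def effective_caps_hex(status_path: str = "/proc/self/status") -> str:
    """Read the effective capability mask of the current process (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff line not found in " + status_path)


def zlib_payload(data: bytes, level: int = 6) -> bytes:
    """Compress `data` into a standard zlib stream (hypothetical helper)."""
    return zlib.compress(data, level)


if __name__ == "__main__":
    raw = b"hello btrfs " * 1024
    payload = zlib_payload(raw)
    print("privileged:", has_capability(effective_caps_hex()))
    print("uncompressed bytes:", len(raw), "compressed bytes:", len(payload))
    # A real caller would now pass `payload` and both lengths to the ioctl
    # added in patch 5; that interface is not shown in this cover letter.
```

Checking the capability up front is only a convenience; the kernel performs the authoritative check when the ioctl is issued.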

This series is based on misc-next.

This is an RFC, so please, comment away.

Thanks!

1: https://github.com/osandov/xfstests/tree/btrfs-compressed-write

Omar Sandoval (5):
  Btrfs: use correct count in btrfs_file_write_iter()
  Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
  Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree
  fs: export rw_verify_area()
  Btrfs: add ioctl for directly writing compressed data

 fs/btrfs/compression.c       |   6 +-
 fs/btrfs/compression.h       |  14 +--
 fs/btrfs/ctree.h             |  12 ++
 fs/btrfs/extent_io.c         |   6 +-
 fs/btrfs/file.c              |  22 ++--
 fs/btrfs/free-space-cache.c  |   9 +-
 fs/btrfs/inode.c             | 232 +++++++++++++++++++++++++++++++----
 fs/btrfs/ioctl.c             | 101 ++++++++++++++-
 fs/btrfs/tests/inode-tests.c |  12 +-
 fs/internal.h                |   5 -
 fs/read_write.c              |   1 +
 include/linux/fs.h           |   1 +
 include/uapi/linux/btrfs.h   |  63 ++++++++++
 13 files changed, 415 insertions(+), 69 deletions(-)

Comments

Omar Sandoval Aug. 15, 2019, 9:14 p.m. UTC | #1
I forgot to CC fsdevel. I'll do that for v2.

David Sterba Aug. 27, 2019, 6:31 p.m. UTC | #2
On Thu, Aug 15, 2019 at 02:04:01PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> Hello,
> 
> This series adds a way to write compressed data directly to Btrfs. The
> intended use case is making send/receive on compressed file systems more
> efficient; however, the interface is general enough that it could be
> used in other scenarios. Patch 5 is the main change; see that for more
> details.
> 
> Patches 1-3 are small fixes/cleanups that I ran into while implementing
> this; they should go in regardless of the remainder of the series.

1-3 added to misc-next, thanks. I haven't looked at the rest yet.