[0/5] block: loop: add file format subsystem and QCOW2 file format driver

Message ID 20190823225619.15530-1-development@manuel-bentele.de (mailing list archive)

Message

Manuel Bentele Aug. 23, 2019, 10:56 p.m. UTC
From: Manuel Bentele <development@manuel-bentele.de>

Hi

Following up on the discussion [1] on the mailing list, here is the result
of my work, as announced at the end of that discussion [2].

The discussion was about how to implement reading and writing of QCOW2 in
the kernel. The project focuses on a read-only in-kernel QCOW2
implementation to increase the read/write performance and tries to avoid
nbd. Furthermore, the project is part of a project series to develop an
in-kernel network boot infrastructure that no longer needs any user space
interaction (e.g. nbd).

During the discussion, it turned out that an implementation as a device
mapper target is not applicable. The device mapper stacks different
functionality such as compression or encryption on multiple block device
layers, whereas an implementation for the QCOW2 container format provides
these functionalities on one block device layer. Using FUSE is not an
option either, for performance reasons and because it requires user space
interaction.

Therefore, I propose extending the loop device module. I created a new file
format subsystem which is part of the loop device module. The file format
subsystem abstracts the direct file access and provides a driver API to
implement various disk file formats such as QCOW2, VDI and VMDK. File
format drivers are implemented as kernel modules and register themselves
with the file format subsystem.
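
To illustrate the idea of the paragraph above, here is a minimal user space
sketch of a format driver registry. It is not the API from the patches: all
names (fmt_type, fmt_ops, fmt_driver, fmt_register, fmt_unregister,
fmt_read, raw_driver) are hypothetical, and an in-memory buffer stands in
for the backing file. It only shows the general pattern of a driver handing
a table of callbacks to a subsystem that dispatches I/O by format type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical format identifiers; the real series defines its own. */
enum fmt_type { FMT_RAW = 0, FMT_QCOW = 1, FMT_MAX };

/* Callbacks a format driver supplies (illustrative subset). */
struct fmt_ops {
	/* Copy len bytes at device offset pos from the image into buf. */
	int (*read)(const uint8_t *image, size_t image_len,
		    uint64_t pos, uint8_t *buf, size_t len);
};

struct fmt_driver {
	const char *name;
	enum fmt_type type;
	const struct fmt_ops *ops;
};

/* One slot per format type; NULL means no driver is registered. */
static const struct fmt_driver *fmt_table[FMT_MAX];

/* Returns 0 on success, -1 if the driver is invalid or the slot is taken. */
int fmt_register(const struct fmt_driver *drv)
{
	if (!drv || !drv->ops || (unsigned int)drv->type >= FMT_MAX)
		return -1;
	if (fmt_table[drv->type])
		return -1;
	fmt_table[drv->type] = drv;
	return 0;
}

/* Clears the slot only if it is still owned by this driver. */
void fmt_unregister(const struct fmt_driver *drv)
{
	if (drv && (unsigned int)drv->type < FMT_MAX &&
	    fmt_table[drv->type] == drv)
		fmt_table[drv->type] = NULL;
}

/* Dispatch a read to whichever driver handles the given format. */
int fmt_read(enum fmt_type type, const uint8_t *image, size_t image_len,
	     uint64_t pos, uint8_t *buf, size_t len)
{
	if ((unsigned int)type >= FMT_MAX || !fmt_table[type])
		return -1;
	/* fmt_register() rejects drivers without an ops table. */
	assert(fmt_table[type]->ops != NULL);
	return fmt_table[type]->ops->read(image, image_len, pos, buf, len);
}

/* RAW: device offset equals file offset, so a read is a checked copy. */
static int raw_read(const uint8_t *image, size_t image_len,
		    uint64_t pos, uint8_t *buf, size_t len)
{
	if (pos > image_len || len > image_len - pos)
		return -1;
	memcpy(buf, image + pos, len);
	return 0;
}

static const struct fmt_ops raw_ops = { .read = raw_read };

const struct fmt_driver raw_driver = {
	.name = "raw", .type = FMT_RAW, .ops = &raw_ops,
};
```

In this sketch a caller would first call fmt_register(&raw_driver) and then
issue reads through fmt_read(FMT_RAW, ...); a QCOW2 driver would plug in the
same way with its own read callback.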

The patch series also contains documentation for the file format subsystem
and the loop device module. In addition, it provides a default RAW file
format driver and a read-only QCOW2 driver. The RAW file format driver is
based on the file-specific parts of the existing loop device implementation
and preserves the default behaviour of a loop device. More specific
information can be found in the commit logs of the following patches.
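
As a rough illustration of what a QCOW2 read path has to do that a RAW one
does not, the sketch below splits a guest offset into the indices used by
the two-level L1/L2 table lookup described in the published QCOW2 format
specification. It is not taken from the patches; the struct and function
names are hypothetical, and table loading, caching, compression and
encryption are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: where a guest offset lands in the tables. */
struct qcow2_index {
	uint64_t l1_index;          /* entry in the L1 table */
	uint64_t l2_index;          /* entry in the selected L2 table */
	uint64_t offset_in_cluster; /* byte offset inside the data cluster */
};

/*
 * An L2 table occupies one cluster and holds 8-byte entries, so it has
 * 2^(cluster_bits - 3) entries.
 */
struct qcow2_index qcow2_split_offset(uint64_t guest_offset,
				      unsigned int cluster_bits)
{
	/* 512 B .. 2 MiB clusters, the range QEMU accepts. */
	assert(cluster_bits >= 9 && cluster_bits <= 21);

	unsigned int l2_bits = cluster_bits - 3;
	uint64_t cluster = guest_offset >> cluster_bits;
	struct qcow2_index idx = {
		.l1_index = cluster >> l2_bits,
		.l2_index = cluster & ((UINT64_C(1) << l2_bits) - 1),
		.offset_in_cluster =
			guest_offset & ((UINT64_C(1) << cluster_bits) - 1),
	};
	return idx;
}
```

With 64 KiB clusters (cluster_bits = 16), each L2 table has 8192 entries, so
one L1 entry covers 512 MiB of guest address space. A RAW image needs none
of this because the device offset maps directly to the file offset.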

Regards,
Manuel

[1] https://www.spinics.net/lists/linux-block/msg39538.html
[2] https://www.spinics.net/lists/linux-block/msg40479.html

Manuel Bentele (5):
  block: loop: add file format subsystem for loop devices
  doc: admin-guide: add loop block device documentation
  doc: driver-api: add loop file format subsystem API documentation
  block: loop: add QCOW2 loop file format driver (read-only)
  doc: admin-guide: add QCOW2 file format to loop device documentation

 Documentation/admin-guide/blockdev/index.rst  |   1 +
 Documentation/admin-guide/blockdev/loop.rst   |  85 ++
 Documentation/driver-api/index.rst            |   1 +
 Documentation/driver-api/loop-file-fmt.rst    | 137 +++
 arch/alpha/configs/defconfig                  |   1 +
 arch/arc/configs/axs103_defconfig             |   1 +
 arch/arc/configs/axs103_smp_defconfig         |   1 +
 arch/arm/configs/am200epdkit_defconfig        |   1 +
 arch/arm/configs/aspeed_g4_defconfig          |   1 +
 arch/arm/configs/aspeed_g5_defconfig          |   1 +
 arch/arm/configs/assabet_defconfig            |   1 +
 arch/arm/configs/at91_dt_defconfig            |   1 +
 arch/arm/configs/axm55xx_defconfig            |   1 +
 arch/arm/configs/badge4_defconfig             |   1 +
 arch/arm/configs/cerfcube_defconfig           |   1 +
 arch/arm/configs/cm_x2xx_defconfig            |   1 +
 arch/arm/configs/cm_x300_defconfig            |   1 +
 arch/arm/configs/cns3420vb_defconfig          |   1 +
 arch/arm/configs/colibri_pxa270_defconfig     |   1 +
 arch/arm/configs/collie_defconfig             |   1 +
 arch/arm/configs/corgi_defconfig              |   1 +
 arch/arm/configs/davinci_all_defconfig        |   1 +
 arch/arm/configs/dove_defconfig               |   1 +
 arch/arm/configs/em_x270_defconfig            |   1 +
 arch/arm/configs/eseries_pxa_defconfig        |   1 +
 arch/arm/configs/exynos_defconfig             |   1 +
 arch/arm/configs/ezx_defconfig                |   1 +
 arch/arm/configs/footbridge_defconfig         |   1 +
 arch/arm/configs/h3600_defconfig              |   1 +
 arch/arm/configs/imote2_defconfig             |   1 +
 arch/arm/configs/imx_v6_v7_defconfig          |   1 +
 arch/arm/configs/integrator_defconfig         |   1 +
 arch/arm/configs/iop32x_defconfig             |   1 +
 arch/arm/configs/ixp4xx_defconfig             |   1 +
 arch/arm/configs/jornada720_defconfig         |   1 +
 arch/arm/configs/keystone_defconfig           |   1 +
 arch/arm/configs/lpc32xx_defconfig            |   1 +
 arch/arm/configs/milbeaut_m10v_defconfig      |   1 +
 arch/arm/configs/mini2440_defconfig           |   1 +
 arch/arm/configs/multi_v5_defconfig           |   1 +
 arch/arm/configs/multi_v7_defconfig           |   1 +
 arch/arm/configs/mv78xx0_defconfig            |   1 +
 arch/arm/configs/mvebu_v5_defconfig           |   1 +
 arch/arm/configs/netwinder_defconfig          |   1 +
 arch/arm/configs/nhk8815_defconfig            |   1 +
 arch/arm/configs/omap1_defconfig              |   1 +
 arch/arm/configs/omap2plus_defconfig          |   1 +
 arch/arm/configs/orion5x_defconfig            |   1 +
 arch/arm/configs/oxnas_v6_defconfig           |   1 +
 arch/arm/configs/palmz72_defconfig            |   1 +
 arch/arm/configs/pleb_defconfig               |   1 +
 arch/arm/configs/prima2_defconfig             |   1 +
 arch/arm/configs/pxa3xx_defconfig             |   1 +
 arch/arm/configs/pxa_defconfig                |   1 +
 arch/arm/configs/qcom_defconfig               |   1 +
 arch/arm/configs/rpc_defconfig                |   1 +
 arch/arm/configs/s3c2410_defconfig            |   1 +
 arch/arm/configs/s3c6400_defconfig            |   1 +
 arch/arm/configs/s5pv210_defconfig            |   1 +
 arch/arm/configs/sama5_defconfig              |   1 +
 arch/arm/configs/simpad_defconfig             |   1 +
 arch/arm/configs/socfpga_defconfig            |   1 +
 arch/arm/configs/spitz_defconfig              |   1 +
 arch/arm/configs/tango4_defconfig             |   1 +
 arch/arm/configs/tegra_defconfig              |   1 +
 arch/arm/configs/trizeps4_defconfig           |   1 +
 arch/arm/configs/viper_defconfig              |   1 +
 arch/arm/configs/zeus_defconfig               |   1 +
 arch/arm/configs/zx_defconfig                 |   1 +
 arch/arm64/configs/defconfig                  |   1 +
 arch/c6x/configs/dsk6455_defconfig            |   1 +
 arch/c6x/configs/evmc6457_defconfig           |   1 +
 arch/c6x/configs/evmc6472_defconfig           |   1 +
 arch/c6x/configs/evmc6474_defconfig           |   1 +
 arch/c6x/configs/evmc6678_defconfig           |   1 +
 arch/csky/configs/defconfig                   |   1 +
 arch/hexagon/configs/comet_defconfig          |   1 +
 arch/ia64/configs/bigsur_defconfig            |   1 +
 arch/ia64/configs/generic_defconfig           |   1 +
 arch/ia64/configs/gensparse_defconfig         |   1 +
 arch/ia64/configs/tiger_defconfig             |   1 +
 arch/ia64/configs/zx1_defconfig               |   1 +
 arch/m68k/configs/amiga_defconfig             |   1 +
 arch/m68k/configs/apollo_defconfig            |   1 +
 arch/m68k/configs/atari_defconfig             |   1 +
 arch/m68k/configs/bvme6000_defconfig          |   1 +
 arch/m68k/configs/hp300_defconfig             |   1 +
 arch/m68k/configs/mac_defconfig               |   1 +
 arch/m68k/configs/multi_defconfig             |   1 +
 arch/m68k/configs/mvme147_defconfig           |   1 +
 arch/m68k/configs/mvme16x_defconfig           |   1 +
 arch/m68k/configs/q40_defconfig               |   1 +
 arch/m68k/configs/sun3_defconfig              |   1 +
 arch/m68k/configs/sun3x_defconfig             |   1 +
 arch/mips/configs/bigsur_defconfig            |   1 +
 arch/mips/configs/cavium_octeon_defconfig     |   1 +
 arch/mips/configs/cobalt_defconfig            |   1 +
 arch/mips/configs/decstation_64_defconfig     |   1 +
 arch/mips/configs/decstation_defconfig        |   1 +
 arch/mips/configs/decstation_r4k_defconfig    |   1 +
 arch/mips/configs/fuloong2e_defconfig         |   1 +
 arch/mips/configs/generic/board-ocelot.config |   1 +
 arch/mips/configs/gpr_defconfig               |   1 +
 arch/mips/configs/ip27_defconfig              |   1 +
 arch/mips/configs/ip32_defconfig              |   1 +
 arch/mips/configs/jazz_defconfig              |   1 +
 arch/mips/configs/lemote2f_defconfig          |   1 +
 arch/mips/configs/loongson1b_defconfig        |   1 +
 arch/mips/configs/loongson1c_defconfig        |   1 +
 arch/mips/configs/loongson3_defconfig         |   1 +
 arch/mips/configs/malta_defconfig             |   1 +
 arch/mips/configs/malta_kvm_defconfig         |   1 +
 arch/mips/configs/malta_kvm_guest_defconfig   |   1 +
 arch/mips/configs/malta_qemu_32r6_defconfig   |   1 +
 arch/mips/configs/maltaaprp_defconfig         |   1 +
 arch/mips/configs/maltasmvp_defconfig         |   1 +
 arch/mips/configs/maltasmvp_eva_defconfig     |   1 +
 arch/mips/configs/maltaup_defconfig           |   1 +
 arch/mips/configs/maltaup_xpa_defconfig       |   1 +
 arch/mips/configs/markeins_defconfig          |   1 +
 arch/mips/configs/mips_paravirt_defconfig     |   1 +
 arch/mips/configs/nlm_xlp_defconfig           |   1 +
 arch/mips/configs/nlm_xlr_defconfig           |   1 +
 arch/mips/configs/pic32mzda_defconfig         |   1 +
 arch/mips/configs/pistachio_defconfig         |   1 +
 arch/mips/configs/pnx8335_stb225_defconfig    |   1 +
 arch/mips/configs/rbtx49xx_defconfig          |   1 +
 arch/mips/configs/rm200_defconfig             |   1 +
 arch/mips/configs/tb0219_defconfig            |   1 +
 arch/mips/configs/tb0226_defconfig            |   1 +
 arch/mips/configs/tb0287_defconfig            |   1 +
 arch/nios2/configs/10m50_defconfig            |   1 +
 arch/nios2/configs/3c120_defconfig            |   1 +
 arch/parisc/configs/712_defconfig             |   1 +
 arch/parisc/configs/a500_defconfig            |   1 +
 arch/parisc/configs/b180_defconfig            |   1 +
 arch/parisc/configs/c3000_defconfig           |   1 +
 arch/parisc/configs/c8000_defconfig           |   1 +
 arch/parisc/configs/defconfig                 |   1 +
 arch/parisc/configs/generic-32bit_defconfig   |   1 +
 arch/parisc/configs/generic-64bit_defconfig   |   1 +
 arch/powerpc/configs/40x/virtex_defconfig     |   1 +
 arch/powerpc/configs/44x/sam440ep_defconfig   |   1 +
 arch/powerpc/configs/44x/virtex5_defconfig    |   1 +
 arch/powerpc/configs/52xx/cm5200_defconfig    |   1 +
 arch/powerpc/configs/52xx/lite5200b_defconfig |   1 +
 arch/powerpc/configs/52xx/motionpro_defconfig |   1 +
 arch/powerpc/configs/52xx/tqm5200_defconfig   |   1 +
 arch/powerpc/configs/83xx/asp8347_defconfig   |   1 +
 .../configs/83xx/mpc8313_rdb_defconfig        |   1 +
 .../configs/83xx/mpc8315_rdb_defconfig        |   1 +
 .../configs/83xx/mpc832x_mds_defconfig        |   1 +
 .../configs/83xx/mpc832x_rdb_defconfig        |   1 +
 .../configs/83xx/mpc834x_itx_defconfig        |   1 +
 .../configs/83xx/mpc834x_itxgp_defconfig      |   1 +
 .../configs/83xx/mpc834x_mds_defconfig        |   1 +
 .../configs/83xx/mpc836x_mds_defconfig        |   1 +
 .../configs/83xx/mpc836x_rdk_defconfig        |   1 +
 .../configs/83xx/mpc837x_mds_defconfig        |   1 +
 .../configs/83xx/mpc837x_rdb_defconfig        |   1 +
 arch/powerpc/configs/85xx/ge_imp3a_defconfig  |   1 +
 arch/powerpc/configs/85xx/ksi8560_defconfig   |   1 +
 .../configs/85xx/mpc8540_ads_defconfig        |   1 +
 .../configs/85xx/mpc8560_ads_defconfig        |   1 +
 .../configs/85xx/mpc85xx_cds_defconfig        |   1 +
 arch/powerpc/configs/85xx/sbc8548_defconfig   |   1 +
 arch/powerpc/configs/85xx/socrates_defconfig  |   1 +
 arch/powerpc/configs/85xx/stx_gp3_defconfig   |   1 +
 arch/powerpc/configs/85xx/tqm8540_defconfig   |   1 +
 arch/powerpc/configs/85xx/tqm8541_defconfig   |   1 +
 arch/powerpc/configs/85xx/tqm8548_defconfig   |   1 +
 arch/powerpc/configs/85xx/tqm8555_defconfig   |   1 +
 arch/powerpc/configs/85xx/tqm8560_defconfig   |   1 +
 .../configs/85xx/xes_mpc85xx_defconfig        |   1 +
 arch/powerpc/configs/amigaone_defconfig       |   1 +
 arch/powerpc/configs/cell_defconfig           |   1 +
 arch/powerpc/configs/chrp32_defconfig         |   1 +
 arch/powerpc/configs/ep8248e_defconfig        |   1 +
 arch/powerpc/configs/fsl-emb-nonhw.config     |   1 +
 arch/powerpc/configs/g5_defconfig             |   1 +
 arch/powerpc/configs/gamecube_defconfig       |   1 +
 arch/powerpc/configs/holly_defconfig          |   1 +
 arch/powerpc/configs/linkstation_defconfig    |   1 +
 arch/powerpc/configs/mgcoge_defconfig         |   1 +
 arch/powerpc/configs/mpc5200_defconfig        |   1 +
 arch/powerpc/configs/mpc7448_hpc2_defconfig   |   1 +
 arch/powerpc/configs/mpc8272_ads_defconfig    |   1 +
 arch/powerpc/configs/mpc83xx_defconfig        |   1 +
 arch/powerpc/configs/mpc866_ads_defconfig     |   1 +
 arch/powerpc/configs/mvme5100_defconfig       |   1 +
 arch/powerpc/configs/pasemi_defconfig         |   1 +
 arch/powerpc/configs/pmac32_defconfig         |   1 +
 arch/powerpc/configs/powernv_defconfig        |   1 +
 arch/powerpc/configs/ppc64_defconfig          |   1 +
 arch/powerpc/configs/ppc64e_defconfig         |   1 +
 arch/powerpc/configs/ppc6xx_defconfig         |   1 +
 arch/powerpc/configs/pq2fads_defconfig        |   1 +
 arch/powerpc/configs/ps3_defconfig            |   1 +
 arch/powerpc/configs/pseries_defconfig        |   1 +
 arch/powerpc/configs/skiroot_defconfig        |   1 +
 arch/powerpc/configs/wii_defconfig            |   1 +
 arch/riscv/configs/defconfig                  |   1 +
 arch/riscv/configs/rv32_defconfig             |   1 +
 arch/s390/configs/debug_defconfig             |   1 +
 arch/s390/configs/defconfig                   |   1 +
 arch/sh/configs/cayman_defconfig              |   1 +
 arch/sh/configs/landisk_defconfig             |   1 +
 arch/sh/configs/lboxre2_defconfig             |   1 +
 arch/sh/configs/rsk7264_defconfig             |   1 +
 arch/sh/configs/sdk7780_defconfig             |   1 +
 arch/sh/configs/sdk7786_defconfig             |   1 +
 arch/sh/configs/se7206_defconfig              |   1 +
 arch/sh/configs/se7780_defconfig              |   1 +
 arch/sh/configs/sh03_defconfig                |   1 +
 arch/sh/configs/sh2007_defconfig              |   1 +
 arch/sh/configs/sh7785lcr_32bit_defconfig     |   1 +
 arch/sh/configs/shmin_defconfig               |   1 +
 arch/sh/configs/titan_defconfig               |   1 +
 arch/sparc/configs/sparc32_defconfig          |   1 +
 arch/sparc/configs/sparc64_defconfig          |   1 +
 arch/um/configs/i386_defconfig                |   1 +
 arch/um/configs/x86_64_defconfig              |   1 +
 arch/unicore32/configs/defconfig              |   1 +
 arch/x86/configs/i386_defconfig               |   1 +
 arch/x86/configs/x86_64_defconfig             |   1 +
 arch/xtensa/configs/audio_kc705_defconfig     |   1 +
 arch/xtensa/configs/cadence_csp_defconfig     |   1 +
 arch/xtensa/configs/generic_kc705_defconfig   |   1 +
 arch/xtensa/configs/nommu_kc705_defconfig     |   1 +
 arch/xtensa/configs/smp_lx200_defconfig       |   1 +
 arch/xtensa/configs/virt_defconfig            |   1 +
 drivers/block/Kconfig                         |  73 +-
 drivers/block/Makefile                        |   4 +-
 drivers/block/loop/Kconfig                    |  93 ++
 drivers/block/loop/Makefile                   |  13 +
 drivers/block/{ => loop}/cryptoloop.c         |   2 +-
 drivers/block/loop/loop_file_fmt.c            | 328 ++++++
 drivers/block/loop/loop_file_fmt.h            | 351 +++++++
 drivers/block/loop/loop_file_fmt_qcow_cache.c | 218 ++++
 drivers/block/loop/loop_file_fmt_qcow_cache.h |  51 +
 .../block/loop/loop_file_fmt_qcow_cluster.c   | 270 +++++
 .../block/loop/loop_file_fmt_qcow_cluster.h   |  23 +
 drivers/block/loop/loop_file_fmt_qcow_main.c  | 945 ++++++++++++++++++
 drivers/block/loop/loop_file_fmt_qcow_main.h  | 417 ++++++++
 drivers/block/loop/loop_file_fmt_raw.c        | 449 +++++++++
 drivers/block/{loop.c => loop/loop_main.c}    | 567 ++++-------
 drivers/block/{loop.h => loop/loop_main.h}    |  14 +-
 include/uapi/linux/loop.h                     |  14 +-
 248 files changed, 3861 insertions(+), 422 deletions(-)
 create mode 100644 Documentation/admin-guide/blockdev/loop.rst
 create mode 100644 Documentation/driver-api/loop-file-fmt.rst
 create mode 100644 drivers/block/loop/Kconfig
 create mode 100644 drivers/block/loop/Makefile
 rename drivers/block/{ => loop}/cryptoloop.c (99%)
 create mode 100644 drivers/block/loop/loop_file_fmt.c
 create mode 100644 drivers/block/loop/loop_file_fmt.h
 create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cache.c
 create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cache.h
 create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cluster.c
 create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cluster.h
 create mode 100644 drivers/block/loop/loop_file_fmt_qcow_main.c
 create mode 100644 drivers/block/loop/loop_file_fmt_qcow_main.h
 create mode 100644 drivers/block/loop/loop_file_fmt_raw.c
 rename drivers/block/{loop.c => loop/loop_main.c} (86%)
 rename drivers/block/{loop.h => loop/loop_main.h} (92%)

Comments

Bart Van Assche Aug. 24, 2019, 3:37 a.m. UTC | #1
On 8/23/19 3:56 PM, development@manuel-bentele.de wrote:
> During the discussion, it turned out that the implementation as device
> mapper target is not applicable. The device mapper stacks different
> functionality such as compression or encryption on multiple block device
> layers whereas an implementation for the QCOW2 container format provides
> these functionalities on one block device layer.

Hi Manuel,

Is there a more detailed discussion available of this subject? Are you 
familiar with the dm-crypt driver?

Thanks,

Bart.
Manuel Bentele Aug. 24, 2019, 9:14 a.m. UTC | #2
Hi Bart

Thanks for your quick reply.

On 8/24/19 5:37 AM, Bart Van Assche wrote:
> On 8/23/19 3:56 PM, development@manuel-bentele.de wrote:
>> During the discussion, it turned out that the implementation as device
>> mapper target is not applicable. The device mapper stacks different
>> functionality such as compression or encryption on multiple block device
>> layers whereas an implementation for the QCOW2 container format provides
>> these functionalities on one block device layer.
>
> Hi Manuel,
>
> Is there a more detailed discussion available of this subject?
No, the only discussion is the referenced one [1]. But there was a
similar discussion in the master's thesis of Francesc Zacarias Ribot
[2]. Unfortunately, I found no attempt on the mailing list that proposes
his solution.

> Are you familiar with the dm-crypt driver?
I don't know the specific implementation details, but I use this driver
personally and I like it. Do you want to propose that only the storage
aspect of the QCOW2 container format should be used and all other
functionality inside the container should be provided by available
device mapper targets?

> [...]

Regards,
Manuel

[1] https://www.spinics.net/lists/linux-block/msg39538.html
[2] Francesc Zacarias Ribot: QLOOP Linux driver to mount QCOW2 virtual
disks; June 23, 2010;
https://upcommons.upc.edu/bitstream/handle/2099.1/9619/65757.pdf
Manuel Bentele Aug. 24, 2019, 11:10 a.m. UTC | #3
Hi

I realized that the first patch of my patch series is missing, although 
I successfully sent all of them to the block mailing list. In addition, 
I checked my mail server's log, and I personally received a copy of my 
patches. Also, I have not received any undelivered mail message from the 
mailing list server. So everything seems fine on my side.

In preparation for submitting the patch series, I checked the size of 
each patch. I can confirm that all patches are smaller than the 300kB 
size limit stated in the documentation [1]. Did I do something wrong?

Regards,
Manuel

[1] 
https://www.kernel.org/doc/html/v5.3-rc5/process/submitting-patches.html#e-mail-size

On 8/24/19 12:56 AM, development@manuel-bentele.de wrote:
> From: Manuel Bentele <development@manuel-bentele.de>
>
> Hi
>
> Regarding to the following discussion [1] on the mailing list I show you
> the result of my work as announced at the end of the discussion [2].
>
> The discussion was about the project topic of how to implement the
> reading/writing of QCOW2 in the kernel. The project focuses on an read-only
> in-kernel QCOW2 implementation to increase the read/write performance
> and tries to avoid nbd. Furthermore, the project is part of a project
> series to develop a in-kernel network boot infrastructure that has no need
> for any user space interaction (e.g. nbd) anymore.
>
> During the discussion, it turned out that the implementation as device
> mapper target is not applicable. The device mapper stacks different
> functionality such as compression or encryption on multiple block device
> layers whereas an implementation for the QCOW2 container format provides
> these functionalities on one block device layer. Using FUSE is also not
> possible due to performance reasons and user space interaction.
>
> Therefore, I propose the extension of the loop device module. I created a
> new file format subsystem which is part of the loop device module. The file
> format subsystem abstracts the direct file access and provides an driver
> API to implement various disk file formats such as QCOW2, VDI and VMDK.
> File format drivers are implemented as kernel modules and can be registered
> by the file format subsystem.
>
> The patch series contains documentation for the file format subsystem and
> the loop device module, too. Also, it provides a default RAW file format
> driver and a read-only QCOW2 driver. The RAW file format driver is based on
> the file specific parts of the existing loop device implementation and
> preserves the default behaviour of a loop device. More specific information
> can be found in the commit logs of the following patches.
>
> Regards,
> Manuel
>
> [1] https://www.spinics.net/lists/linux-block/msg39538.html
> [2] https://www.spinics.net/lists/linux-block/msg40479.html
>
> Manuel Bentele (5):
>    block: loop: add file format subsystem for loop devices
>    doc: admin-guide: add loop block device documentation
>    doc: driver-api: add loop file format subsystem API documentation
>    block: loop: add QCOW2 loop file format driver (read-only)
>    doc: admin-guide: add QCOW2 file format to loop device documentation
>
>   Documentation/admin-guide/blockdev/index.rst  |   1 +
>   Documentation/admin-guide/blockdev/loop.rst   |  85 ++
>   Documentation/driver-api/index.rst            |   1 +
>   Documentation/driver-api/loop-file-fmt.rst    | 137 +++
>   arch/alpha/configs/defconfig                  |   1 +
>   arch/arc/configs/axs103_defconfig             |   1 +
>   arch/arc/configs/axs103_smp_defconfig         |   1 +
>   arch/arm/configs/am200epdkit_defconfig        |   1 +
>   arch/arm/configs/aspeed_g4_defconfig          |   1 +
>   arch/arm/configs/aspeed_g5_defconfig          |   1 +
>   arch/arm/configs/assabet_defconfig            |   1 +
>   arch/arm/configs/at91_dt_defconfig            |   1 +
>   arch/arm/configs/axm55xx_defconfig            |   1 +
>   arch/arm/configs/badge4_defconfig             |   1 +
>   arch/arm/configs/cerfcube_defconfig           |   1 +
>   arch/arm/configs/cm_x2xx_defconfig            |   1 +
>   arch/arm/configs/cm_x300_defconfig            |   1 +
>   arch/arm/configs/cns3420vb_defconfig          |   1 +
>   arch/arm/configs/colibri_pxa270_defconfig     |   1 +
>   arch/arm/configs/collie_defconfig             |   1 +
>   arch/arm/configs/corgi_defconfig              |   1 +
>   arch/arm/configs/davinci_all_defconfig        |   1 +
>   arch/arm/configs/dove_defconfig               |   1 +
>   arch/arm/configs/em_x270_defconfig            |   1 +
>   arch/arm/configs/eseries_pxa_defconfig        |   1 +
>   arch/arm/configs/exynos_defconfig             |   1 +
>   arch/arm/configs/ezx_defconfig                |   1 +
>   arch/arm/configs/footbridge_defconfig         |   1 +
>   arch/arm/configs/h3600_defconfig              |   1 +
>   arch/arm/configs/imote2_defconfig             |   1 +
>   arch/arm/configs/imx_v6_v7_defconfig          |   1 +
>   arch/arm/configs/integrator_defconfig         |   1 +
>   arch/arm/configs/iop32x_defconfig             |   1 +
>   arch/arm/configs/ixp4xx_defconfig             |   1 +
>   arch/arm/configs/jornada720_defconfig         |   1 +
>   arch/arm/configs/keystone_defconfig           |   1 +
>   arch/arm/configs/lpc32xx_defconfig            |   1 +
>   arch/arm/configs/milbeaut_m10v_defconfig      |   1 +
>   arch/arm/configs/mini2440_defconfig           |   1 +
>   arch/arm/configs/multi_v5_defconfig           |   1 +
>   arch/arm/configs/multi_v7_defconfig           |   1 +
>   arch/arm/configs/mv78xx0_defconfig            |   1 +
>   arch/arm/configs/mvebu_v5_defconfig           |   1 +
>   arch/arm/configs/netwinder_defconfig          |   1 +
>   arch/arm/configs/nhk8815_defconfig            |   1 +
>   arch/arm/configs/omap1_defconfig              |   1 +
>   arch/arm/configs/omap2plus_defconfig          |   1 +
>   arch/arm/configs/orion5x_defconfig            |   1 +
>   arch/arm/configs/oxnas_v6_defconfig           |   1 +
>   arch/arm/configs/palmz72_defconfig            |   1 +
>   arch/arm/configs/pleb_defconfig               |   1 +
>   arch/arm/configs/prima2_defconfig             |   1 +
>   arch/arm/configs/pxa3xx_defconfig             |   1 +
>   arch/arm/configs/pxa_defconfig                |   1 +
>   arch/arm/configs/qcom_defconfig               |   1 +
>   arch/arm/configs/rpc_defconfig                |   1 +
>   arch/arm/configs/s3c2410_defconfig            |   1 +
>   arch/arm/configs/s3c6400_defconfig            |   1 +
>   arch/arm/configs/s5pv210_defconfig            |   1 +
>   arch/arm/configs/sama5_defconfig              |   1 +
>   arch/arm/configs/simpad_defconfig             |   1 +
>   arch/arm/configs/socfpga_defconfig            |   1 +
>   arch/arm/configs/spitz_defconfig              |   1 +
>   arch/arm/configs/tango4_defconfig             |   1 +
>   arch/arm/configs/tegra_defconfig              |   1 +
>   arch/arm/configs/trizeps4_defconfig           |   1 +
>   arch/arm/configs/viper_defconfig              |   1 +
>   arch/arm/configs/zeus_defconfig               |   1 +
>   arch/arm/configs/zx_defconfig                 |   1 +
>   arch/arm64/configs/defconfig                  |   1 +
>   arch/c6x/configs/dsk6455_defconfig            |   1 +
>   arch/c6x/configs/evmc6457_defconfig           |   1 +
>   arch/c6x/configs/evmc6472_defconfig           |   1 +
>   arch/c6x/configs/evmc6474_defconfig           |   1 +
>   arch/c6x/configs/evmc6678_defconfig           |   1 +
>   arch/csky/configs/defconfig                   |   1 +
>   arch/hexagon/configs/comet_defconfig          |   1 +
>   arch/ia64/configs/bigsur_defconfig            |   1 +
>   arch/ia64/configs/generic_defconfig           |   1 +
>   arch/ia64/configs/gensparse_defconfig         |   1 +
>   arch/ia64/configs/tiger_defconfig             |   1 +
>   arch/ia64/configs/zx1_defconfig               |   1 +
>   arch/m68k/configs/amiga_defconfig             |   1 +
>   arch/m68k/configs/apollo_defconfig            |   1 +
>   arch/m68k/configs/atari_defconfig             |   1 +
>   arch/m68k/configs/bvme6000_defconfig          |   1 +
>   arch/m68k/configs/hp300_defconfig             |   1 +
>   arch/m68k/configs/mac_defconfig               |   1 +
>   arch/m68k/configs/multi_defconfig             |   1 +
>   arch/m68k/configs/mvme147_defconfig           |   1 +
>   arch/m68k/configs/mvme16x_defconfig           |   1 +
>   arch/m68k/configs/q40_defconfig               |   1 +
>   arch/m68k/configs/sun3_defconfig              |   1 +
>   arch/m68k/configs/sun3x_defconfig             |   1 +
>   arch/mips/configs/bigsur_defconfig            |   1 +
>   arch/mips/configs/cavium_octeon_defconfig     |   1 +
>   arch/mips/configs/cobalt_defconfig            |   1 +
>   arch/mips/configs/decstation_64_defconfig     |   1 +
>   arch/mips/configs/decstation_defconfig        |   1 +
>   arch/mips/configs/decstation_r4k_defconfig    |   1 +
>   arch/mips/configs/fuloong2e_defconfig         |   1 +
>   arch/mips/configs/generic/board-ocelot.config |   1 +
>   arch/mips/configs/gpr_defconfig               |   1 +
>   arch/mips/configs/ip27_defconfig              |   1 +
>   arch/mips/configs/ip32_defconfig              |   1 +
>   arch/mips/configs/jazz_defconfig              |   1 +
>   arch/mips/configs/lemote2f_defconfig          |   1 +
>   arch/mips/configs/loongson1b_defconfig        |   1 +
>   arch/mips/configs/loongson1c_defconfig        |   1 +
>   arch/mips/configs/loongson3_defconfig         |   1 +
>   arch/mips/configs/malta_defconfig             |   1 +
>   arch/mips/configs/malta_kvm_defconfig         |   1 +
>   arch/mips/configs/malta_kvm_guest_defconfig   |   1 +
>   arch/mips/configs/malta_qemu_32r6_defconfig   |   1 +
>   arch/mips/configs/maltaaprp_defconfig         |   1 +
>   arch/mips/configs/maltasmvp_defconfig         |   1 +
>   arch/mips/configs/maltasmvp_eva_defconfig     |   1 +
>   arch/mips/configs/maltaup_defconfig           |   1 +
>   arch/mips/configs/maltaup_xpa_defconfig       |   1 +
>   arch/mips/configs/markeins_defconfig          |   1 +
>   arch/mips/configs/mips_paravirt_defconfig     |   1 +
>   arch/mips/configs/nlm_xlp_defconfig           |   1 +
>   arch/mips/configs/nlm_xlr_defconfig           |   1 +
>   arch/mips/configs/pic32mzda_defconfig         |   1 +
>   arch/mips/configs/pistachio_defconfig         |   1 +
>   arch/mips/configs/pnx8335_stb225_defconfig    |   1 +
>   arch/mips/configs/rbtx49xx_defconfig          |   1 +
>   arch/mips/configs/rm200_defconfig             |   1 +
>   arch/mips/configs/tb0219_defconfig            |   1 +
>   arch/mips/configs/tb0226_defconfig            |   1 +
>   arch/mips/configs/tb0287_defconfig            |   1 +
>   arch/nios2/configs/10m50_defconfig            |   1 +
>   arch/nios2/configs/3c120_defconfig            |   1 +
>   arch/parisc/configs/712_defconfig             |   1 +
>   arch/parisc/configs/a500_defconfig            |   1 +
>   arch/parisc/configs/b180_defconfig            |   1 +
>   arch/parisc/configs/c3000_defconfig           |   1 +
>   arch/parisc/configs/c8000_defconfig           |   1 +
>   arch/parisc/configs/defconfig                 |   1 +
>   arch/parisc/configs/generic-32bit_defconfig   |   1 +
>   arch/parisc/configs/generic-64bit_defconfig   |   1 +
>   arch/powerpc/configs/40x/virtex_defconfig     |   1 +
>   arch/powerpc/configs/44x/sam440ep_defconfig   |   1 +
>   arch/powerpc/configs/44x/virtex5_defconfig    |   1 +
>   arch/powerpc/configs/52xx/cm5200_defconfig    |   1 +
>   arch/powerpc/configs/52xx/lite5200b_defconfig |   1 +
>   arch/powerpc/configs/52xx/motionpro_defconfig |   1 +
>   arch/powerpc/configs/52xx/tqm5200_defconfig   |   1 +
>   arch/powerpc/configs/83xx/asp8347_defconfig   |   1 +
>   .../configs/83xx/mpc8313_rdb_defconfig        |   1 +
>   .../configs/83xx/mpc8315_rdb_defconfig        |   1 +
>   .../configs/83xx/mpc832x_mds_defconfig        |   1 +
>   .../configs/83xx/mpc832x_rdb_defconfig        |   1 +
>   .../configs/83xx/mpc834x_itx_defconfig        |   1 +
>   .../configs/83xx/mpc834x_itxgp_defconfig      |   1 +
>   .../configs/83xx/mpc834x_mds_defconfig        |   1 +
>   .../configs/83xx/mpc836x_mds_defconfig        |   1 +
>   .../configs/83xx/mpc836x_rdk_defconfig        |   1 +
>   .../configs/83xx/mpc837x_mds_defconfig        |   1 +
>   .../configs/83xx/mpc837x_rdb_defconfig        |   1 +
>   arch/powerpc/configs/85xx/ge_imp3a_defconfig  |   1 +
>   arch/powerpc/configs/85xx/ksi8560_defconfig   |   1 +
>   .../configs/85xx/mpc8540_ads_defconfig        |   1 +
>   .../configs/85xx/mpc8560_ads_defconfig        |   1 +
>   .../configs/85xx/mpc85xx_cds_defconfig        |   1 +
>   arch/powerpc/configs/85xx/sbc8548_defconfig   |   1 +
>   arch/powerpc/configs/85xx/socrates_defconfig  |   1 +
>   arch/powerpc/configs/85xx/stx_gp3_defconfig   |   1 +
>   arch/powerpc/configs/85xx/tqm8540_defconfig   |   1 +
>   arch/powerpc/configs/85xx/tqm8541_defconfig   |   1 +
>   arch/powerpc/configs/85xx/tqm8548_defconfig   |   1 +
>   arch/powerpc/configs/85xx/tqm8555_defconfig   |   1 +
>   arch/powerpc/configs/85xx/tqm8560_defconfig   |   1 +
>   .../configs/85xx/xes_mpc85xx_defconfig        |   1 +
>   arch/powerpc/configs/amigaone_defconfig       |   1 +
>   arch/powerpc/configs/cell_defconfig           |   1 +
>   arch/powerpc/configs/chrp32_defconfig         |   1 +
>   arch/powerpc/configs/ep8248e_defconfig        |   1 +
>   arch/powerpc/configs/fsl-emb-nonhw.config     |   1 +
>   arch/powerpc/configs/g5_defconfig             |   1 +
>   arch/powerpc/configs/gamecube_defconfig       |   1 +
>   arch/powerpc/configs/holly_defconfig          |   1 +
>   arch/powerpc/configs/linkstation_defconfig    |   1 +
>   arch/powerpc/configs/mgcoge_defconfig         |   1 +
>   arch/powerpc/configs/mpc5200_defconfig        |   1 +
>   arch/powerpc/configs/mpc7448_hpc2_defconfig   |   1 +
>   arch/powerpc/configs/mpc8272_ads_defconfig    |   1 +
>   arch/powerpc/configs/mpc83xx_defconfig        |   1 +
>   arch/powerpc/configs/mpc866_ads_defconfig     |   1 +
>   arch/powerpc/configs/mvme5100_defconfig       |   1 +
>   arch/powerpc/configs/pasemi_defconfig         |   1 +
>   arch/powerpc/configs/pmac32_defconfig         |   1 +
>   arch/powerpc/configs/powernv_defconfig        |   1 +
>   arch/powerpc/configs/ppc64_defconfig          |   1 +
>   arch/powerpc/configs/ppc64e_defconfig         |   1 +
>   arch/powerpc/configs/ppc6xx_defconfig         |   1 +
>   arch/powerpc/configs/pq2fads_defconfig        |   1 +
>   arch/powerpc/configs/ps3_defconfig            |   1 +
>   arch/powerpc/configs/pseries_defconfig        |   1 +
>   arch/powerpc/configs/skiroot_defconfig        |   1 +
>   arch/powerpc/configs/wii_defconfig            |   1 +
>   arch/riscv/configs/defconfig                  |   1 +
>   arch/riscv/configs/rv32_defconfig             |   1 +
>   arch/s390/configs/debug_defconfig             |   1 +
>   arch/s390/configs/defconfig                   |   1 +
>   arch/sh/configs/cayman_defconfig              |   1 +
>   arch/sh/configs/landisk_defconfig             |   1 +
>   arch/sh/configs/lboxre2_defconfig             |   1 +
>   arch/sh/configs/rsk7264_defconfig             |   1 +
>   arch/sh/configs/sdk7780_defconfig             |   1 +
>   arch/sh/configs/sdk7786_defconfig             |   1 +
>   arch/sh/configs/se7206_defconfig              |   1 +
>   arch/sh/configs/se7780_defconfig              |   1 +
>   arch/sh/configs/sh03_defconfig                |   1 +
>   arch/sh/configs/sh2007_defconfig              |   1 +
>   arch/sh/configs/sh7785lcr_32bit_defconfig     |   1 +
>   arch/sh/configs/shmin_defconfig               |   1 +
>   arch/sh/configs/titan_defconfig               |   1 +
>   arch/sparc/configs/sparc32_defconfig          |   1 +
>   arch/sparc/configs/sparc64_defconfig          |   1 +
>   arch/um/configs/i386_defconfig                |   1 +
>   arch/um/configs/x86_64_defconfig              |   1 +
>   arch/unicore32/configs/defconfig              |   1 +
>   arch/x86/configs/i386_defconfig               |   1 +
>   arch/x86/configs/x86_64_defconfig             |   1 +
>   arch/xtensa/configs/audio_kc705_defconfig     |   1 +
>   arch/xtensa/configs/cadence_csp_defconfig     |   1 +
>   arch/xtensa/configs/generic_kc705_defconfig   |   1 +
>   arch/xtensa/configs/nommu_kc705_defconfig     |   1 +
>   arch/xtensa/configs/smp_lx200_defconfig       |   1 +
>   arch/xtensa/configs/virt_defconfig            |   1 +
>   drivers/block/Kconfig                         |  73 +-
>   drivers/block/Makefile                        |   4 +-
>   drivers/block/loop/Kconfig                    |  93 ++
>   drivers/block/loop/Makefile                   |  13 +
>   drivers/block/{ => loop}/cryptoloop.c         |   2 +-
>   drivers/block/loop/loop_file_fmt.c            | 328 ++++++
>   drivers/block/loop/loop_file_fmt.h            | 351 +++++++
>   drivers/block/loop/loop_file_fmt_qcow_cache.c | 218 ++++
>   drivers/block/loop/loop_file_fmt_qcow_cache.h |  51 +
>   .../block/loop/loop_file_fmt_qcow_cluster.c   | 270 +++++
>   .../block/loop/loop_file_fmt_qcow_cluster.h   |  23 +
>   drivers/block/loop/loop_file_fmt_qcow_main.c  | 945 ++++++++++++++++++
>   drivers/block/loop/loop_file_fmt_qcow_main.h  | 417 ++++++++
>   drivers/block/loop/loop_file_fmt_raw.c        | 449 +++++++++
>   drivers/block/{loop.c => loop/loop_main.c}    | 567 ++++-------
>   drivers/block/{loop.h => loop/loop_main.h}    |  14 +-
>   include/uapi/linux/loop.h                     |  14 +-
>   248 files changed, 3861 insertions(+), 422 deletions(-)
>   create mode 100644 Documentation/admin-guide/blockdev/loop.rst
>   create mode 100644 Documentation/driver-api/loop-file-fmt.rst
>   create mode 100644 drivers/block/loop/Kconfig
>   create mode 100644 drivers/block/loop/Makefile
>   rename drivers/block/{ => loop}/cryptoloop.c (99%)
>   create mode 100644 drivers/block/loop/loop_file_fmt.c
>   create mode 100644 drivers/block/loop/loop_file_fmt.h
>   create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cache.c
>   create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cache.h
>   create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cluster.c
>   create mode 100644 drivers/block/loop/loop_file_fmt_qcow_cluster.h
>   create mode 100644 drivers/block/loop/loop_file_fmt_qcow_main.c
>   create mode 100644 drivers/block/loop/loop_file_fmt_qcow_main.h
>   create mode 100644 drivers/block/loop/loop_file_fmt_raw.c
>   rename drivers/block/{loop.c => loop/loop_main.c} (86%)
>   rename drivers/block/{loop.h => loop/loop_main.h} (92%)
>
Bart Van Assche Aug. 24, 2019, 4:04 p.m. UTC | #4
On 8/24/19 2:14 AM, Manuel Bentele wrote:
> On 8/24/19 5:37 AM, Bart Van Assche wrote:
>> On 8/23/19 3:56 PM, development@manuel-bentele.de wrote:
>>> During the discussion, it turned out that the implementation as device
>>> mapper target is not applicable. The device mapper stacks different
>>> functionality such as compression or encryption on multiple block device
>>> layers whereas an implementation for the QCOW2 container format provides
>>> these functionalities on one block device layer.
>>
>> Is there a more detailed discussion available of this subject?
>
> No, the only discussion is the referenced one [1]. But there was a
> similar discussion in the master's thesis of Francesc Zacarias Ribot
> [2]. Unfortunately, I found no attempt on the mailing list that proposes
> his solution.
> 
>> Are you familiar with the dm-crypt driver?
>
> I don't know the specific implementation details, but I use this driver
> personally and I like it. Do you want to propose that only the storage
> aspect of the QCOW2 container format should be used and all other
> functionality inside the container should be provided by available
> device mapper targets?

(+Mike Snitzer)

Hmm, I haven't found any reference to the device mapper in the document 
written by Francesc. Maybe that means that I overlooked something?

I referred to the dm-crypt driver because I think that's an example that 
shows that QCOW2 file format support could be implemented using the 
device mapper framework.

Mike, do you perhaps want to comment on what the most appropriate way is 
to implement such functionality? The entire patch series is available at 
https://lore.kernel.org/linux-block/86279379-32ac-15e9-2f91-68ce9c94cfbf@manuel-bentele.de/T/#t.

Thanks,

Bart.
Manuel Bentele Aug. 25, 2019, 12:15 p.m. UTC | #5
On 8/24/19 6:04 PM, Bart Van Assche wrote:
> On 8/24/19 2:14 AM, Manuel Bentele wrote:
>> On 8/24/19 5:37 AM, Bart Van Assche wrote:
>>> On 8/23/19 3:56 PM, development@manuel-bentele.de wrote:
>>>> During the discussion, it turned out that the implementation as device
>>>> mapper target is not applicable. The device mapper stacks different
>>>> functionality such as compression or encryption on multiple block
>>>> device
>>>> layers whereas an implementation for the QCOW2 container format
>>>> provides
>>>> these functionalities on one block device layer.
>>>
>>> Is there a more detailed discussion available of this subject?
>>
>> No, the only discussion is the referenced one [1]. But there was a
>> similar discussion in the master's thesis of Francesc Zacarias Ribot
>> [2]. Unfortunately, I found no attempt on the mailing list that proposes
>> his solution.
>>
>>> Are you familiar with the dm-crypt driver?
>>
>> I don't know the specific implementation details, but I use this driver
>> personally and I like it. Do you want to propose that only the storage
>> aspect of the QCOW2 container format should be used and all other
>> functionality inside the container should be provided by available
>> device mapper targets?
>
> (+Mike Snitzer)
>
> Hmm, I haven't found any reference to the device mapper in the
> document written by Francesc. Maybe that means that I overlooked
> something?
Oh sorry, you're right. I meant this in general for the topic 'QCOW2 in
the kernel space'.

> I referred to the dm-crypt driver because I think that's an example
> that shows that QCOW2 file format support could be implemented using
> the device mapper framework.
Okay, now I get it :)

> Mike, do you perhaps want to comment on what the most appropriate way
> is to implement such functionality?

To implement the QCOW2 format or other sparse container formats
correctly, the implementation must be able to ...
  - extend the capacity of the mapped block device
  - shrink the capacity of the mapped block device
  - rescan the partitions of the mapped block device

Are all three functionalities feasible using the device mapper framework?

> The entire patch series is available at
> https://lore.kernel.org/linux-block/86279379-32ac-15e9-2f91-68ce9c94cfbf@manuel-bentele.de/T/#t.

Note that PATCH [1/5] is missing in this series, although I've submitted
it twice. I already asked about the reason in [1] but haven't received
an answer yet. Until that is resolved, here is a link to my repository
showing the missing PATCH [1/5]:
https://github.com/bahnwaerter/linux/commit/7a78da744b4c84809ad6aa20673a2b686bafb201

Regards,
Manuel

[1] https://www.spinics.net/lists/linux-block/msg44255.html
Manuel Bentele Sept. 9, 2019, 10:12 p.m. UTC | #6
On 8/25/19 2:15 PM, Manuel Bentele wrote:
> On 8/24/19 6:04 PM, Bart Van Assche wrote:
>> On 8/24/19 2:14 AM, Manuel Bentele wrote:
>>> On 8/24/19 5:37 AM, Bart Van Assche wrote:
>>>> On 8/23/19 3:56 PM, development@manuel-bentele.de wrote:
>>>>> During the discussion, it turned out that the implementation as device
>>>>> mapper target is not applicable. The device mapper stacks different
>>>>> functionality such as compression or encryption on multiple block
>>>>> device
>>>>> layers whereas an implementation for the QCOW2 container format
>>>>> provides
>>>>> these functionalities on one block device layer.
>>>> Is there a more detailed discussion available of this subject?
>>> No, the only discussion is the referenced one [1]. But there was a
>>> similar discussion in the master's thesis of Francesc Zacarias Ribot
>>> [2]. Unfortunately, I found no attempt on the mailing list that proposes
>>> his solution.
>>>
>>>> Are you familiar with the dm-crypt driver?
>>> I don't know the specific implementation details, but I use this driver
>>> personally and I like it. Do you want to propose that only the storage
>>> aspect of the QCOW2 container format should be used and all other
>>> functionality inside the container should be provided by available
>>> device mapper targets?
>> (+Mike Snitzer)
>>
>> Hmm, I haven't found any reference to the device mapper in the
>> document written by Francesc. Maybe that means that I overlooked
>> something?
> Oh sorry, you're right. I meant this in general for the topic 'QCOW2 in
> the kernel space'.
>
>> I referred to the dm-crypt driver because I think that's an example
>> that shows that QCOW2 file format support could be implemented using
>> the device mapper framework.
> Okay, now I get it :)
>
>> Mike, do you perhaps want to comment on what the most appropriate way
>> is to implement such functionality?
> To implement the QCOW2 format or other sparse container formats
> correctly, the implementation must be able to ...
>   - extend the capacity of the mapped block device
>   - shrink the capacity of the mapped block device
>   - rescan the partitions of the mapped block device
>
> Are all three functionalities feasible using the device mapper framework?
Because there was no answer, I analyzed the device mapper in more
detail. I found that a target can get access to both the virtual and
the "underlying" devices. The virtual device (mapped_device) is created
and managed by the device mapper. It can be obtained in the constructor
of a device mapper target by calling dm_table_get_md(), which takes the
table of the dm_target as parameter and returns a pointer to the
mapped_device structure. That structure contains pointers to the
gendisk and the block_device of the mapped_device. The "underlying"
devices of the table can be obtained or added by calling
dm_get_device(), also in the constructor. That call hands back a
pointer to a dm_dev structure through its result parameter, and the
dm_dev structure in turn contains a pointer to its referenced
block_device. With that, there is direct access to the block_device and
gendisk structures. This means that one can implement the three
functionalities needed to support sparse container formats, and
implement my file format subsystem and file format drivers as device
mapper targets. However, direct access to the block_device and gendisk
structures from a device mapper target needs care, because it risks
bypassing the device mapper framework. Anyone going this way should
read the comments and descriptions of the exported functions in the
device mapper framework.
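
To make the constructor part more concrete, here is a rough sketch of a
pass-through target that obtains both devices as described above. It is
not part of the patch series. The target name "ffmt_sketch" and the
struct ffmt_ctx are hypothetical, and the function signatures are the
ones I know from kernels of this era, so they may differ in other
versions. As far as I can tell, struct mapped_device is opaque to
targets, so the sketch uses the dm_disk() accessor to reach the gendisk.
It only builds as a kernel module against a kernel tree.

```c
/* Sketch only: pass-through dm target that looks up both devices. */
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/bio.h>
#include <linux/device-mapper.h>

/* Hypothetical per-target context, not taken from the patch series. */
struct ffmt_ctx {
	struct dm_dev *dev;	/* underlying device, dev->bdev is its block_device */
	struct gendisk *disk;	/* gendisk of the virtual mapped_device */
};

/* Table line: <start> <len> ffmt_sketch <underlying device path> */
static int ffmt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
{
	struct ffmt_ctx *ctx;
	int ret;

	if (argc != 1) {
		ti->error = "Expected one argument: <device path>";
		return -EINVAL;
	}

	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
	if (!ctx) {
		ti->error = "Cannot allocate context";
		return -ENOMEM;
	}

	/* Virtual device that is created and managed by the device mapper. */
	ctx->disk = dm_disk(dm_table_get_md(ti->table));

	/* Underlying device; the dm_dev is returned through the last argument. */
	ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table),
			    &ctx->dev);
	if (ret) {
		ti->error = "Device lookup failed";
		kfree(ctx);
		return ret;
	}

	ti->private = ctx;
	return 0;
}

static void ffmt_dtr(struct dm_target *ti)
{
	struct ffmt_ctx *ctx = ti->private;

	dm_put_device(ti, ctx->dev);
	kfree(ctx);
}

/* Redirect every bio unchanged to the same offset on the underlying device. */
static int ffmt_map(struct dm_target *ti, struct bio *bio)
{
	struct ffmt_ctx *ctx = ti->private;

	bio_set_dev(bio, ctx->dev->bdev);
	bio->bi_iter.bi_sector = dm_target_offset(ti, bio->bi_iter.bi_sector);
	return DM_MAPIO_REMAPPED;
}

static struct target_type ffmt_target = {
	.name    = "ffmt_sketch",
	.version = {0, 1, 0},
	.module  = THIS_MODULE,
	.ctr     = ffmt_ctr,
	.dtr     = ffmt_dtr,
	.map     = ffmt_map,
};

static int __init ffmt_init(void)
{
	return dm_register_target(&ffmt_target);
}

static void __exit ffmt_exit(void)
{
	dm_unregister_target(&ffmt_target);
}

module_init(ffmt_init);
module_exit(ffmt_exit);
MODULE_LICENSE("GPL");
```

A real file format target would replace ffmt_map() with the format's
own offset translation. The sketch only stores the gendisk pointer and
deliberately does not resize or rescan anything, because that is the
part where bypassing the framework becomes a risk.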

Compared to the proposed loop device module integration, this approach
seems harder to me. Furthermore, a device mapper target needs an
additional user space utility to simplify the control of the file
format subsystem and drivers, and to help people who are afraid of the
dmsetup utility ;)

Would you accept the proposed file format subsystem and drivers
implemented as device mapper targets?

>> The entire patch series is available at
>> https://lore.kernel.org/linux-block/86279379-32ac-15e9-2f91-68ce9c94cfbf@manuel-bentele.de/T/#t.
> Note that PATCH [1/5] is missing in this series, although I've submitted
> it twice. I asked already in [1] for the reason but haven't received any
> answer, yet. Therefore, I temporarily insert a link to my repository
> showing the missing PATCH [1/5]:
> https://github.com/bahnwaerter/linux/commit/7a78da744b4c84809ad6aa20673a2b686bafb201
>
> Regards,
> Manuel
>
> [1] https://www.spinics.net/lists/linux-block/msg44255.html

Regards,
Manuel
Ming Lei Sept. 12, 2019, 2:24 a.m. UTC | #7
On Sat, Aug 24, 2019 at 12:56:14AM +0200, development@manuel-bentele.de wrote:
> From: Manuel Bentele <development@manuel-bentele.de>
> 
> Hi
> 
> Regarding to the following discussion [1] on the mailing list I show you 
> the result of my work as announced at the end of the discussion [2].
> 
> The discussion was about the project topic of how to implement the 
> reading/writing of QCOW2 in the kernel. The project focuses on an read-only 
> in-kernel QCOW2 implementation to increase the read/write performance 
> and tries to avoid nbd. Furthermore, the project is part of a project 
> series to develop a in-kernel network boot infrastructure that has no need 

I'd suggest you share more details about this use case first:

1) what is the in-kernel network boot infrastructure? which functions
does it provide for the user?

2) how does the in-kernel QCOW2 interact with the in-kernel network boot
infrastructure?

3) most important, what are the exact steps for a user to use
the in-kernel network boot infrastructure and in-kernel QCOW2?

Without knowing the motivation/purpose and exact use case, it doesn't
make sense to discuss the implementation details, IMO.

Thanks,
Ming
Manuel Bentele Sept. 13, 2019, 11:57 a.m. UTC | #8
Hi Ming,

On 9/12/19 4:24 AM, Ming Lei wrote:
> On Sat, Aug 24, 2019 at 12:56:14AM +0200, development@manuel-bentele.de wrote:
>> From: Manuel Bentele <development@manuel-bentele.de>
>>
>> Hi
>>
>> Regarding to the following discussion [1] on the mailing list I show you 
>> the result of my work as announced at the end of the discussion [2].
>>
>> The discussion was about the project topic of how to implement the 
>> reading/writing of QCOW2 in the kernel. The project focuses on an read-only 
>> in-kernel QCOW2 implementation to increase the read/write performance 
>> and tries to avoid nbd. Furthermore, the project is part of a project 
>> series to develop a in-kernel network boot infrastructure that has no need 
> I'd suggest you to share more details about this use case first:
>
> 1) what is the in-kernel network boot infrastructure? which functions
> does it provide for user?

Some time ago, I started to describe the setup a little bit in [1]. Now
I want to extend the description:

The boot infrastructure is used in a university environment and
struggles with network-related limitations. Step by step, the network
hardware is being renewed and improved, but there are still many
university branches which are spread all over the city and connected by
poor uplink connections. In some cases, 15 to 20 desktop computers have
to share a single 1 gigabit uplink. To accelerate the network boot, the
idea came up to use the QCOW2 file format and its compression feature
for the image content. Tests have shown that the benefit of compression
is already measurable on gigabit uplinks and clearly noticeable on
100 megabit uplinks.

The network boot infrastructure is based on a classical PXE network boot
to load the Linux kernel and the initramfs. In the initramfs, the
compressed QCOW2 image is fetched via NFS, CIFS or something else. The
fetched QCOW2 image is then decompressed and read in the kernel.
Compared to decompressing and reading it in user space, as qemu-nbd
does, this approach does not need any user space process, is faster and
avoids switchroot problems.

> 2) how does the in kernel QCOW2 interacts with in-kernel network boot
> infrastructure?

The in-kernel QCOW2 implementation uses the fetched QCOW2 image and
exposes it as block device.

To do so, my implementation extends the loop device module with a
general file format subsystem that supports various file format
drivers, including drivers for the QCOW2 and RAW file formats. The
configuration utility losetup is used to set up a loop device and
specify the file format driver to use.

> 3) most important thing, what are the exact steps for one user to use
> the in-kernel network boot infrastructure and in-kernel QCOW2?

To get a running system, one has to complete the following steps:

  * Set up a PXE boot server and configure client computers to boot from
    the network
  * Build a Linux kernel for the network boot with built-in QCOW2
    implementation
  * Prepare the initramfs for the network boot. Use a network file
    system or copy tool to fetch the compressed QCOW2 image.
  * Create a compressed QCOW2 image that contains a complete environment
    for the user to work with after a successful network boot
  * Set up the reading of the fetched QCOW2 image using the in-kernel
    QCOW2 implementation and mount the file systems located in the QCOW2
    image.
  * Perform a switchroot to change into the mounted environment of the
    QCOW2 image.


Thanks for your help.

Regards,
Manuel

[1] https://www.spinics.net/lists/linux-block/msg39565.html
Ming Lei Sept. 16, 2019, 2:11 a.m. UTC | #9
On Fri, Sep 13, 2019 at 01:57:33PM +0200, Manuel Bentele wrote:
> Hi Ming,
> 
> On 9/12/19 4:24 AM, Ming Lei wrote:
> > On Sat, Aug 24, 2019 at 12:56:14AM +0200, development@manuel-bentele.de wrote:
> >> From: Manuel Bentele <development@manuel-bentele.de>
> >>
> >> Hi
> >>
> >> Regarding to the following discussion [1] on the mailing list I show you 
> >> the result of my work as announced at the end of the discussion [2].
> >>
> >> The discussion was about the project topic of how to implement the 
> >> reading/writing of QCOW2 in the kernel. The project focuses on an read-only 
> >> in-kernel QCOW2 implementation to increase the read/write performance 
> >> and tries to avoid nbd. Furthermore, the project is part of a project 
> >> series to develop a in-kernel network boot infrastructure that has no need 
> > I'd suggest you to share more details about this use case first:
> >
> > 1) what is the in-kernel network boot infrastructure? which functions
> > does it provide for user?
> 
> Some time ago, I started to describe the setup a little bit in [1]. Now
> I want to extend the description:
> 
> The boot infrastructure is used in the university environment and
> quarrels with network-related limitations. Step-by-step, the network
> hardware is renewed and improved, but there are still many university
> branches which are spread all over the city and connected by poor uplink
> connections. Sometimes there exist cases where 15 until 20 desktop
> computers have to share only 1 gigabit uplink. To accelerate the network
> boot, the idea came up to use the QCOW2 file format and its compression
> feature for the image content. Tests have shown, that the usage of
> compression is already measurable at gigabit uplinks and clearly
> noticeable at 100 megabit uplinks.

Got it, this looks like a good use case for compression, but it doesn't
have to be QCOW2.

> 
> The network boot infrastructure is based on a classical PXE network boot
> to load the Linux kernel and the initramfs. In the initramfs, the
> compressed QCOW2 image is fetched via nfs or cifs or something else. The
> fetched QCOW2 image is now decompressed and read in the kernel. Compared
> to a decompression and read in the user space, like qemu-nbd does, this
> approach does not need any user space process, is faster and avoids
> switchroot problems.

This image can be compressed via xz, and fetched via wget or
whatever. 'xz' could have a better compression ratio than qcow2, I guess.

> 
> > 2) how does the in kernel QCOW2 interacts with in-kernel network boot
> > infrastructure?
> 
> The in-kernel QCOW2 implementation uses the fetched QCOW2 image and
> exposes it as block device.
> 
> Therefore, my implementation extends the loop device module by a general
> file format subsystem to implement various file format drivers including
> a driver for the QCOW2 and RAW file format. The configuration utility
> losetup is used to set up a loop device and specify the file format
> driver to use.

You still need to update losetup. xz-utils can be installed for
decompressing the image, and then you can still create a loop disk over
the image.

> 
> > 3) most important thing, what are the exact steps for one user to use
> > the in-kernel network boot infrastructure and in-kernel QCOW2?
> 
> To achieve a running system one have to complete the following items:
> 
>   * Set up a PXE boot server and configure client computers to boot from
>     the network
>   * Build a Linux kernel for the network boot with built-in QCOW2
>     implementation
>   * Prepare the initramfs for the network boot. Use a network file
>     system or copy tool to fetch the compressed QCOW2 image.
>   * Create a compressed QCOW2 image that contains a complete environment
>     for the user to work with after a successful network boot
>   * Set up the reading of the fetched QCOW2 image using the in-kernel
>     QCOW2 implementation and mount the file systems located in the QCOW2
>     image.
>   * Perform a switchroot to change into the mounted environment of the
>     QCOW2 image.

As I mentioned above, it doesn't seem necessary to introduce loop-qcow2.

Thanks,
Ming
Simon Rettberg Sept. 18, 2019, 10:26 a.m. UTC | #10
Hi everyone,

chiming in to clear this up a bit.

> Got it, looks a good use case for compression, but not has to be
> QCOW2.
> 
> > 
> > The network boot infrastructure is based on a classical PXE network
> > boot to load the Linux kernel and the initramfs. In the initramfs,
> > the compressed QCOW2 image is fetched via nfs or cifs or something
> > else. The fetched QCOW2 image is now decompressed and read in the
> > kernel. Compared to a decompression and read in the user space,
> > like qemu-nbd does, this approach does not need any user space
> > process, is faster and avoids switchroot problems.  
> 
> This image can be compressed via xz, and fetched via wget or what
> ever. 'xz' could have better compression ratio than qcow2, I guess.

"Fetch" was probably a bit ambiguous. The image isn't downloaded, but
mounted directly from the network (streamed?), so we can benefit from
the per-cluster compression of qcow2, similar to squashfs but on the
block layer. A typical image is between 3 and 10GB with qcow2
compression, so downloading it entirely on boot to be able to
decompress it is not feasible.
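
To illustrate why per-cluster compression allows this kind of on-demand
access, here is a minimal user space sketch of how a guest offset maps
to table indices in the QCOW2 layout, as I understand the format
specification. It assumes standard 8-byte L2 entries (no extended L2
entries). The names qcow2_pos and qcow2_locate are made up for this
example and are not taken from Manuel's driver.

```c
#include <assert.h>
#include <stdint.h>

/* Position of a guest offset within the two-level QCOW2 lookup tables. */
struct qcow2_pos {
	uint64_t l1_index;	/* which L2 table to consult */
	uint64_t l2_index;	/* which entry inside that L2 table */
	uint64_t in_cluster;	/* byte offset inside the data cluster */
};

/*
 * Split a guest offset into L1 index, L2 index and in-cluster offset.
 * cluster_bits comes from the image header; the range check reflects
 * the minimum in the specification and the maximum qemu accepts, as
 * far as I know.
 */
static struct qcow2_pos qcow2_locate(uint64_t guest_offset,
				     unsigned int cluster_bits)
{
	assert(cluster_bits >= 9 && cluster_bits <= 21);

	const uint64_t cluster_size = 1ULL << cluster_bits;
	/* One L2 table fills one cluster with 8-byte entries. */
	const uint64_t l2_entries = cluster_size / sizeof(uint64_t);
	const uint64_t cluster_no = guest_offset >> cluster_bits;

	struct qcow2_pos pos = {
		.l1_index = cluster_no / l2_entries,
		.l2_index = cluster_no % l2_entries,
		.in_cluster = guest_offset & (cluster_size - 1),
	};
	return pos;
}
```

With 64 KiB clusters (cluster_bits = 16), guest offset 5 GiB + 70000
resolves to L1 index 10, L2 index 1 and byte 4464 within the cluster.
A read at that position therefore only needs one L2 table and one
(possibly compressed) data cluster from the server, not the whole image.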

> As I mentioned above, seems not necessary to introduce loop-qcow2.

Yes, there are many ways to achieve this. The basic concept of network
booting the workstations has been practiced here for almost 15 years
now using very different approaches like plain old NFS mounts for the
root filesystem, or squashfs containers that get downloaded or streamed
over the network. But since our requirement is a stateless system, we
need a copy-on-write layer on top of this. In the beginning, we did
this with unionfs and then aufs, but as these operate on the
file-system layer they have several drawbacks and relatively high
complexity compared to block-layer CoW. So we switched to a block-based
approach about 4 years ago. For the reasons stated before, we wanted to
use some form of compression, as was possible with squashfs before, so
after some experimenting, qcow2 proved to be a good fit. However, using
user-space tools like qemu-nbd or xmount added too much of a
performance penalty and, initially, also caused some problems during
the switchroot from initrd to the actual root file system.

So the current process looks as follows: kernel + initrd are
loaded via iPXE. initrd sets up network, mounts NFS share or connects
to server via NBD to access the qcow2 image. Modified losetup sets up
access to qcow2 image, either from NFS share or
directly from /dev/nbd0. Finally, mount /dev/loop0pXX and switch to new
root.

Manuel's implementation has so far proven to be very reliable and
brought noticeable performance improvements compared to having a user
space process doing the qcow2 handling.

So we would really have liked to see his changes upstreamed. I think he
did a very good job by designing a plugin infrastructure for the loop
device and making the qcow2 plugin a separate module. We knew about the
concerns of adding code for handling a file format in the kernel and
were hoping that maybe an acceptable compromise would be to have his
changes added to the kernel minus the actual qcow2 plugin, so it is
mostly a refactoring of the old loop device that's not adding too much
complexity (hopefully). But if we're really such an oddball use-case
here that this won't possibly be of any interest to anybody else, we
will just have to go forward maintaining this out of tree entirely.

Thanks for your time,
Simon