Message ID | 20200304165845.3081-1-vgoyal@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | virtiofs: Add DAX support | expand |
On Wed, Mar 4, 2020 at 7:01 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > Hi, > > This patch series adds DAX support to virtiofs filesystem. This allows > bypassing guest page cache and allows mapping host page cache directly > in guest address space. > > When a page of file is needed, guest sends a request to map that page > (in host page cache) in qemu address space. Inside guest this is > a physical memory range controlled by virtiofs device. And guest > directly maps this physical address range using DAX and hence gets > access to file data on host. > > This can speed up things considerably in many situations. Also this > can result in substantial memory savings as file data does not have > to be copied in guest and it is directly accessed from host page > cache. > > Most of the changes are limited to fuse/virtiofs. There are couple > of changes needed in generic dax infrastructure and couple of changes > in virtio to be able to access shared memory region. > > These patches apply on top of 5.6-rc4 and are also available here. > > https://github.com/rhvgoyal/linux/commits/vivek-04-march-2020 > > Any review or feedback is welcome. > [...] > drivers/dax/super.c | 3 +- > drivers/virtio/virtio_mmio.c | 32 + > drivers/virtio/virtio_pci_modern.c | 107 +++ > fs/dax.c | 66 +- > fs/fuse/dir.c | 2 + > fs/fuse/file.c | 1162 +++++++++++++++++++++++++++- That's a big addition to already big file.c. Maybe split dax specific code to dax.c? Can be a post series cleanup too. Thanks, Amir.
On Wed, Mar 11, 2020 at 07:22:51AM +0200, Amir Goldstein wrote: > On Wed, Mar 4, 2020 at 7:01 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > Hi, > > > > This patch series adds DAX support to virtiofs filesystem. This allows > > bypassing guest page cache and allows mapping host page cache directly > > in guest address space. > > > > When a page of file is needed, guest sends a request to map that page > > (in host page cache) in qemu address space. Inside guest this is > > a physical memory range controlled by virtiofs device. And guest > > directly maps this physical address range using DAX and hence gets > > access to file data on host. > > > > This can speed up things considerably in many situations. Also this > > can result in substantial memory savings as file data does not have > > to be copied in guest and it is directly accessed from host page > > cache. > > > > Most of the changes are limited to fuse/virtiofs. There are couple > > of changes needed in generic dax infrastructure and couple of changes > > in virtio to be able to access shared memory region. > > > > These patches apply on top of 5.6-rc4 and are also available here. > > > > https://github.com/rhvgoyal/linux/commits/vivek-04-march-2020 > > > > Any review or feedback is welcome. > > > [...] > > drivers/dax/super.c | 3 +- > > drivers/virtio/virtio_mmio.c | 32 + > > drivers/virtio/virtio_pci_modern.c | 107 +++ > > fs/dax.c | 66 +- > > fs/fuse/dir.c | 2 + > > fs/fuse/file.c | 1162 +++++++++++++++++++++++++++- > > That's a big addition to already big file.c. > Maybe split dax specific code to dax.c? > Can be a post series cleanup too. Lot of this code is coming from logic to reclaim dax memory range assigned to inode. I will look into moving some of it to a separate file. Vivek
Vivek Goyal <vgoyal@redhat.com> writes: > This patch series adds DAX support to virtiofs filesystem. This allows > bypassing guest page cache and allows mapping host page cache directly > in guest address space. > > When a page of file is needed, guest sends a request to map that page > (in host page cache) in qemu address space. Inside guest this is > a physical memory range controlled by virtiofs device. And guest > directly maps this physical address range using DAX and hence gets > access to file data on host. > > This can speed up things considerably in many situations. Also this > can result in substantial memory savings as file data does not have > to be copied in guest and it is directly accessed from host page > cache. As a potential user of this, let me make sure I understand the expected outcome: is the goal to let virtiofs use DAX (for increased performance, etc.) or also let applications that use virtiofs use DAX? You are mentioning using the host's page cache, so it's probably the former and MAP_SYNC on virtiofs will continue to be rejected, right?
On Wed, Mar 11, 2020 at 07:22:51AM +0200, Amir Goldstein wrote: > On Wed, Mar 4, 2020 at 7:01 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > Hi, > > > > This patch series adds DAX support to virtiofs filesystem. This allows > > bypassing guest page cache and allows mapping host page cache directly > > in guest address space. > > > > When a page of file is needed, guest sends a request to map that page > > (in host page cache) in qemu address space. Inside guest this is > > a physical memory range controlled by virtiofs device. And guest > > directly maps this physical address range using DAX and hence gets > > access to file data on host. > > > > This can speed up things considerably in many situations. Also this > > can result in substantial memory savings as file data does not have > > to be copied in guest and it is directly accessed from host page > > cache. > > > > Most of the changes are limited to fuse/virtiofs. There are couple > > of changes needed in generic dax infrastructure and couple of changes > > in virtio to be able to access shared memory region. > > > > These patches apply on top of 5.6-rc4 and are also available here. > > > > https://github.com/rhvgoyal/linux/commits/vivek-04-march-2020 > > > > Any review or feedback is welcome. > > > [...] > > drivers/dax/super.c | 3 +- > > drivers/virtio/virtio_mmio.c | 32 + > > drivers/virtio/virtio_pci_modern.c | 107 +++ > > fs/dax.c | 66 +- > > fs/fuse/dir.c | 2 + > > fs/fuse/file.c | 1162 +++++++++++++++++++++++++++- > > That's a big addition to already big file.c. > Maybe split dax specific code to dax.c? > Can be a post series cleanup too. How about fs/fuse/iomap.c instead. This will have all the iomap related logic as well as all the dax range allocation/free logic which is required by iomap logic. That moves about 900 lines of code from file.c to iomap.c Vivek
On Wed, Mar 11, 2020 at 8:48 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > On Wed, Mar 11, 2020 at 07:22:51AM +0200, Amir Goldstein wrote: > > On Wed, Mar 4, 2020 at 7:01 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > > > Hi, > > > > > > This patch series adds DAX support to virtiofs filesystem. This allows > > > bypassing guest page cache and allows mapping host page cache directly > > > in guest address space. > > > > > > When a page of file is needed, guest sends a request to map that page > > > (in host page cache) in qemu address space. Inside guest this is > > > a physical memory range controlled by virtiofs device. And guest > > > directly maps this physical address range using DAX and hence gets > > > access to file data on host. > > > > > > This can speed up things considerably in many situations. Also this > > > can result in substantial memory savings as file data does not have > > > to be copied in guest and it is directly accessed from host page > > > cache. > > > > > > Most of the changes are limited to fuse/virtiofs. There are couple > > > of changes needed in generic dax infrastructure and couple of changes > > > in virtio to be able to access shared memory region. > > > > > > These patches apply on top of 5.6-rc4 and are also available here. > > > > > > https://github.com/rhvgoyal/linux/commits/vivek-04-march-2020 > > > > > > Any review or feedback is welcome. > > > > > [...] > > > drivers/dax/super.c | 3 +- > > > drivers/virtio/virtio_mmio.c | 32 + > > > drivers/virtio/virtio_pci_modern.c | 107 +++ > > > fs/dax.c | 66 +- > > > fs/fuse/dir.c | 2 + > > > fs/fuse/file.c | 1162 +++++++++++++++++++++++++++- > > > > That's a big addition to already big file.c. > > Maybe split dax specific code to dax.c? > > Can be a post series cleanup too. > > How about fs/fuse/iomap.c instead. This will have all the iomap related logic > as well as all the dax range allocation/free logic which is required > by iomap logic. That moves about 900 lines of code from file.c to iomap.c > Fine by me. I didn't take time to study the code in file.c I just noticed is has grown a lot bigger and wasn't sure that it made sense. Up to you. Only if you think the result would be nicer to maintain. Thanks, Amir.
On Wed, Mar 11, 2020 at 09:32:17PM +0200, Amir Goldstein wrote: > On Wed, Mar 11, 2020 at 8:48 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > On Wed, Mar 11, 2020 at 07:22:51AM +0200, Amir Goldstein wrote: > > > On Wed, Mar 4, 2020 at 7:01 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > > > > > Hi, > > > > > > > > This patch series adds DAX support to virtiofs filesystem. This allows > > > > bypassing guest page cache and allows mapping host page cache directly > > > > in guest address space. > > > > > > > > When a page of file is needed, guest sends a request to map that page > > > > (in host page cache) in qemu address space. Inside guest this is > > > > a physical memory range controlled by virtiofs device. And guest > > > > directly maps this physical address range using DAX and hence gets > > > > access to file data on host. > > > > > > > > This can speed up things considerably in many situations. Also this > > > > can result in substantial memory savings as file data does not have > > > > to be copied in guest and it is directly accessed from host page > > > > cache. > > > > > > > > Most of the changes are limited to fuse/virtiofs. There are couple > > > > of changes needed in generic dax infrastructure and couple of changes > > > > in virtio to be able to access shared memory region. > > > > > > > > These patches apply on top of 5.6-rc4 and are also available here. > > > > > > > > https://github.com/rhvgoyal/linux/commits/vivek-04-march-2020 > > > > > > > > Any review or feedback is welcome. > > > > > > > [...] > > > > drivers/dax/super.c | 3 +- > > > > drivers/virtio/virtio_mmio.c | 32 + > > > > drivers/virtio/virtio_pci_modern.c | 107 +++ > > > > fs/dax.c | 66 +- > > > > fs/fuse/dir.c | 2 + > > > > fs/fuse/file.c | 1162 +++++++++++++++++++++++++++- > > > > > > That's a big addition to already big file.c. > > > Maybe split dax specific code to dax.c? > > > Can be a post series cleanup too. > > > > How about fs/fuse/iomap.c instead. This will have all the iomap related logic > > as well as all the dax range allocation/free logic which is required > > by iomap logic. That moves about 900 lines of code from file.c to iomap.c > > > > Fine by me. I didn't take time to study the code in file.c > I just noticed is has grown a lot bigger and wasn't sure that > it made sense. Up to you. Only if you think the result would be nicer > to maintain. I am happy to move this code to a separate file. In fact I think we could probably break it further into another file say dax-mapping.c or something like that where all the memory range allocation/reclaim logic goes and iomap logic remains in iomap.c. But that's probably a future cleanup if code in this file continues to grow. Vivek
On Wed, Mar 11, 2020 at 02:38:03PM +0100, Patrick Ohly wrote: > Vivek Goyal <vgoyal@redhat.com> writes: > > This patch series adds DAX support to virtiofs filesystem. This allows > > bypassing guest page cache and allows mapping host page cache directly > > in guest address space. > > > > When a page of file is needed, guest sends a request to map that page > > (in host page cache) in qemu address space. Inside guest this is > > a physical memory range controlled by virtiofs device. And guest > > directly maps this physical address range using DAX and hence gets > > access to file data on host. > > > > This can speed up things considerably in many situations. Also this > > can result in substantial memory savings as file data does not have > > to be copied in guest and it is directly accessed from host page > > cache. > > As a potential user of this, let me make sure I understand the expected > outcome: is the goal to let virtiofs use DAX (for increased performance, > etc.) or also let applications that use virtiofs use DAX? > > You are mentioning using the host's page cache, so it's probably the > former and MAP_SYNC on virtiofs will continue to be rejected, right? Hi Patrick, You are right. Its the former. That is we want virtiofs to be able to make use of DAX to bypass guest page cache. But there is no persistent memory so no persistent memory programming semantics available to user space. For that I guess we have virtio-pmem. We expect users will issue fsync/msync like a regular filesystem to make changes persistent. So in that aspect, rejecting MAP_SYNC makes sense. I will test and see if current code is rejecting MAP_SYNC or not. Thanks Vivek
Vivek Goyal <vgoyal@redhat.com> writes: > We expect users will issue fsync/msync like a regular filesystem to > make changes persistent. So in that aspect, rejecting MAP_SYNC > makes sense. I will test and see if current code is rejecting MAP_SYNC > or not. Last time I checked, it did. Here's the test program that I wrote for that: https://github.com/intel/pmem-csi/blob/ee3200794a1ade49a02df6f359a134115b409e90/test/cmd/pmem-dax-check/main.go