
[v2] Driver for Inter-VM shared memory device for KVM supporting interrupts.

Message ID 200905191100.14252.borntraeger@de.ibm.com (mailing list archive)
State New, archived

Commit Message

Christian Borntraeger May 19, 2009, 9 a.m. UTC
Am Montag 18 Mai 2009 16:26:15 schrieb Avi Kivity:
> Christian Borntraeger wrote:
> > Sorry for the late question, but I missed your first version. Is there a
> > way to change that code to use virtio instead of PCI? That would allow us
> > to use this driver on s390 and maybe other virtio transports.
>
> Opinion differs.  See the discussion in
> http://article.gmane.org/gmane.comp.emulators.kvm.devel/30119.
>
> To summarize, Anthony thinks it should use virtio, while I believe
> virtio is useful for exporting guest memory, not for importing host memory.

I think the current virtio interface is not ideal for importing host memory, 
but we can change that. If you look at the dcssblk driver for s390, it allows 
a guest to map shared memory segments via a diagnose (hypercall). The driver 
posted here, by contrast, uses PCI regions to map memory.

My point is that the method used to map the memory is completely irrelevant; 
we just need something like mmap/shmget between the guest and the host. We 
could define an interface in virtio that can be used by any transport. In the 
case of PCI this could be a simple PCI map operation.

What do you think about something like: (CCed Rusty)
---
 include/linux/virtio.h |   26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
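
For illustration, a guest driver could then consume the interface roughly like
this (usage sketch only, not part of the patch; the identifier is a made-up
token that would really come from the transport/host configuration):

#include <linux/err.h>
#include <linux/virtio.h>

#define EXAMPLE_SHM_ID	1	/* made-up token naming the host buffer */

static void *example_map_host_shm(struct virtio_device *vdev, size_t len)
{
	void *shm;

	/* NULL address hint: let the transport pick where to map it */
	shm = vdev->ops->map_region(vdev, NULL, len, EXAMPLE_SHM_ID);
	if (IS_ERR(shm))
		return NULL;

	/* ... use the shared memory; eventually:
	 * vdev->ops->unmap_region(vdev, shm);
	 */
	return shm;
}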





Comments

Avi Kivity May 19, 2009, 9:10 a.m. UTC | #1
Christian Bornträger wrote:
>> To summarize, Anthony thinks it should use virtio, while I believe
>> virtio is useful for exporting guest memory, not for importing host memory.
>>     
>
> I think the current virtio interface is not ideal for importing host memory, 
> but we can change that. If you look at the dcssblk driver for s390, it allows 
> a guest to map shared memory segments via a diagnose (hypercall). This driver 
> uses PCI regions to map memory.
>
> My point is, that the method to map memory is completely irrelevant, we just 
> need something like mmap/shmget between the guest and the host. We could 
> define an interface in virtio, that can be used by any transport. In case of 
> pci this could be a simple pci map operation. 
>
> What do you think about something like: (CCed Rusty)
>   

Exactly.
Cam Macdonell May 19, 2009, 4:51 p.m. UTC | #2
Avi Kivity wrote:
> Christian Bornträger wrote:
>>> To summarize, Anthony thinks it should use virtio, while I believe
>>> virtio is useful for exporting guest memory, not for importing host 
>>> memory.
>>>     
>>
>> I think the current virtio interface is not ideal for importing host 
>> memory, but we can change that. If you look at the dcssblk driver for 
>> s390, it allows a guest to map shared memory segments via a diagnose 
>> (hypercall). This driver uses PCI regions to map memory.
>>
>> My point is, that the method to map memory is completely irrelevant, 
>> we just need something like mmap/shmget between the guest and the 
>> host. We could define an interface in virtio, that can be used by any 
>> transport. In case of pci this could be a simple pci map operation.
>> What do you think about something like: (CCed Rusty)
>>   
> 
> Exactly.
> 

Agreed.


Anthony Liguori May 19, 2009, 6:39 p.m. UTC | #3
Christian Bornträger wrote:
> Am Montag 18 Mai 2009 16:26:15 schrieb Avi Kivity:
>   
>> Christian Borntraeger wrote:
>>     
>>> Sorry for the late question, but I missed your first version. Is there a
>>> way to change that code to use virtio instead of PCI? That would allow us
>>> to use this driver on s390 and maybe other virtio transports.
>>>       
>> Opinion differs.  See the discussion in
>> http://article.gmane.org/gmane.comp.emulators.kvm.devel/30119.
>>
>> To summarize, Anthony thinks it should use virtio, while I believe
>> virtio is useful for exporting guest memory, not for importing host memory.
>>     
>
> I think the current virtio interface is not ideal for importing host memory, 
> but we can change that. If you look at the dcssblk driver for s390, it allows 
> a guest to map shared memory segments via a diagnose (hypercall). This driver 
> uses PCI regions to map memory.
>
> My point is, that the method to map memory is completely irrelevant, we just 
> need something like mmap/shmget between the guest and the host. We could 
> define an interface in virtio, that can be used by any transport. In case of 
> pci this could be a simple pci map operation. 
>
> What do you think about something like: (CCed Rusty)
> ---
>  include/linux/virtio.h |   26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> Index: linux-2.6/include/linux/virtio.h
> ===================================================================
> --- linux-2.6.orig/include/linux/virtio.h
> +++ linux-2.6/include/linux/virtio.h
> @@ -71,6 +71,31 @@ struct virtqueue_ops {
>  };
>  
>  /**
> + * virtio_device_ops - operations for virtio devices
> + * @map_region: map host buffer at a given address
> + *	vdev: the struct virtio_device we're talking about.
> + *	addr: The address where the buffer should be mapped (hint only)
> + *	length: The length of the mapping
> + *	identifier: the token that identifies the host buffer
> + *      Returns the mapping address or an error pointer.
> + * @unmap_region: unmap host buffer from the address
> + *	vdev: the struct virtio_device we're talking about.
> + *	addr: The address where the buffer is mapped
> + *      Returns 0 on success or an error
> + *
> + * TBD, we might need query etc.
> + */
> +struct virtio_device_ops {
> +	void * (*map_region)(struct virtio_device *vdev,
> +			     void *addr,
> +			     size_t length,
> +			     int identifier);
> +	int (*unmap_region)(struct virtio_device *vdev, void *addr);
> +/* we might need query region and other stuff */
> +};
>   

Perhaps something that maps more closely onto the current add_buf/get_buf 
API.  Something like:

struct iovec *(*map_buf)(struct virtqueue *vq, unsigned int *out_num,
                         unsigned int *in_num);
void (*unmap_buf)(struct virtqueue *vq, struct iovec *iov,
                  unsigned int out_num, unsigned int in_num);

There's symmetry here which is good.  The one bad thing about it is that it 
forces certain memory to be read-only and other memory to be read-write.  I 
don't see that as a bad thing though.

I think we'll need an interface like this to support driver domains too, 
since they act as the "backend".  To put it another way, in QEMU, map_buf == 
virtqueue_pop and unmap_buf == virtqueue_push.
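
As a rough sketch of how a consumer would use this (hypothetical code; it 
assumes the current vq_ops indirection, and none of it exists today):

#include <linux/string.h>
#include <linux/uio.h>
#include <linux/virtio.h>

static void example_touch_shared(struct virtqueue *vq)
{
	unsigned int out_num, in_num, i;
	struct iovec *iov;

	/* map whatever buffers the other side currently exposes on this queue */
	iov = vq->vq_ops->map_buf(vq, &out_num, &in_num);
	if (!iov)
		return;

	/* the first out_num entries are read-only for us,
	 * the following in_num entries are read-write */
	for (i = out_num; i < out_num + in_num; i++)
		memset(iov[i].iov_base, 0, iov[i].iov_len);

	vq->vq_ops->unmap_buf(vq, iov, out_num, in_num);
}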

Regards,

Anthony Liguori
Rusty Russell May 20, 2009, 2:58 a.m. UTC | #4
On Wed, 20 May 2009 02:21:08 am Cam Macdonell wrote:
> Avi Kivity wrote:
> > Christian Bornträger wrote:
> >>> To summarize, Anthony thinks it should use virtio, while I believe
> >>> virtio is useful for exporting guest memory, not for importing host
> >>> memory.

Yes, precisely.

But what's it *for*, this shared memory?  Implementing shared memory is 
trivial.  Using it is harder.  For example, inter-guest networking: you'd have 
to copy packets in and out, making it slow as well as losing abstraction.

The only interesting idea I can think of is exposing it to userspace, and 
having that run some protocol across it for fast app <-> app comms.  But if 
that's your plan, you still have a lot of code to write!

So I guess I'm missing the big picture here?

Thanks,
Rusty.

Christian Borntraeger May 20, 2009, 7:33 a.m. UTC | #5
Am Mittwoch 20 Mai 2009 04:58:38 schrieb Rusty Russell:
> On Wed, 20 May 2009 02:21:08 am Cam Macdonell wrote:
> > Avi Kivity wrote:
> > > Christian Bornträger wrote:
> > >>> To summarize, Anthony thinks it should use virtio, while I believe
> > >>> virtio is useful for exporting guest memory, not for importing host
> > >>> memory.
>
> Yes, precisely.
>
> But what's it *for*, this shared memory?  Implementing shared memory is
> trivial.  Using it is harder.  For example, inter-guest networking: you'd
> have to copy packets in and out, making it slow as well as losing
> abstraction.
>
> The only interesting idea I can think of is exposing it to userspace, and
> having that run some protocol across it for fast app <-> app comms.  But if
> that's your plan, you still have a lot of code the write!
>
> So I guess I'm missing the big picture here?

I can give some insights into shared memory usage in z/VM. z/VM uses so-
called discontiguous saved segments (DCSS) to share memory between guests.
	(naming side note:
	o discontiguous because these segments can have holes and different access
      rights, e.g. you can build a DCSS that covers 800M-801M read-only and
      900M-910M exclusive-write.
	o segments because the 2nd level of our page tables is called the segment
      table.
	)

z/VM uses these segments for several purposes:
o The monitoring subsystem uses a DCSS to get data from several components
o shared guest kernels: The CMS operating system is built as a bootable DCSS
  (called a named saved segment, NSS). All guests share the same host pages for
  the read-only parts of the CMS kernel; the local data is stored in
  exclusive-write parts of the same NSS. Linux on System z is also capable of
  using this feature (CONFIG_SHARED_KERNEL). The kernel linkage is changed to
  separate the read-only text segment from the other parts with segment-size
  alignment.
o execute-in-place: This is a Linux feature that exploits the DCSS technology.
  The goal is to share identical guest pages without the additional overhead
  of KSM etc. We have a block device driver for DCSS. This block device driver
  supports the direct_access function and therefore allows the xip option of
  ext2 to be used. The idea is to put binaries into a read-only ext2
  filesystem. Whenever an mmap is done on this filesystem, the page is not
  mapped into the page cache; the PTEs point into the DCSS memory instead.
  Since the DCSS is demand-paged by the host, no memory is wasted for unused
  parts of the binaries. In case of COW the page is copied as usual. It turned
  out that installations with many similar guests (let's say 400 guests)
  profit in terms of memory savings and quicker application startup (not for
  the first guest, of course). There is a downside: this requires a skilled
  administrator to set up. (A rough sketch of the direct_access hook involved
  follows below.)
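
For readers unfamiliar with the mechanism, here is a heavily simplified sketch
of the hook that makes this work, modelled on dcssblk (the names, the
base-address handling and the missing bounds checks are invented for
illustration only):

#include <linux/blkdev.h>
#include <linux/fs.h>
#include <linux/module.h>
#include <asm/io.h>

static void *example_shm_base;	/* virtual start of the mapped DCSS */

/*
 * ext2 mounted with the xip option calls direct_access instead of reading
 * blocks into the page cache; the returned address is mapped straight into
 * the process, so all guests end up referencing the same host pages.
 */
static int example_direct_access(struct block_device *bdev, sector_t secnum,
				 void **kaddr, unsigned long *pfn)
{
	*kaddr = example_shm_base + secnum * 512;
	*pfn = virt_to_phys(*kaddr) >> PAGE_SHIFT;
	return 0;
}

static struct block_device_operations example_shm_devops = {
	.owner		= THIS_MODULE,
	.direct_access	= example_direct_access,
};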

We have also experimented with networking, POSIX shared memory, and shared 
caches via DCSS. Most of these ideas turned out to be either not very useful 
or hard to implement properly.
Christian Borntraeger May 20, 2009, 7:33 a.m. UTC | #6
Am Dienstag 19 Mai 2009 20:39:24 schrieb Anthony Liguori:
> Perhaps something that maps closer to the current add_buf/get_buf API.
> Something like:
>
> struct iovec *(*map_buf)(struct virtqueue *vq, unsigned int *out_num,
> unsigned int *in_num);
> void (*unmap_buf)(struct virtqueue *vq, struct iovec *iov, unsigned int
> out_num, unsigned int in_num);
>
> There's symmetry here which is good.  The one bad thing about it is
> forces certain memory to be read-only and other memory to be
> read-write.  I don't see that as a bad thing though.
>
> I think we'll need an interface like this to support driver domains too,
> since they act as the "backend".  To put it another way, in QEMU, map_buf ==
> virtqueue_pop and unmap_buf == virtqueue_push.


You are proposing that the guest should define some guest memory to be used as 
shared memory (some kind of replacement), right? This is fine, as long as we 
can _also_ map host memory somewhere else (e.g. after guest memory, above 1TB 
etc.). I definitely want to be able to have a 64MB guest map a 2GB shared 
memory zone. (See my other mail about the execute-in-place via DCSS use case.)


I think we should start writing down some requirements; this will help us get 
a better understanding of the necessary interface. Here are my first ideas:

o allow mapping host shared memory at any place that can be addressed via a PFN
	o allow mapping beyond guest storage
	o allow replacing guest memory
o read-only and read/write modes
o the driver interface should not depend on hardware-specific details (e.g. 
prefer generic virtio over PCI)

More ideas are welcome.
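
To make this a bit more concrete, the ops from the patch above might grow into
something like the following (pure sketch, nothing is fixed yet; the flag
names are invented):

/* possible access modes for a mapping */
#define VIRTIO_MAP_RO	0x1	/* map read-only  */
#define VIRTIO_MAP_RW	0x2	/* map read/write */

struct virtio_device_ops {
	/*
	 * Map host buffer 'identifier' so that it starts at guest page frame
	 * 'target_pfn'.  The target may lie beyond the end of guest storage
	 * or replace existing guest memory.
	 */
	void * (*map_region)(struct virtio_device *vdev,
			     unsigned long target_pfn,
			     size_t length,
			     int identifier,
			     unsigned int flags);
	int (*unmap_region)(struct virtio_device *vdev, void *addr);
};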
François Diakhate May 20, 2009, 8:07 a.m. UTC | #7
On Wed, May 20, 2009 at 4:58 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:

> The only interesting idea I can think of is exposing it to userspace, and
> having that run some protocol across it for fast app <-> app comms.  But if
> that's your plan, you still have a lot of code the write!
>
> So I guess I'm missing the big picture here?

Hello Rusty,

For an example, you may have a look at a paper I wrote last year
on achieving fast MPI-like message passing between guests over
shared memory [1].

For my proof-of-concept implementation, I introduced a virtual device that
allows DMA to be performed between guests, something for which virtio is
well suited, but that also shares some memory so small messages can be
transferred efficiently from userspace. To expose this shared memory to
guests, I implemented something quite similar to what Cam is proposing,
which was to expose it as the memory of a PCI device. I think it would be a
useful addition to virtio if it allowed this to be abstracted.

François

[1] http://hal.archives-ouvertes.fr/docs/00/36/86/22/PDF/vhpc08.pdf
Avi Kivity May 20, 2009, 8:45 a.m. UTC | #8
Christian Bornträger wrote:
> o shared guest kernels: The CMS operating system is build as a bootable DCSS
>   (called named-saved-segments NSS). All guests have the same host pages for
>   the read-only parts of the CMS kernel. The local data is stored in
>   exclusive-write parts of the same NSS. Linux on System z is also capable of
>   using this feature (CONFIG_SHARED_KERNEL). The kernel linkage is changed in
>   a way to separate the read-only text segment from the other parts with
>   segment size alignment
>   

How does patching (smp, kprobes/jprobes, markers/ftrace) work with this?

> o execute-in-place: This is a Linux feature to exploit the DCSS technology.
>   The goal is to shared identical guest pages without the additional overhead
>   of KSM etc. We have a block device driver for DCSS. This block device driver
>   supports the direct_access function and therefore allows to use the xip
>   option of ext2. The idea is to put  binaries into an read-only ext2
>   filesystem. Whenever an mmap is made on this file system, the page is not
>   mapped into the page cache. The ptes point into the DCSS memory instead.
>   Since the DCSS is demand-paged by the host no memory is wasted for unused
>   parts of the binaries. In case of COW the page is copied as usual. It turned
>   out that installations with many similar guests (lets say 400 guests) will
>   profit in terms of memory saving and quicker application startups (not the
>   first guest of course). There is a downside: this requires a skilled
>   administrator to setup.

ksm might be easier to admin, at the cost of some cpu time.
Christian Borntraeger May 20, 2009, 9:07 a.m. UTC | #9
Am Mittwoch 20 Mai 2009 10:45:50 schrieb Avi Kivity:
> Christian Bornträger wrote:
> > o shared guest kernels: The CMS operating system is build as a bootable
> > DCSS (called named-saved-segments NSS). All guests have the same host
> > pages for the read-only parts of the CMS kernel. The local data is stored
> > in exclusive-write parts of the same NSS. Linux on System z is also
> > capable of using this feature (CONFIG_SHARED_KERNEL). The kernel linkage
> > is changed in a way to separate the read-only text segment from the other
> > parts with segment size alignment
>
> How does patching (smp, kprobes/jprobes, markers/ftrace) work with this?
It does not. :-) 
Because of that, and since most distro kernels are fully modular and kernel 
updates are another problem, this feature is not used very often for Linux. It 
is used heavily in CMS, though.
Actually, we could do COW in the host, but then it is really not worth the 
effort.

> > o execute-in-place: This is a Linux feature to exploit the DCSS
> > technology. The goal is to shared identical guest pages without the
> > additional overhead of KSM etc. We have a block device driver for DCSS.
> > This block device driver supports the direct_access function and
> > therefore allows to use the xip option of ext2. The idea is to put 
> > binaries into an read-only ext2 filesystem. Whenever an mmap is made on
> > this file system, the page is not mapped into the page cache. The ptes
> > point into the DCSS memory instead. Since the DCSS is demand-paged by the
> > host no memory is wasted for unused parts of the binaries. In case of COW
> > the page is copied as usual. It turned out that installations with many
> > similar guests (lets say 400 guests) will profit in terms of memory
> > saving and quicker application startups (not the first guest of course).
> > There is a downside: this requires a skilled administrator to setup.
>
> ksm might be easier to admin, at the cost of some cpu time.

Yes, KSM is easier and it even finds duplicate data pages.
On the other hand, it only provides memory savings; it does not speed up 
application startup the way execute-in-place does (major page faults become 
minor page faults for text pages if the page is already backed by the host).
I am not claiming that KSM is useless. Depending on the scenario you might 
want one or the other, or even both. For typical desktop use, KSM is very 
likely the better approach.

Avi Kivity May 20, 2009, 9:11 a.m. UTC | #10
Christian Bornträger wrote:
> Am Mittwoch 20 Mai 2009 10:45:50 schrieb Avi Kivity:
>   
>> Christian Bornträger wrote:
>>     
>>> o shared guest kernels: The CMS operating system is build as a bootable
>>> DCSS (called named-saved-segments NSS). All guests have the same host
>>> pages for the read-only parts of the CMS kernel. The local data is stored
>>> in exclusive-write parts of the same NSS. Linux on System z is also
>>> capable of using this feature (CONFIG_SHARED_KERNEL). The kernel linkage
>>> is changed in a way to separate the read-only text segment from the other
>>> parts with segment size alignment
>>>       
>> How does patching (smp, kprobes/jprobes, markers/ftrace) work with this?
>>     
> It does not. :-) 
> Because of that and since most distro kernels are fully modular and kernel 
> updates are another problem this feature is not used very often for Linux. It 
> is used heavily in CMS, though.
> Actually, we could do COW in the host but then it is really not worth the 
> effort.
>   

ksm on low throttle would solve all of those problems.

> Yes, KSM is easier and it even finds duplicate data pages.
> On the other hand it does only provide memory saving. It does not speedup 
> application startup like execute-in-place (major page faults become minor page 
> faults for text pages if the page is already backed by the host)
> I am not claiming that KSM is useless. Depending on the scenario you might 
> want the one or the other or even both. For typical desktop use, KSM is very 
> likely the better approach

If ksm shares pagecache, then doesn't it become effectively XIP?

We could also hook virtio dma to preemptively share pages somehow.
Christian Bornträger May 20, 2009, 9:20 a.m. UTC | #11
Am Mittwoch 20 Mai 2009 11:11:57 schrieb Avi Kivity:
> > Yes, KSM is easier and it even finds duplicate data pages.
> > On the other hand it does only provide memory saving. It does not speedup
> > application startup like execute-in-place (major page faults become minor
> > page faults for text pages if the page is already backed by the host) I
> > am not claiming that KSM is useless. Depending on the scenario you might
> > want the one or the other or even both. For typical desktop use, KSM is
> > very likely the better approach
>
> If ksm shares pagecache, then doesn't it become effectively XIP?

Not exactly; only for long-running guests with a stable working set. When the 
guest boots up, its page cache is basically empty, but the shared segment is 
populated; it is at startup where XIP wins. The same is true for guests with 
quickly changing working sets.

> We could also hook virtio dma to preemptively share pages somehow.

Yes, that is something to think about. One idea that is used on z/VM by a lot 
of customers is to have a read-only shared disk for /usr that is cached by the 
host.
Anthony Liguori May 20, 2009, 1:26 p.m. UTC | #12
Christian Bornträger wrote:
> Am Dienstag 19 Mai 2009 20:39:24 schrieb Anthony Liguori:
>   
>> Perhaps something that maps closer to the current add_buf/get_buf API.
>> Something like:
>>
>> struct iovec *(*map_buf)(struct virtqueue *vq, unsigned int *out_num,
>> unsigned int *in_num);
>> void (*unmap_buf)(struct virtqueue *vq, struct iovec *iov, unsigned int
>> out_num, unsigned int in_num);
>>
>> There's symmetry here which is good.  The one bad thing about it is
>> forces certain memory to be read-only and other memory to be
>> read-write.  I don't see that as a bad thing though.
>>
>> I think we'll need an interface like this to support driver domains too,
>> since they act as the "backend".  To put it another way, in QEMU, map_buf ==
>> virtqueue_pop and unmap_buf == virtqueue_push.
>>     
>
>
> You are proposing that the guest should define some guest memory to be used as 
> shared memory (some kind of replacement), right?

No.  map_buf() returns a mapped region of memory.  Where that memory 
comes from is up to the transport.  It can be the result of an ioremap 
of a PCI BAR.

The model of virtio frontends today is:

 o add buffer of guest's memory
 o let backend do something with it
 o get back buffer of guest's memory

The backend model (as implemented by QEMU) is:

 o get buffer of mapped front-end memory
 o do something with memory
 o give buffer back

For implementing persistent shared memory, you need a vring with enough 
elements to hold all of the shared memory regions at one time.  This 
becomes more practical with indirect scatter/gather entries.

Of course, whether vring is used at all is a transport detail.
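
A sketch of what that persistent posting could look like on the guest side 
(hypothetical helper; it assumes the current vq_ops API and simply never 
completes the buffers):

#include <linux/mm.h>
#include <linux/scatterlist.h>
#include <linux/virtio.h>

static int example_post_shared_pages(struct virtqueue *vq,
				     struct page **pages, unsigned int npages)
{
	struct scatterlist sg[1];
	unsigned int i;
	int err;

	for (i = 0; i < npages; i++) {
		sg_init_table(sg, 1);
		sg_set_page(&sg[0], pages[i], PAGE_SIZE, 0);
		/* in_num = 1: the backend may write this page; it stays
		 * posted for the lifetime of the shared region */
		err = vq->vq_ops->add_buf(vq, sg, 0, 1, pages[i]);
		if (err < 0)
			return err;
	}
	vq->vq_ops->kick(vq);
	return 0;
}

With indirect entries, a single descriptor could reference a whole scatterlist 
of pages instead of burning one ring slot per page.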

Regards,

Anthony Liguori
Rusty Russell May 25, 2009, 6:18 a.m. UTC | #13
On Wed, 20 May 2009 05:03:01 pm Christian Bornträger wrote:
> Am Mittwoch 20 Mai 2009 04:58:38 schrieb Rusty Russell:
> > But what's it *for*, this shared memory? 
...
> z/VM uses these segments for several purposes:
> o The monitoring subsystem uses a DCSS to get data from several components

In KVM this probably doesn't require inter-guest access; presumably monitoring 
is done on the host.

> o shared guest kernels: The CMS operating system is build as a bootable
> DCSS (called named-saved-segments NSS). All guests have the same host pages
> for the read-only parts of the CMS kernel. The local data is stored in
> exclusive-write parts of the same NSS. Linux on System z is also capable of
> using this feature (CONFIG_SHARED_KERNEL). The kernel linkage is changed in
> a way to separate the read-only text segment from the other parts with
> segment size alignment

This is unlikely for x86 at least, and as you point out, not good for 
distributions either.

> o execute-in-place: This is a Linux feature to exploit the DCSS technology.
>   The goal is to shared identical guest pages without the additional
> overhead of KSM etc. We have a block device driver for DCSS. This block
> device driver supports the direct_access function and therefore allows to
> use the xip option of ext2. The idea is to put  binaries into an read-only
> ext2 filesystem. Whenever an mmap is made on this file system, the page is
> not mapped into the page cache. The ptes point into the DCSS memory
> instead. Since the DCSS is demand-paged by the host no memory is wasted for
> unused parts of the binaries. In case of COW the page is copied as usual.
> It turned out that installations with many similar guests (lets say 400
> guests) will profit in terms of memory saving and quicker application
> startups (not the first guest of course). There is a downside: this
> requires a skilled administrator to setup.

We're better off doing opportunistic KSM in virtio_blk I'd say.  Anyway, it's 
not really "inter-guest" in this sense; the host controls it, though it lets 
multiple guests read from it.

> We have also experimented with network, Posix shared memory, and shared
> caches via DCSS. Most of these ideas turned out to be not very useful or
> hard to implement proper.

Indeed, and this is what I suspect these patches are aiming for...

Thanks,
Rusty.

Patch

Index: linux-2.6/include/linux/virtio.h
===================================================================
--- linux-2.6.orig/include/linux/virtio.h
+++ linux-2.6/include/linux/virtio.h
@@ -71,6 +71,31 @@  struct virtqueue_ops {
 };
 
 /**
+ * virtio_device_ops - operations for virtio devices
+ * @map_region: map host buffer at a given address
+ *	vdev: the struct virtio_device we're talking about.
+ *	addr: The address where the buffer should be mapped (hint only)
+ *	length: The length of the mapping
+ *	identifier: the token that identifies the host buffer
+ *      Returns the mapping address or an error pointer.
+ * @unmap_region: unmap host buffer from the address
+ *	vdev: the struct virtio_device we're talking about.
+ *	addr: The address where the buffer is mapped
+ *      Returns 0 on success or an error
+ *
+ * TBD, we might need query etc.
+ */
+struct virtio_device_ops {
+	void * (*map_region)(struct virtio_device *vdev,
+			     void *addr,
+			     size_t length,
+			     int identifier);
+	int (*unmap_region)(struct virtio_device *vdev, void *addr);
+/* we might need query region and other stuff */
+};
+
+
+/**
  * virtio_device - representation of a device using virtio
  * @index: unique position on the virtio bus
  * @dev: underlying device.
@@ -85,6 +110,7 @@  struct virtio_device
 	struct device dev;
 	struct virtio_device_id id;
 	struct virtio_config_ops *config;
+	struct virtio_device_ops *ops;
 	/* Note that this is a Linux set_bit-style bitmap. */
 	unsigned long features[1];
 	void *priv;