diff mbox series

drm/virtio: not pin pages on demand

Message ID 20211021074445.452309-1-maksym.wezdecki@collabora.com (mailing list archive)
State New, archived
Headers show
Series drm/virtio: not pin pages on demand | expand

Commit Message

Maksym Wezdecki Oct. 21, 2021, 7:44 a.m. UTC
From: mwezdeck <maksym.wezdecki@collabora.co.uk>

The idea behind the commit:
  1. when resource is created, let user space decide
     if resource should be pinned or not
  2. transfer_*_host needs pinned memory. If it is not
     pinned, then pin it.
  3. during execbuffer, decide which bo handles should
     be pinned based on the list provided by user space

This change has no impact on user space.
If user space driver would like not to pin pages,
for example, for textures only, it can notify guest
kernel about that. Then it can use staging buffer for
texture uploads from guest to host. Staging buffer is always
pinned.

Signed-off-by: mwezdeck <maksym.wezdecki@collabora.co.uk>

Comments

Gerd Hoffmann Oct. 21, 2021, 9:25 a.m. UTC | #1
On Thu, Oct 21, 2021 at 09:44:45AM +0200, Maksym Wezdecki wrote:
> From: mwezdeck <maksym.wezdecki@collabora.co.uk>
> 
> The idea behind the commit:
>   1. when resource is created, let user space decide
>      if resource should be pinned or not
>   2. transfer_*_host needs pinned memory. If it is not
>      pinned, then pin it.
>   3. during execbuffer, decide which bo handles should
>      be pinned based on the list provided by user space

When you never need cpu access to your gpu object there is
no need to create a resource in the first place.

take care,
  Gerd
Maksym Wezdecki Oct. 21, 2021, 9:55 a.m. UTC | #2
I get your point. However, we need to make resource_create ioctl,
in order to create corresponding resource on the host.

The concept is:
App           |    Gallium       |       Guest kernel                   What is happening?
init()              ...                    ...           
glTexImage2d:                                                           [PIPE_DISCARD_WHOLE_RESOURCE]
             -> resource_create() -> DRM_IOCTL_VIRTGPU_RESOURCE_CREATE 
                                   -> virtio_gpu_object_create():
                                   -> drm_gem_shmem_create()            [allocation of bo#1]
                                   -> virtio_gpu_smd_resource_create_3d [sending cmd to crosvm/qemu
                                                                         to create GL object]
             -> texture_map()                                           [staging buffer is selected]
             -> memcpy()                                                [memcpy from user's mem to staging buffer]
             -> texture_unmap()                                         [VIRGL_CCMD_COPY_TRANSFER3D with staging buffer]

Selecting staging buffer for texture uploads from guest to host
and not pinning resources in DRM_IOCTL_VIRTGPU_RESOURCE_CREATE can
save a lot of RAM. With previous approach we always create resource,
we upload from guest to host and we never unpin pages, which causes
high RAM usage by the guest. With proposed approach, we create resource,
we decide that for textures we won't pin pages, we select staging
buffer (which is recycled then) for uploads. This causes lower memory
consumption.

With best regards,
Maksym 

On 10/21/21 11:25 AM, Gerd Hoffmann wrote:
> On Thu, Oct 21, 2021 at 09:44:45AM +0200, Maksym Wezdecki wrote:
>> From: mwezdeck <maksym.wezdecki@collabora.co.uk>
>>
>> The idea behind the commit:
>>   1. when resource is created, let user space decide
>>      if resource should be pinned or not
>>   2. transfer_*_host needs pinned memory. If it is not
>>      pinned, then pin it.
>>   3. during execbuffer, decide which bo handles should
>>      be pinned based on the list provided by user space
> When you never need cpu access to your gpu object there is
> no need to create a resource in the first place.
>
> take care,
>   Gerd
>
Gerd Hoffmann Oct. 21, 2021, 11:52 a.m. UTC | #3
On Thu, Oct 21, 2021 at 11:55:47AM +0200, Maksym Wezdecki wrote:
> I get your point. However, we need to make resource_create ioctl,
> in order to create corresponding resource on the host.

That used to be the case but isn't true any more with the new
blob resources.  virglrenderer allows to create gpu objects
via execbuffer.  Those gpu objects can be linked to a
virtio-gpu resources, but it's optional.  In your case you
would do that only for your staging buffer.  The other textures
(which you fill with a host-side copy from the staging buffer)
do not need a virtio-gpu resource in the first place.

take care,
  Gerd
Chia-I Wu Oct. 21, 2021, 4:42 p.m. UTC | #4
On Thu, Oct 21, 2021 at 4:52 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
>
> On Thu, Oct 21, 2021 at 11:55:47AM +0200, Maksym Wezdecki wrote:
> > I get your point. However, we need to make resource_create ioctl,
> > in order to create corresponding resource on the host.
>
> That used to be the case but isn't true any more with the new
> blob resources.  virglrenderer allows to create gpu objects
> via execbuffer.  Those gpu objects can be linked to a
> virtio-gpu resources, but it's optional.  In your case you
> would do that only for your staging buffer.  The other textures
> (which you fill with a host-side copy from the staging buffer)
> do not need a virtio-gpu resource in the first place.
That's however a breaking change to the virgl protocol.  All virgl
commands expect res ids rather than blob ids.

Using VIRTGPU_BLOB_MEM_HOST3D could in theory work.  But there are a
few occasions where virglrenderer expects there to be guest storage.
There are also readbacks that we need to support.  We might end up
using VIRTGPU_BLOB_MEM_HOST3D_GUEST, where pin-on-demand is still
desirable.

For this patch, I think the uapi change can be simplified.  It can be
a new param plus a new field in drm_virtgpu_execbuffer

  __u64 bo_flags; /* pointer to an array of size num_bo_handles, or NULL */

The other changes do not seem needed.

>
> take care,
>   Gerd
>
Maksym Wezdecki Oct. 22, 2021, 8:40 a.m. UTC | #5
On 10/21/21 6:42 PM, Chia-I Wu wrote:
> On Thu, Oct 21, 2021 at 4:52 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
>> On Thu, Oct 21, 2021 at 11:55:47AM +0200, Maksym Wezdecki wrote:
>>> I get your point. However, we need to make resource_create ioctl,
>>> in order to create corresponding resource on the host.
>> That used to be the case but isn't true any more with the new
>> blob resources.  virglrenderer allows to create gpu objects
>> via execbuffer.  Those gpu objects can be linked to a
>> virtio-gpu resources, but it's optional.  In your case you
>> would do that only for your staging buffer.  The other textures
>> (which you fill with a host-side copy from the staging buffer)
>> do not need a virtio-gpu resource in the first place.
> That's however a breaking change to the virgl protocol.  All virgl
> commands expect res ids rather than blob ids.
>
> Using VIRTGPU_BLOB_MEM_HOST3D could in theory work.  But there are a
> few occasions where virglrenderer expects there to be guest storage.
> There are also readbacks that we need to support.  We might end up
> using VIRTGPU_BLOB_MEM_HOST3D_GUEST, where pin-on-demand is still
> desirable.
>
> For this patch, I think the uapi change can be simplified.  It can be
> a new param plus a new field in drm_virtgpu_execbuffer
>
>   __u64 bo_flags; /* pointer to an array of size num_bo_handles, or NULL */
>
> The other changes do not seem needed.

My first approach was the same, as you mentioned. However, having "__u64 bo_flags"
needs a for loop, where only few of the bo-s needs to be pinned - this has
performance implications. I would rather pass bo handles that should be pinned than
having a lot of flags, where only 1-2 bos needs to be pinned.

>
>> take care,
>>   Gerd
>>
Maksym Wezdecki Oct. 22, 2021, 8:44 a.m. UTC | #6
Once again with all lists and receivers. I'm sorry for that.

On 10/21/21 6:42 PM, Chia-I Wu wrote:
> On Thu, Oct 21, 2021 at 4:52 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
>> On Thu, Oct 21, 2021 at 11:55:47AM +0200, Maksym Wezdecki wrote:
>>> I get your point. However, we need to make resource_create ioctl,
>>> in order to create corresponding resource on the host.
>> That used to be the case but isn't true any more with the new
>> blob resources.  virglrenderer allows to create gpu objects
>> via execbuffer.  Those gpu objects can be linked to a
>> virtio-gpu resources, but it's optional.  In your case you
>> would do that only for your staging buffer.  The other textures
>> (which you fill with a host-side copy from the staging buffer)
>> do not need a virtio-gpu resource in the first place.
> That's however a breaking change to the virgl protocol.  All virgl
> commands expect res ids rather than blob ids.
>
> Using VIRTGPU_BLOB_MEM_HOST3D could in theory work.  But there are a
> few occasions where virglrenderer expects there to be guest storage.
> There are also readbacks that we need to support.  We might end up
> using VIRTGPU_BLOB_MEM_HOST3D_GUEST, where pin-on-demand is still
> desirable.
>
> For this patch, I think the uapi change can be simplified.  It can be
> a new param plus a new field in drm_virtgpu_execbuffer
>
>   __u64 bo_flags; /* pointer to an array of size num_bo_handles, or NULL */
>
> The other changes do not seem needed.

My first approach was the same, as you mentioned. However, having "__u64 bo_flags"
needs a for loop, where only few of the bo-s needs to be pinned - this has
performance implications. I would rather pass bo handles that should be pinned than
having a lot of flags, where only 1-2 bos needs to be pinned.

>
>> take care,
>>   Gerd
>>
Maksym Wezdecki Oct. 27, 2021, 9:34 a.m. UTC | #7
Gerd,

Can we follow up on this issue?

The main pain point with your suggestion is the fact,
that it will cause VirGL protocol breakage and we would
like to avoid this.

Extending execbuffer ioctl and create_resource ioctl is
more convenient than having the protocol broken.

Blob resources is not a solution, too, because QEMU doesn't
support blob resources right now.

In ideal solution when both QEMU and crosvm support blob resources
we can switch to blob resources for textures.

I would like to introduce changes gradually and not make a protocol
breakage.

What do you think about that?

Maksym


On 10/22/21 10:44 AM, Maksym Wezdecki wrote:

> Once again with all lists and receivers. I'm sorry for that.
>
> On 10/21/21 6:42 PM, Chia-I Wu wrote:
>> On Thu, Oct 21, 2021 at 4:52 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
>>> On Thu, Oct 21, 2021 at 11:55:47AM +0200, Maksym Wezdecki wrote:
>>>> I get your point. However, we need to make resource_create ioctl,
>>>> in order to create corresponding resource on the host.
>>> That used to be the case but isn't true any more with the new
>>> blob resources.  virglrenderer allows to create gpu objects
>>> via execbuffer.  Those gpu objects can be linked to a
>>> virtio-gpu resources, but it's optional.  In your case you
>>> would do that only for your staging buffer.  The other textures
>>> (which you fill with a host-side copy from the staging buffer)
>>> do not need a virtio-gpu resource in the first place.
>> That's however a breaking change to the virgl protocol.  All virgl
>> commands expect res ids rather than blob ids.
>>
>> Using VIRTGPU_BLOB_MEM_HOST3D could in theory work.  But there are a
>> few occasions where virglrenderer expects there to be guest storage.
>> There are also readbacks that we need to support.  We might end up
>> using VIRTGPU_BLOB_MEM_HOST3D_GUEST, where pin-on-demand is still
>> desirable.
>>
>> For this patch, I think the uapi change can be simplified.  It can be
>> a new param plus a new field in drm_virtgpu_execbuffer
>>
>>   __u64 bo_flags; /* pointer to an array of size num_bo_handles, or NULL */
>>
>> The other changes do not seem needed.
> My first approach was the same, as you mentioned. However, having "__u64 bo_flags"
> needs a for loop, where only few of the bo-s needs to be pinned - this has
> performance implications. I would rather pass bo handles that should be pinned than
> having a lot of flags, where only 1-2 bos needs to be pinned.
>
>>> take care,
>>>   Gerd
>>>
Gerd Hoffmann Oct. 27, 2021, 11:11 a.m. UTC | #8
[ Cc'ing Gurchetan Singh ]

> Can we follow up on this issue?
> 
> The main pain point with your suggestion is the fact,
> that it will cause VirGL protocol breakage and we would
> like to avoid this.
> 
> Extending execbuffer ioctl and create_resource ioctl is
> more convenient than having the protocol broken.

Do you know at resource creation time whenever you need cpu access
or not?  IOW can we make that a resource property, so we don't have
pass around lists of objects on each and every execbuffer ioctl?

> Blob resources is not a solution, too, because QEMU doesn't
> support blob resources right now.
> 
> In ideal solution when both QEMU and crosvm support blob resources
> we can switch to blob resources for textures.

That'll only happen if someone works on it.
I will not be able to do that though.

> I would like to introduce changes gradually and not make a protocol
> breakage.

Well, opengl switching to blob resources is a protocol change anyway.
That doesn't imply things will break though.  We have capsets etc to
extend the protocol while maintaining backward compatibility.

> What do you think about that?

I still think that switching to blob resources would be the better
solution.  Yes, it's alot of work and not something which helps
short-term.  But adding a new API as interim solution isn't great
either.  So, not sure.  Chia-I Wu?  Gurchetan Singh?


While being at it:  Chia-I Wu & Gurchetan Singh, could you help
reviewing virtio-gpu kernel patches?  I think you both have a better
view on the big picture (with both mesa and virglrenderer) than I have.
Also for the crosvm side of things.  The procedure for that would be to
send a patch adding yourself to the virtio-gpu section of the
MAINTAINERS file, so scripts/get_maintainer.pl will Cc you on patches
submitted.

thanks,
  Gerd

> 
> Maksym
> 
> 
> On 10/22/21 10:44 AM, Maksym Wezdecki wrote:
> 
> > Once again with all lists and receivers. I'm sorry for that.
> >
> > On 10/21/21 6:42 PM, Chia-I Wu wrote:
> >> On Thu, Oct 21, 2021 at 4:52 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> >>> On Thu, Oct 21, 2021 at 11:55:47AM +0200, Maksym Wezdecki wrote:
> >>>> I get your point. However, we need to make resource_create ioctl,
> >>>> in order to create corresponding resource on the host.
> >>> That used to be the case but isn't true any more with the new
> >>> blob resources.  virglrenderer allows to create gpu objects
> >>> via execbuffer.  Those gpu objects can be linked to a
> >>> virtio-gpu resources, but it's optional.  In your case you
> >>> would do that only for your staging buffer.  The other textures
> >>> (which you fill with a host-side copy from the staging buffer)
> >>> do not need a virtio-gpu resource in the first place.
> >> That's however a breaking change to the virgl protocol.  All virgl
> >> commands expect res ids rather than blob ids.
> >>
> >> Using VIRTGPU_BLOB_MEM_HOST3D could in theory work.  But there are a
> >> few occasions where virglrenderer expects there to be guest storage.
> >> There are also readbacks that we need to support.  We might end up
> >> using VIRTGPU_BLOB_MEM_HOST3D_GUEST, where pin-on-demand is still
> >> desirable.
> >>
> >> For this patch, I think the uapi change can be simplified.  It can be
> >> a new param plus a new field in drm_virtgpu_execbuffer
> >>
> >>   __u64 bo_flags; /* pointer to an array of size num_bo_handles, or NULL */
> >>
> >> The other changes do not seem needed.
> > My first approach was the same, as you mentioned. However, having "__u64 bo_flags"
> > needs a for loop, where only few of the bo-s needs to be pinned - this has
> > performance implications. I would rather pass bo handles that should be pinned than
> > having a lot of flags, where only 1-2 bos needs to be pinned.
> >
> >>> take care,
> >>>   Gerd
> >>>

--
Chia-I Wu Oct. 28, 2021, 8:27 p.m. UTC | #9
On Wed, Oct 27, 2021 at 4:12 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
>
> [ Cc'ing Gurchetan Singh ]
>
> > Can we follow up on this issue?
> >
> > The main pain point with your suggestion is the fact,
> > that it will cause VirGL protocol breakage and we would
> > like to avoid this.
> >
> > Extending execbuffer ioctl and create_resource ioctl is
> > more convenient than having the protocol broken.
>
> Do you know at resource creation time whenever you need cpu access
> or not?  IOW can we make that a resource property, so we don't have
> pass around lists of objects on each and every execbuffer ioctl?
Yes.  The userspace driver should be able to internally mark the
resource as NO TRANSFER and omit it from execbuffer.  It can be tricky
though because resource wait will no longer work.


>
> > Blob resources is not a solution, too, because QEMU doesn't
> > support blob resources right now.
> >
> > In ideal solution when both QEMU and crosvm support blob resources
> > we can switch to blob resources for textures.
>
> That'll only happen if someone works on it.
> I will not be able to do that though.
>
> > I would like to introduce changes gradually and not make a protocol
> > breakage.
>
> Well, opengl switching to blob resources is a protocol change anyway.
> That doesn't imply things will break though.  We have capsets etc to
> extend the protocol while maintaining backward compatibility.
>
> > What do you think about that?
>
> I still think that switching to blob resources would be the better
> solution.  Yes, it's alot of work and not something which helps
> short-term.  But adding a new API as interim solution isn't great
> either.  So, not sure.  Chia-I Wu?  Gurchetan Singh?
I can agree with that, although it depends on how much work needs to
happen in the userspace for virgl.

We will look into a userspace-only solution, or at least something not
involving uAPI additions.

>
>
> While being at it:  Chia-I Wu & Gurchetan Singh, could you help
> reviewing virtio-gpu kernel patches?  I think you both have a better
> view on the big picture (with both mesa and virglrenderer) than I have.
> Also for the crosvm side of things.  The procedure for that would be to
> send a patch adding yourself to the virtio-gpu section of the
> MAINTAINERS file, so scripts/get_maintainer.pl will Cc you on patches
> submitted.
Sure.  Will do.
>
> thanks,
>   Gerd
>
> >
> > Maksym
> >
> >
> > On 10/22/21 10:44 AM, Maksym Wezdecki wrote:
> >
> > > Once again with all lists and receivers. I'm sorry for that.
> > >
> > > On 10/21/21 6:42 PM, Chia-I Wu wrote:
> > >> On Thu, Oct 21, 2021 at 4:52 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > >>> On Thu, Oct 21, 2021 at 11:55:47AM +0200, Maksym Wezdecki wrote:
> > >>>> I get your point. However, we need to make resource_create ioctl,
> > >>>> in order to create corresponding resource on the host.
> > >>> That used to be the case but isn't true any more with the new
> > >>> blob resources.  virglrenderer allows to create gpu objects
> > >>> via execbuffer.  Those gpu objects can be linked to a
> > >>> virtio-gpu resources, but it's optional.  In your case you
> > >>> would do that only for your staging buffer.  The other textures
> > >>> (which you fill with a host-side copy from the staging buffer)
> > >>> do not need a virtio-gpu resource in the first place.
> > >> That's however a breaking change to the virgl protocol.  All virgl
> > >> commands expect res ids rather than blob ids.
> > >>
> > >> Using VIRTGPU_BLOB_MEM_HOST3D could in theory work.  But there are a
> > >> few occasions where virglrenderer expects there to be guest storage.
> > >> There are also readbacks that we need to support.  We might end up
> > >> using VIRTGPU_BLOB_MEM_HOST3D_GUEST, where pin-on-demand is still
> > >> desirable.
> > >>
> > >> For this patch, I think the uapi change can be simplified.  It can be
> > >> a new param plus a new field in drm_virtgpu_execbuffer
> > >>
> > >>   __u64 bo_flags; /* pointer to an array of size num_bo_handles, or NULL */
> > >>
> > >> The other changes do not seem needed.
> > > My first approach was the same, as you mentioned. However, having "__u64 bo_flags"
> > > needs a for loop, where only few of the bo-s needs to be pinned - this has
> > > performance implications. I would rather pass bo handles that should be pinned than
> > > having a lot of flags, where only 1-2 bos needs to be pinned.
> > >
> > >>> take care,
> > >>>   Gerd
> > >>>
>
> --
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index d4e610a44e12..f429b0f48243 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -434,6 +434,10 @@  int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
 			     struct virtio_gpu_object **bo_ptr,
 			     struct virtio_gpu_fence *fence);
 
+int virtio_gpu_object_pin(struct virtio_gpu_device *vgdev,
+			  struct virtio_gpu_object_array *objs,
+			  int num_gem_objects);
+
 bool virtio_gpu_is_shmem(struct virtio_gpu_object *bo);
 
 int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index 5c1ad1596889..c468f71b47d7 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -83,8 +83,11 @@  static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
 	struct virtio_gpu_fence *out_fence;
 	int ret;
 	uint32_t *bo_handles = NULL;
+	uint32_t *accessed_bo_handles = NULL;
 	void __user *user_bo_handles = NULL;
+	void __user *user_accessed_bo_handles = NULL;
 	struct virtio_gpu_object_array *buflist = NULL;
+	struct virtio_gpu_object_array *acc_buflist = NULL;
 	struct sync_file *sync_file;
 	int in_fence_fd = exbuf->fence_fd;
 	int out_fence_fd = -1;
@@ -149,6 +152,44 @@  static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
 		}
 		kvfree(bo_handles);
 		bo_handles = NULL;
+
+		if (exbuf->flags & VIRTGPU_EXECBUF_ACC_BO_PRESENT &&
+		    exbuf->num_acc_bo_handles != 0) {
+			accessed_bo_handles =
+				kvmalloc_array(exbuf->num_acc_bo_handles,
+					       sizeof(uint32_t), GFP_KERNEL);
+			if (!accessed_bo_handles) {
+				ret = -ENOMEM;
+				goto out_unused_fd;
+			}
+
+			user_accessed_bo_handles =
+				u64_to_user_ptr(exbuf->accessed_bo_handles);
+			if (copy_from_user(accessed_bo_handles,
+					   user_accessed_bo_handles,
+					   exbuf->num_acc_bo_handles *
+						   sizeof(uint32_t))) {
+				ret = -EFAULT;
+				goto out_unused_fd;
+			}
+
+			acc_buflist = virtio_gpu_array_from_handles(
+				file, accessed_bo_handles,
+				exbuf->num_acc_bo_handles);
+			if (!buflist) {
+				ret = -ENOENT;
+				goto out_unused_fd;
+			}
+
+			ret = virtio_gpu_object_pin(vgdev, acc_buflist,
+						    exbuf->num_acc_bo_handles);
+			if (ret < 0) {
+				goto out_unused_fd;
+			}
+
+			kvfree(accessed_bo_handles);
+			accessed_bo_handles = NULL;
+		}
 	}
 
 	buf = vmemdup_user(u64_to_user_ptr(exbuf->command), exbuf->size);
@@ -226,6 +267,9 @@  static int virtio_gpu_getparam_ioctl(struct drm_device *dev, void *data,
 	case VIRTGPU_PARAM_CROSS_DEVICE:
 		value = vgdev->has_resource_assign_uuid ? 1 : 0;
 		break;
+	case VIRTGPU_PARAM_PIN_ON_DEMAND:
+		value = 1;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -331,6 +375,7 @@  static int virtio_gpu_transfer_from_host_ioctl(struct drm_device *dev,
 	struct virtio_gpu_object *bo;
 	struct virtio_gpu_object_array *objs;
 	struct virtio_gpu_fence *fence;
+	struct virtio_gpu_object_shmem *shmem;
 	int ret;
 	u32 offset = args->offset;
 
@@ -348,6 +393,11 @@  static int virtio_gpu_transfer_from_host_ioctl(struct drm_device *dev,
 		goto err_put_free;
 	}
 
+	shmem = to_virtio_gpu_shmem(bo);
+	if (!shmem->pages) {
+		virtio_gpu_object_pin(vgdev, objs, 1);
+	}
+
 	if (!bo->host3d_blob && (args->stride || args->layer_stride)) {
 		ret = -EINVAL;
 		goto err_put_free;
@@ -385,6 +435,7 @@  static int virtio_gpu_transfer_to_host_ioctl(struct drm_device *dev, void *data,
 	struct drm_virtgpu_3d_transfer_to_host *args = data;
 	struct virtio_gpu_object *bo;
 	struct virtio_gpu_object_array *objs;
+	struct virtio_gpu_object_shmem *shmem;
 	struct virtio_gpu_fence *fence;
 	int ret;
 	u32 offset = args->offset;
@@ -399,6 +450,11 @@  static int virtio_gpu_transfer_to_host_ioctl(struct drm_device *dev, void *data,
 		goto err_put_free;
 	}
 
+	shmem = to_virtio_gpu_shmem(bo);
+	if (!shmem->pages) {
+		virtio_gpu_object_pin(vgdev, objs, 1);
+	}
+
 	if (!vgdev->has_virgl_3d) {
 		virtio_gpu_cmd_transfer_to_host_2d
 			(vgdev, offset,
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c
index f648b0e24447..ff2e891d7973 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -80,9 +80,9 @@  void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo)
 			kfree(shmem->pages);
 			shmem->pages = NULL;
 			drm_gem_shmem_unpin(&bo->base.base);
+			drm_gem_shmem_free_object(&bo->base.base);
 		}
 
-		drm_gem_shmem_free_object(&bo->base.base);
 	} else if (virtio_gpu_is_vram(bo)) {
 		struct virtio_gpu_object_vram *vram = to_virtio_gpu_vram(bo);
 
@@ -219,6 +219,7 @@  int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
 	struct virtio_gpu_mem_entry *ents;
 	unsigned int nents;
 	int ret;
+	uint32_t backup_flags = params->flags;
 
 	*bo_ptr = NULL;
 
@@ -246,11 +247,16 @@  int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
 			goto err_put_objs;
 	}
 
-	ret = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
-	if (ret != 0) {
-		virtio_gpu_array_put_free(objs);
-		virtio_gpu_free_object(&shmem_obj->base);
-		return ret;
+	// turn off these bits, as renderer doesn't support such bits
+	params->flags &= ~(VIRTGPU_NOT_PIN_FLAG);
+
+	if (!(backup_flags & VIRTGPU_NOT_PIN_FLAG)) {
+		ret = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
+		if (ret != 0) {
+			virtio_gpu_array_put_free(objs);
+			virtio_gpu_free_object(&shmem_obj->base);
+			return ret;
+		}
 	}
 
 	if (params->blob) {
@@ -262,11 +268,15 @@  int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
 	} else if (params->virgl) {
 		virtio_gpu_cmd_resource_create_3d(vgdev, bo, params,
 						  objs, fence);
-		virtio_gpu_object_attach(vgdev, bo, ents, nents);
+		if (!(backup_flags & VIRTGPU_NOT_PIN_FLAG)) {
+			virtio_gpu_object_attach(vgdev, bo, ents, nents);
+		}
 	} else {
 		virtio_gpu_cmd_create_resource(vgdev, bo, params,
 					       objs, fence);
-		virtio_gpu_object_attach(vgdev, bo, ents, nents);
+		if (!(backup_flags & VIRTGPU_NOT_PIN_FLAG)) {
+			virtio_gpu_object_attach(vgdev, bo, ents, nents);
+		}
 	}
 
 	*bo_ptr = bo;
@@ -280,3 +290,29 @@  int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
 	drm_gem_shmem_free_object(&shmem_obj->base);
 	return ret;
 }
+
+int virtio_gpu_object_pin(struct virtio_gpu_device *vgdev,
+			  struct virtio_gpu_object_array *objs,
+			  int num_gem_objects)
+{
+	int i, ret;
+
+	for (i = 0; i < num_gem_objects; i++) {
+		struct virtio_gpu_mem_entry *ents;
+		unsigned int nents;
+
+		struct virtio_gpu_object *bo =
+			gem_to_virtio_gpu_obj(objs->objs[i]);
+		if (!bo) {
+			return -EFAULT;
+		}
+
+		ret = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
+		if (ret != 0) {
+			return -EFAULT;
+		}
+
+		virtio_gpu_object_attach(vgdev, bo, ents, nents);
+	}
+	return 0;
+}
diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h
index b9ec26e9c646..6ae9af4aadea 100644
--- a/include/uapi/drm/virtgpu_drm.h
+++ b/include/uapi/drm/virtgpu_drm.h
@@ -50,10 +50,17 @@  extern "C" {
 
 #define VIRTGPU_EXECBUF_FENCE_FD_IN	0x01
 #define VIRTGPU_EXECBUF_FENCE_FD_OUT	0x02
-#define VIRTGPU_EXECBUF_FLAGS  (\
-		VIRTGPU_EXECBUF_FENCE_FD_IN |\
-		VIRTGPU_EXECBUF_FENCE_FD_OUT |\
-		0)
+#define VIRTGPU_EXECBUF_ACC_BO_PRESENT 0x04
+#define VIRTGPU_EXECBUF_FLAGS                                                  \
+	(VIRTGPU_EXECBUF_FENCE_FD_IN | VIRTGPU_EXECBUF_FENCE_FD_OUT |          \
+	 VIRTGPU_EXECBUF_ACC_BO_PRESENT | 0)
+
+/* it is used in resource_create_ioctl as resource
+ * flag.
+ * First 8 bits of uint32_t and 24th bit 
+ * are reserved for user space driver.
+ */
+#define VIRTGPU_NOT_PIN_FLAG (1 << 9)
 
 struct drm_virtgpu_map {
 	__u64 offset; /* use for mmap system call */
@@ -68,6 +75,8 @@  struct drm_virtgpu_execbuffer {
 	__u64 bo_handles;
 	__u32 num_bo_handles;
 	__s32 fence_fd; /* in/out fence fd (see VIRTGPU_EXECBUF_FENCE_FD_IN/OUT) */
+	__u64 accessed_bo_handles;
+	__u32 num_acc_bo_handles;
 };
 
 #define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */
@@ -75,6 +84,7 @@  struct drm_virtgpu_execbuffer {
 #define VIRTGPU_PARAM_RESOURCE_BLOB 3 /* DRM_VIRTGPU_RESOURCE_CREATE_BLOB */
 #define VIRTGPU_PARAM_HOST_VISIBLE 4 /* Host blob resources are mappable */
 #define VIRTGPU_PARAM_CROSS_DEVICE 5 /* Cross virtio-device resource sharing  */
+#define VIRTGPU_PARAM_PIN_ON_DEMAND 6 /* is pinning on demand available? */
 
 struct drm_virtgpu_getparam {
 	__u64 param;