Message ID: 20190108112519.27473-16-kraxel@redhat.com (mailing list archive)
State: New, archived
Series: drm/bochs: cleanups, atomic modesetting, generic fbdev.
On Tue, Jan 08, 2019 at 12:25:19PM +0100, Gerd Hoffmann wrote:
> The buffer object must be reserved before calling
> ttm_bo_validate for pinning/unpinning.
>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Seems a bit of a bisect fumble in your series here: the legacy kms code
reserved the ttm bo before calling bochs_bo_pin/unpin, your atomic code
doesn't. I think pushing this into bochs_bo_pin/unpin makes sense for
atomic, but to avoid a bisect fail I think you need to have these
temporarily in your cleanup/prepare_plane functions too.

Looked through the entire series; this here is the only issue I think
should be fixed before merging (making atomic_enable optional can be done
as a follow-up if you feel like it). With that addressed on the series:

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/bochs/bochs_mm.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/bochs/bochs_mm.c b/drivers/gpu/drm/bochs/bochs_mm.c
> index cfe061c25f..970a591908 100644
> --- a/drivers/gpu/drm/bochs/bochs_mm.c
> +++ b/drivers/gpu/drm/bochs/bochs_mm.c
> @@ -223,7 +223,11 @@ int bochs_bo_pin(struct bochs_bo *bo, u32 pl_flag)
>  	bochs_ttm_placement(bo, pl_flag);
>  	for (i = 0; i < bo->placement.num_placement; i++)
>  		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
> +	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
> +	if (ret)
> +		return ret;
>  	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
> +	ttm_bo_unreserve(&bo->bo);
>  	if (ret)
>  		return ret;
>
> @@ -247,7 +251,11 @@ int bochs_bo_unpin(struct bochs_bo *bo)
>
>  	for (i = 0; i < bo->placement.num_placement; i++)
>  		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
> +	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
> +	if (ret)
> +		return ret;
>  	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
> +	ttm_bo_unreserve(&bo->bo);
>  	if (ret)
>  		return ret;
>
> --
> 2.9.3
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
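[Editor's note: the reserve/validate/unreserve bracketing the patch moves into bochs_bo_pin/unpin can be shown as a small self-contained model. This is a sketch only: the fake_* helpers and struct below are stand-ins for ttm_bo_reserve(), ttm_bo_validate(), ttm_bo_unreserve() and the bochs bo, whose real signatures differ.]

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the pattern the patch introduces: pin() takes the
 * reservation itself instead of relying on the caller holding it. */
struct fake_bo {
	int reserved;
	int pinned;
};

static int fake_reserve(struct fake_bo *bo)
{
	if (bo->reserved)
		return -16;	/* models -EBUSY: already reserved */
	bo->reserved = 1;
	return 0;
}

static void fake_unreserve(struct fake_bo *bo)
{
	bo->reserved = 0;
}

static int fake_validate(struct fake_bo *bo)
{
	/* validate may only run while the reservation is held */
	return bo->reserved ? 0 : -22;
}

static int fake_bo_pin(struct fake_bo *bo)
{
	int ret;

	ret = fake_reserve(bo);
	if (ret)
		return ret;
	ret = fake_validate(bo);
	fake_unreserve(bo);	/* drop the lock whether validate failed or not */
	if (ret)
		return ret;
	bo->pinned++;
	return 0;
}
```

The same bracketing is what the legacy kms path did in its callers; moving it inside pin/unpin is what makes the atomic paths safe without each caller remembering the lock.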
On Wed, Jan 09, 2019 at 11:10:44AM +0100, Daniel Vetter wrote:
> On Tue, Jan 08, 2019 at 12:25:19PM +0100, Gerd Hoffmann wrote:
> > The buffer object must be reserved before calling
> > ttm_bo_validate for pinning/unpinning.
> >
> > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
>
> Seems a bit of a bisect fumble in your series here: legacy kms code reserved
> the ttm bo before calling bochs_bo_pin/unpin, your atomic code doesn't. I
> think pushing this into bochs_bo_pin/unpin makes sense for atomic, but to
> avoid bisect fail I think you need to have these temporarily in your
> cleanup/prepare_plane functions too.

I think I've sorted that. Have some other changes too, will probably
send v3 tomorrow.

> Looked through the entire series, this here is the only issue I think
> should be fixed before merging (making atomic_enable optional can be done
> as a follow-up if you feel like it). With that addressed on the series:
>
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Thanks.

While being at it: I'm also looking at dma-buf export and import
support for the qemu drivers.

Right now both qxl and virtio have gem_prime_get_sg_table and
gem_prime_import_sg_table handlers which throw a WARN_ONCE() and return
an error.

If I understand things correctly it is valid to set all import/export
callbacks (prime_handle_to_fd, prime_fd_to_handle,
gem_prime_get_sg_table, gem_prime_import_sg_table) to NULL when not
supporting dma-buf import/export and still advertise DRIVER_PRIME to
indicate the other prime callbacks are supported (so generic fbdev
emulation can use gem_prime_vmap etc). Is that correct?

On exporting:

TTM_PL_TT should be easy: just pin the buffer, grab the pages list and
feed that into drm_prime_pages_to_sg. Didn't try yet though. Is that
approach correct?

Is it possible to export TTM_PL_VRAM objects (with the backing storage
being a pci memory bar)? If so, how?

On importing:

Importing into a TTM_PL_TT object looks easy again, at least when the
object is actually stored in RAM. What if not?

Importing into TTM_PL_VRAM: impossible I think, without copying over
the data. Should that be done? If so, how? Or is it better to just
not support import then?

thanks,
  Gerd
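[Editor's note: the TT export flow Gerd describes, pin, grab the page list, convert it to a scatter/gather table, can be sketched with a toy model. Everything here is a stand-in: fake_page and sg_entry are not the kernel's struct page or sg_table, and pages_to_sg() only mimics the contiguous-run merging that the real drm_prime_pages_to_sg() gets from sg_alloc_table_from_pages().]

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins: a "page" is just a physical address, an "sg table"
 * a flat array of (addr, len) runs. */
struct fake_page { unsigned long addr; };
struct sg_entry  { unsigned long addr; size_t len; };

#define PAGE_SZ 4096UL

/* Rough model of drm_prime_pages_to_sg(): walk the pinned pages and
 * merge physically contiguous neighbours into a single sg run. */
static size_t pages_to_sg(const struct fake_page *pages, size_t npages,
			  struct sg_entry *sg)
{
	size_t n = 0;

	for (size_t i = 0; i < npages; i++) {
		if (n && sg[n - 1].addr + sg[n - 1].len == pages[i].addr) {
			sg[n - 1].len += PAGE_SZ;	/* contiguous: extend */
		} else {
			sg[n].addr = pages[i].addr;	/* new run */
			sg[n].len = PAGE_SZ;
			n++;
		}
	}
	return n;
}
```

In the real driver the dma mapping of that table is a separate step done at attach/map time; the exporter must not assume the importer wants the CPU page list.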
On Wed, Jan 9, 2019 at 3:54 PM Gerd Hoffmann <kraxel@redhat.com> wrote:
> I think I've sorted that. Have some other changes too, will probably
> send v3 tomorrow.
> [...]
> While being at it: I'm also looking at dma-buf export and import
> support for the qemu drivers.
>
> Right now both qxl and virtio have gem_prime_get_sg_table and
> gem_prime_import_sg_table handlers which throw a WARN_ONCE() and return
> an error.
>
> If I understand things correctly it is valid to set all import/export
> callbacks (prime_handle_to_fd, prime_fd_to_handle,
> gem_prime_get_sg_table, gem_prime_import_sg_table) to NULL when not
> supporting dma-buf import/export and still advertise DRIVER_PRIME to
> indicate the other prime callbacks are supported (so generic fbdev
> emulation can use gem_prime_vmap etc). Is that correct?

I'm not sure how much of a good idea that is ... never thought about it,
tbh. All the fbdev/dma-buf stuff has plenty of hacks and inconsistencies
still, so I guess we can't make it much worse really.

> On exporting:
>
> TTM_PL_TT should be easy, just pin the buffer, grab the pages list and
> feed that into drm_prime_pages_to_sg. Didn't try yet though. Is that
> approach correct?
>
> Is it possible to export TTM_PL_VRAM objects (with backing storage being
> a pci memory bar)? If so, how?

Not really in general. amdgpu upcasts to amdgpu_bo (if it's an amdgpu BO)
and then knows the internals, so it can do a proper pci peer2peer
mapping. Or at least there have been lots of patches floating around to
make that happen.

I think other drivers migrate the bo out of VRAM.

> On importing:
>
> Importing into a TTM_PL_TT object looks easy again, at least when the
> object is actually stored in RAM. What if not?

They are all supposed to be stored in RAM. Note that all current ttm
importers totally break the abstraction by taking the sg list, throwing
the dma mapping away and assuming there's a struct page backing it.
Would be good if we could stop spreading that abuse - the dma-buf
interfaces have been modelled after the ttm bo interfaces, so it
shouldn't be too hard to wire this up correctly.

> Importing into TTM_PL_VRAM: Impossible I think, without copying over
> the data. Should that be done? If so, how? Or is it better to just
> not support import then?

Hm, since you ask about TTM concepts and not what this means in terms of
dma-buf: as long as you upcast to the ttm_bo you can do whatever you
want to, really. But with plain dma-buf this doesn't work right now (not
least because ttm assumes it gets system RAM on import; in theory you
could put the peer2peer dma mapping into the sg list and it should
work).

-Daniel

> thanks,
>   Gerd
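[Editor's note: "migrate the bo out of VRAM" before exporting can be shown schematically. A toy sketch under stated assumptions: the enum mirrors TTM's TTM_PL_VRAM/TTM_PL_TT placement names, and the migrate step stands in for a real ttm_bo_validate() with a system placement; nothing here is actual driver code.]

```c
#include <assert.h>

/* Placement names modelled on TTM's TTM_PL_VRAM / TTM_PL_TT. */
enum toy_place { PL_VRAM, PL_TT };

struct toy_bo {
	enum toy_place place;
	int migrations;		/* how many VRAM -> system copies happened */
};

/* Before export, force the bo into a system-RAM-backed placement so the
 * importer sees ordinary pages. A real driver would do this via
 * ttm_bo_validate() with a system placement, copying the contents out. */
static int export_prepare(struct toy_bo *bo)
{
	if (bo->place == PL_VRAM) {
		bo->place = PL_TT;	/* contents copied out of VRAM */
		bo->migrations++;
	}
	return 0;	/* now an sg table can be built from the pages */
}
```

This is exactly the trade-off the thread turns on: after the migration the pages are exportable, but hardware that can only scan out of VRAM (bochs, qxl) can no longer use the buffer.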
On Wed, Jan 9, 2019 at 12:36 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> On Wed, Jan 9, 2019 at 3:54 PM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > Is it possible to export TTM_PL_VRAM objects (with backing storage being
> > a pci memory bar)? If so, how?
>
> Not really in general. amdgpu upcasts to amdgpu_bo (if it's an amdgpu BO)
> and then knows the internals so it can do a proper pci peer2peer
> mapping. Or at least there's been lots of patches floating around to
> make that happen.

Here's Christian's WIP stuff for adding device memory support to dma-buf:
https://cgit.freedesktop.org/~deathsimple/linux/log/?h=p2p

Alex

> I think other drivers migrate the bo out of VRAM.
> [... remainder of the reply quoted unchanged ...]
Hi,

> > If I understand things correctly it is valid to set all import/export
> > callbacks (prime_handle_to_fd, prime_fd_to_handle,
> > gem_prime_get_sg_table, gem_prime_import_sg_table) to NULL when not
> > supporting dma-buf import/export and still advertise DRIVER_PRIME to
> > indicate the other prime callbacks are supported (so generic fbdev
> > emulation can use gem_prime_vmap etc). Is that correct?
>
> I'm not sure how much that's a good idea ... Never thought about it
> tbh. All the fbdev/dma-buf stuff has plenty of hacks and
> inconsistencies still, so I guess we can't make it much worse really.

Setting prime_handle_to_fd + prime_fd_to_handle to NULL has the effect
that drm stops advertising DRM_PRIME_CAP_{IMPORT,EXPORT} to userspace.

Which looks better to me than telling userspace we support it, then
throwing errors unconditionally when userspace tries to use it.

> > Is it possible to export TTM_PL_VRAM objects (with backing storage being
> > a pci memory bar)? If so, how?
>
> Not really in general. amdgpu upcasts to amdgpu_bo (if it's an amdgpu BO)
> and then knows the internals so it can do a proper pci peer2peer
> mapping. Or at least there's been lots of patches floating around to
> make that happen.

That is limited to bo sharing between two amdgpu devices, correct?

> I think other drivers migrate the bo out of VRAM.

Well, that doesn't look too useful. bochs and qxl virtual hardware can't
access buffers outside VRAM. So, while I could migrate the buffers to
RAM (via memcpy) when exporting, they would at the same time become
unusable for the GPU ...

> They are all supposed to be stored in RAM. Note that all current ttm
> importers totally break the abstraction, by taking the sg list,
> throwing the dma mapping away and assuming there's a struct page
> backing it. Would be good if we could stop spreading that abuse - the
> dma-buf interfaces have been modelled after the ttm bo interfaces, so
> shouldn't be too hard to wire this up correctly.

Ok. With virtio-gpu (where objects are backed by RAM pages anyway)
wiring this up should be easy.

But given there is no correct sample code I can look at, it would be
cool if you could give some more hints on how this is supposed to work.
The gem_prime_import_sg_table() callback gets a sg list passed in after
all, so I probably would have tried to take the sg list too ...

> > Importing into TTM_PL_VRAM: Impossible I think, without copying over
> > the data. Should that be done? If so, how? Or is it better to just
> > not support import then?
>
> Hm, since you ask about TTM concepts and not what this means in terms
> of dma-buf:

Ok, more details on the question:

dma-buf: whatever the driver gets passed into the
gem_prime_import_sg_table() callback.

import into TTM_PL_VRAM: a qemu driver which supports VRAM storage only
(bochs, qxl), so the buffer has to be stored there if we want to do
something with it (like scanning out to a crtc).

> As long as you upcast to the ttm_bo you can do whatever
> you want to really.

Well, if the dma-buf comes from another device (say export a vgem bo,
then try to import into bochs/qxl/virtio) I can't upcast.

When the dma-buf comes from the same device drm_gem_prime_import_dev()
will notice and take a shortcut (skip the import, just increase the
refcount instead), so I don't have to worry about that case in the
gem_prime_import_sg_table() callback.

> But with plain dma-buf this doesn't work right now
> (least because ttm assumes it gets system RAM on import, in theory you
> could put the peer2peer dma mapping into the sg list and it should
> work).

Well, qemu display devices don't have peer2peer dma support.
So I guess the answer is "doesn't work".

cheers,
  Gerd
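[Editor's note: the capability behaviour Gerd describes, the core advertising DRM_PRIME_CAP_IMPORT/EXPORT only when the fd/handle callbacks are wired up, can be modelled with a toy vtable. This is a sketch, not the actual drm core check; the flag values and struct are invented stand-ins.]

```c
#include <assert.h>
#include <stddef.h>

#define CAP_PRIME_IMPORT 0x1	/* stand-in for DRM_PRIME_CAP_IMPORT */
#define CAP_PRIME_EXPORT 0x2	/* stand-in for DRM_PRIME_CAP_EXPORT */

/* Toy driver vtable: only the two entry points the cap query looks at. */
struct toy_driver {
	int (*prime_handle_to_fd)(int handle);
	int (*prime_fd_to_handle)(int fd);
};

static int dummy_handle_to_fd(int handle) { return handle + 100; }
static int dummy_fd_to_handle(int fd)     { return fd - 100; }

/* Model of the DRM_CAP_PRIME query: a capability is advertised only
 * when the corresponding callback is non-NULL, so leaving both NULL
 * tells userspace up front that prime fd passing is unsupported. */
static unsigned int prime_caps(const struct toy_driver *drv)
{
	unsigned int caps = 0;

	if (drv->prime_fd_to_handle)
		caps |= CAP_PRIME_IMPORT;
	if (drv->prime_handle_to_fd)
		caps |= CAP_PRIME_EXPORT;
	return caps;
}
```

That is the property being argued for above: a NULL callback degrades cleanly into "capability not advertised" instead of a WARN_ONCE() plus error at use time.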
On Wed, Jan 9, 2019 at 9:52 PM Gerd Hoffmann <kraxel@redhat.com> wrote:
> Setting prime_handle_to_fd + prime_fd_to_handle to NULL has the effect
> that drm stops advertising DRM_PRIME_CAP_{IMPORT,EXPORT} to userspace.
>
> Which looks better to me than telling userspace we support it then throw
> errors unconditionally when userspace tries to use that.
> [...]
> Ok. With virtio-gpu (where objects are backed by RAM pages anyway)
> wiring this up should be easy.
>
> But given there is no correct sample code I can look at it would be cool
> if you could give some more hints how this is supposed to work. The
> gem_prime_import_sg_table() callback gets a sg list passed in after all,
> so I probably would have tried to take the sg list too ...

I'm not a fan of that helper either; that's really the broken part imo.
i915 doesn't use it. It's a midlayer so that the nvidia blob can avoid
directly touching the EXPORT_SYMBOL_GPL dma-buf symbols; afaiui there's
really no other solid reason for it. What the new gem cma helpers do is
imo much better (they still use the import_sg_table midlayer, but oh
well).

For ttm you'd need to make sure that all the various ttm cpu-side access
functions also go through the relevant dma-buf interfaces, and not
through the struct page list fished out of the sgt. That was at least
the idea, long ago.

> > As long as you upcast to the ttm_bo you can do whatever
> > you want to really.
>
> Well, if the dma-buf comes from another device (say export a vgem bo,
> then try to import into bochs/qxl/virtio) I can't upcast.

In that case you'll in practice only get system RAM, and you're not
allowed to move it (dma-buf is meant to be zero-copy, after all). If
your hw can't scan these out directly, then userspace needs to arrange
for a buffer copy into a native buffer somehow (that's how Xorg prime
works, at least I think). No idea whether your virtual gpus can make use
of that directly.

You might also get some pci peer2peer range in the future, but it's
strictly opt-in (because there are too many dma-buf importers that just
blindly assume there's a struct page behind the sgt).

> When the dma-buf comes from the same device drm_gem_prime_import_dev()
> will notice and take a shortcut (skip the import, just increase the
> refcount instead), so I don't have to worry about that case in the
> gem_prime_import_sg_table() callback.

You can also upcast if it's from the same driver, not just the same
device.

-Daniel

> Well, qemu display devices don't have peer2peer dma support.
> So I guess the answer is "doesn't work".
>
> cheers,
>   Gerd
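[Editor's note: the self-import shortcut discussed here, the prime import path recognising a dma-buf its own driver exported and just taking a reference instead of building an import, can be sketched as a toy model. The real helper compares the dma-buf's ops/owner to decide; the structs and the integer driver id below are invented stand-ins, not the drm_gem_prime_import_dev() internals.]

```c
#include <assert.h>
#include <stddef.h>

struct toy_obj {
	int refcount;
	int driver_id;		/* which toy driver owns this object */
};

struct toy_dmabuf {
	struct toy_obj *priv;	/* exporter's private object */
	int exporter_id;	/* driver that exported this buffer */
};

/* Model of the same-driver shortcut: if the buffer was exported by the
 * importing driver itself, skip the sg-table import entirely and just
 * bump the refcount on the existing object ("upcast + ref"). */
static struct toy_obj *toy_import(struct toy_dmabuf *buf, int importer_id)
{
	if (buf->exporter_id == importer_id) {
		buf->priv->refcount++;
		return buf->priv;
	}
	/* Foreign buffer: a real driver would now go through the
	 * gem_prime_import_sg_table() path; the toy model gives up. */
	return NULL;
}
```

As Daniel notes, the shortcut (and the upcast it enables) is valid per driver, not just per device, because the object layout is a property of the driver.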
The patch itself:

The buffer object must be reserved before calling
ttm_bo_validate for pinning/unpinning.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/bochs/bochs_mm.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/bochs/bochs_mm.c b/drivers/gpu/drm/bochs/bochs_mm.c
index cfe061c25f..970a591908 100644
--- a/drivers/gpu/drm/bochs/bochs_mm.c
+++ b/drivers/gpu/drm/bochs/bochs_mm.c
@@ -223,7 +223,11 @@ int bochs_bo_pin(struct bochs_bo *bo, u32 pl_flag)
 	bochs_ttm_placement(bo, pl_flag);
 	for (i = 0; i < bo->placement.num_placement; i++)
 		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
+	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
+	if (ret)
+		return ret;
 	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
+	ttm_bo_unreserve(&bo->bo);
 	if (ret)
 		return ret;

@@ -247,7 +251,11 @@ int bochs_bo_unpin(struct bochs_bo *bo)

 	for (i = 0; i < bo->placement.num_placement; i++)
 		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
+	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
+	if (ret)
+		return ret;
 	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
+	ttm_bo_unreserve(&bo->bo);
 	if (ret)
 		return ret;