Message ID | 1494233482-23403-3-git-send-email-tomi.valkeinen@ti.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Tomi, Thank you for the patch. On Monday 08 May 2017 11:51:22 Tomi Valkeinen wrote: > On SoCs with TILER, we have to ways to allocate buffers: normal s/to ways/two ways/ > dma_alloc or via TILER (which basically functions as an IOMMU). TILER > can map 128MB at a time, and we only map the TILER buffers when they are > used (i.e. not at alloc time). If TILER is present, omapdrm always > uses TILER. > > There are use cases that require lots of big buffers that are being used > at the same time by different IPs. At the moment the userspace has a > hard maximum of 128MB. > > This patch adds three new flags that can be used by the userspace to > solve the situation: > > OMAP_BO_MEM_CONTIG: The driver will use dma_alloc to get the memory. > This can be used to avoid TILER if the userspace knows it needs more > than 128M of memory at the same time. > > OMAP_BO_MEM_TILER: The driver will use TILER to get the memory. There's > nto much use for this flag at the moment, but it's here for s/nto/not/ > completeness. I assume the OMAP_BO_MEM_CONTIG and OMAP_BO_MEM_TILER flags are supposed to be mutually exclusive ? > OMAP_BO_MEM_PIN: The driver will pin the memory at alloc time, and keep > it pinned. This can be used to 1) get an error at alloc time if TILER > space is full, and 2) get rid of the constant pin/unpin operations which > may have some effect on performance. > > If none of the flags are given, the behavior is the same as currently. > > Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > --- > drivers/gpu/drm/omapdrm/omap_gem.c | 23 ++++++++++++++++++++++- > include/uapi/drm/omap_drm.h | 3 +++ > 2 files changed, 25 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c > b/drivers/gpu/drm/omapdrm/omap_gem.c index 5d73dccc1383..90ae8615f6c6 > 100644 > --- a/drivers/gpu/drm/omapdrm/omap_gem.c > +++ b/drivers/gpu/drm/omapdrm/omap_gem.c > @@ -1292,6 +1292,9 @@ void omap_gem_free_object(struct drm_gem_object *obj) > list_del(&omap_obj->mm_list); > spin_unlock(&priv->list_lock); > > + if (omap_obj->flags & OMAP_BO_MEM_PIN) > + omap_gem_put_paddr_locked(obj); > + > /* this means the object is still pinned.. which really should > * not happen. I think.. > */ > @@ -1338,6 +1341,11 @@ struct drm_gem_object *omap_gem_new(struct drm_device > *dev, return NULL; > } > > + if (flags & OMAP_BO_MEM_CONTIG) { > + dev_err(dev->dev, "Tiled buffers require TILER memory\n"); I think we need more sanity checks, only part of the faulty cases are caught. The semantics of the new flags need to be defined more precisely and properly documented (with kerneldoc for all BO flags for instance). In particular, interactions between OMAP_BO_MEM_TILER and the existing OMAP_BO_TILED_* flags are not clear to me. I wonder whether we really should introduce OMAP_BO_MEM_TILER, relying on OMAP_BO_TILED seems enough to me. In addition to this, I think there's a rule that new userspace APIs need a userspace implementation, and not just in libdrm, but in Xorg, Weston, Android or a similar project. > + return NULL; > + } > + > /* > * Tiled buffers are always shmem paged backed. When they are > * scanned out, they are remapped into DMM/TILER. > @@ -1351,7 +1359,8 @@ struct drm_gem_object *omap_gem_new(struct drm_device > *dev, */ > flags &= ~(OMAP_BO_CACHED|OMAP_BO_WC|OMAP_BO_UNCACHED); > flags |= tiler_get_cpu_cache_flags(); > - } else if ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm) { > + } else if ((flags & OMAP_BO_MEM_CONTIG) || > + ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm)) { > /* > * OMAP_BO_SCANOUT hints that the buffer doesn't need to be > * tiled. However, to lower the pressure on memory allocation, > @@ -1411,12 +1420,24 @@ struct drm_gem_object *omap_gem_new(struct > drm_device *dev, goto err_release; > } > > + if (flags & OMAP_BO_MEM_PIN) { > + dma_addr_t dummy; > + > + ret = omap_gem_get_paddr(obj, &dummy, true); Wouldn't it be better to modify omap_gem_get_paddr() to accept a NULL second parameter ? > + if (ret) > + goto err_free_dma; > + } > + > spin_lock(&priv->list_lock); > list_add(&omap_obj->mm_list, &priv->obj_list); > spin_unlock(&priv->list_lock); > > return obj; > > +err_free_dma: > + if (flags & OMAP_BO_MEM_DMA_API) > + dma_free_writecombine(dev->dev, size, > + omap_obj->vaddr, omap_obj->paddr); > err_release: > drm_gem_object_release(obj); > err_free: > diff --git a/include/uapi/drm/omap_drm.h b/include/uapi/drm/omap_drm.h > index 7fb97863c945..a976e8682c5f 100644 > --- a/include/uapi/drm/omap_drm.h > +++ b/include/uapi/drm/omap_drm.h > @@ -40,6 +40,9 @@ struct drm_omap_param { > #define OMAP_BO_SCANOUT 0x00000001 /* scanout capable > (phys contiguous) */ > #define OMAP_BO_CACHE_MASK 0x00000006 /* cache type mask, see cache > modes */ > #define OMAP_BO_TILED_MASK 0x00000f00 /* tiled mapping mask, see > tiled modes */ > +#define OMAP_BO_MEM_CONTIG 0x00000008 /* only use contiguous dma mem > */ > +#define OMAP_BO_MEM_TILER 0x00000010 /* only use TILER mem */ > +#define OMAP_BO_MEM_PIN 0x00000020 /* pin the buffer when > allocating */ > > /* cache modes */ > #define OMAP_BO_CACHED 0x00000000 /* default */
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki On 11/08/17 17:23, Laurent Pinchart wrote: > Hi Tomi, > > Thank you for the patch. > > On Monday 08 May 2017 11:51:22 Tomi Valkeinen wrote: >> On SoCs with TILER, we have to ways to allocate buffers: normal > > s/to ways/two ways/ > >> dma_alloc or via TILER (which basically functions as an IOMMU). TILER >> can map 128MB at a time, and we only map the TILER buffers when they are >> used (i.e. not at alloc time). If TILER is present, omapdrm always >> uses TILER. >> >> There are use cases that require lots of big buffers that are being used >> at the same time by different IPs. At the moment the userspace has a >> hard maximum of 128MB. >> >> This patch adds three new flags that can be used by the userspace to >> solve the situation: >> >> OMAP_BO_MEM_CONTIG: The driver will use dma_alloc to get the memory. >> This can be used to avoid TILER if the userspace knows it needs more >> than 128M of memory at the same time. >> >> OMAP_BO_MEM_TILER: The driver will use TILER to get the memory. There's >> nto much use for this flag at the moment, but it's here for > > s/nto/not/ > >> completeness. > > I assume the OMAP_BO_MEM_CONTIG and OMAP_BO_MEM_TILER flags are supposed to be > mutually exclusive ? That is true. >> OMAP_BO_MEM_PIN: The driver will pin the memory at alloc time, and keep >> it pinned. This can be used to 1) get an error at alloc time if TILER >> space is full, and 2) get rid of the constant pin/unpin operations which >> may have some effect on performance. >> >> If none of the flags are given, the behavior is the same as currently. >> >> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> >> --- >> drivers/gpu/drm/omapdrm/omap_gem.c | 23 ++++++++++++++++++++++- >> include/uapi/drm/omap_drm.h | 3 +++ >> 2 files changed, 25 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c >> b/drivers/gpu/drm/omapdrm/omap_gem.c index 5d73dccc1383..90ae8615f6c6 >> 100644 >> --- a/drivers/gpu/drm/omapdrm/omap_gem.c >> +++ b/drivers/gpu/drm/omapdrm/omap_gem.c >> @@ -1292,6 +1292,9 @@ void omap_gem_free_object(struct drm_gem_object *obj) >> list_del(&omap_obj->mm_list); >> spin_unlock(&priv->list_lock); >> >> + if (omap_obj->flags & OMAP_BO_MEM_PIN) >> + omap_gem_put_paddr_locked(obj); >> + >> /* this means the object is still pinned.. which really should >> * not happen. I think.. >> */ >> @@ -1338,6 +1341,11 @@ struct drm_gem_object *omap_gem_new(struct drm_device >> *dev, return NULL; >> } >> >> + if (flags & OMAP_BO_MEM_CONTIG) { >> + dev_err(dev->dev, "Tiled buffers require TILER > memory\n"); > > I think we need more sanity checks, only part of the faulty cases are caught. > The semantics of the new flags need to be defined more precisely and properly > documented (with kerneldoc for all BO flags for instance). In particular, Ok. > interactions between OMAP_BO_MEM_TILER and the existing OMAP_BO_TILED_* flags > are not clear to me. I wonder whether we really should introduce > OMAP_BO_MEM_TILER, relying on OMAP_BO_TILED seems enough to me. TILED_* is for "real" tiled modes, i.e. the so called 2D mode. TILER just means to use the 1D mode. However, I think that perhaps it's better to use some other word than TILER there, as, if I'm not mistaken, the 1D mode doesn't really use TILER (even if it's often referred to as TILER 1D mode). It is handled by the same driver, though. Probably OMAP_BO_MEM_DMM or OMAP_BO_MEM_PAT would be better. > In addition to this, I think there's a rule that new userspace APIs need a > userspace implementation, and not just in libdrm, but in Xorg, Weston, Android > or a similar project. Yes, unfortunately that is not going to happen. I don't see how this could be used in a generic system like Weston or X. It's meant for very specific use cases, where you know exactly the requirements of your application and the capabilities of the whole system, and optimize based on that. >> + return NULL; >> + } >> + >> /* >> * Tiled buffers are always shmem paged backed. When they are >> * scanned out, they are remapped into DMM/TILER. >> @@ -1351,7 +1359,8 @@ struct drm_gem_object *omap_gem_new(struct drm_device >> *dev, */ >> flags &= ~(OMAP_BO_CACHED|OMAP_BO_WC|OMAP_BO_UNCACHED); >> flags |= tiler_get_cpu_cache_flags(); >> - } else if ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm) { >> + } else if ((flags & OMAP_BO_MEM_CONTIG) || >> + ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm)) { >> /* >> * OMAP_BO_SCANOUT hints that the buffer doesn't need to be >> * tiled. However, to lower the pressure on memory allocation, >> @@ -1411,12 +1420,24 @@ struct drm_gem_object *omap_gem_new(struct >> drm_device *dev, goto err_release; >> } >> >> + if (flags & OMAP_BO_MEM_PIN) { >> + dma_addr_t dummy; >> + >> + ret = omap_gem_get_paddr(obj, &dummy, true); > > Wouldn't it be better to modify omap_gem_get_paddr() to accept a NULL second > parameter ? Yes, I think that makes sense. Tomi
Hi Tomi, CC'ing Daniel. On Monday 14 Aug 2017 14:02:16 Tomi Valkeinen wrote: > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. > Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki > On 11/08/17 17:23, Laurent Pinchart wrote: > > On Monday 08 May 2017 11:51:22 Tomi Valkeinen wrote: > >> On SoCs with TILER, we have to ways to allocate buffers: normal > > > > s/to ways/two ways/ > > > >> dma_alloc or via TILER (which basically functions as an IOMMU). TILER > >> can map 128MB at a time, and we only map the TILER buffers when they are > >> used (i.e. not at alloc time). If TILER is present, omapdrm always > >> uses TILER. > >> > >> There are use cases that require lots of big buffers that are being used > >> at the same time by different IPs. At the moment the userspace has a > >> hard maximum of 128MB. > >> > >> This patch adds three new flags that can be used by the userspace to > >> solve the situation: > >> > >> OMAP_BO_MEM_CONTIG: The driver will use dma_alloc to get the memory. > >> This can be used to avoid TILER if the userspace knows it needs more > >> than 128M of memory at the same time. > >> > >> OMAP_BO_MEM_TILER: The driver will use TILER to get the memory. There's > >> nto much use for this flag at the moment, but it's here for > > > > s/nto/not/ > > > >> completeness. > > > > I assume the OMAP_BO_MEM_CONTIG and OMAP_BO_MEM_TILER flags are supposed > > to be mutually exclusive ? > > That is true. > > >> OMAP_BO_MEM_PIN: The driver will pin the memory at alloc time, and keep > >> it pinned. This can be used to 1) get an error at alloc time if TILER > >> space is full, and 2) get rid of the constant pin/unpin operations which > >> may have some effect on performance. > >> > >> If none of the flags are given, the behavior is the same as currently. > >> > >> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > >> --- > >> > >> drivers/gpu/drm/omapdrm/omap_gem.c | 23 ++++++++++++++++++++++- > >> include/uapi/drm/omap_drm.h | 3 +++ > >> 2 files changed, 25 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c > >> b/drivers/gpu/drm/omapdrm/omap_gem.c index 5d73dccc1383..90ae8615f6c6 > >> 100644 > >> --- a/drivers/gpu/drm/omapdrm/omap_gem.c > >> +++ b/drivers/gpu/drm/omapdrm/omap_gem.c > >> @@ -1292,6 +1292,9 @@ void omap_gem_free_object(struct drm_gem_object > >> *obj) > >> > >> list_del(&omap_obj->mm_list); > >> spin_unlock(&priv->list_lock); > >> > >> + if (omap_obj->flags & OMAP_BO_MEM_PIN) > >> + omap_gem_put_paddr_locked(obj); > >> + > >> /* this means the object is still pinned.. which really should > >> * not happen. I think.. > >> */ > >> @@ -1338,6 +1341,11 @@ struct drm_gem_object *omap_gem_new(struct > >> drm_device *dev, > >> return NULL; > >> } > >> > >> + if (flags & OMAP_BO_MEM_CONTIG) { > >> + dev_err(dev->dev, "Tiled buffers require TILER > > memory\n"); > > > > I think we need more sanity checks, only part of the faulty cases are > > caught. The semantics of the new flags need to be defined more precisely > > and properly documented (with kerneldoc for all BO flags for instance). > > In particular, > > Ok. > > > interactions between OMAP_BO_MEM_TILER and the existing OMAP_BO_TILED_* > > flags are not clear to me. I wonder whether we really should introduce > > OMAP_BO_MEM_TILER, relying on OMAP_BO_TILED seems enough to me. > > TILED_* is for "real" tiled modes, i.e. the so called 2D mode. TILER > just means to use the 1D mode. However, I think that perhaps it's better > to use some other word than TILER there, as, if I'm not mistaken, the 1D > mode doesn't really use TILER (even if it's often referred to as TILER > 1D mode). It is handled by the same driver, though. Probably > OMAP_BO_MEM_DMM or OMAP_BO_MEM_PAT would be better. That's a good idea, yes. It's a kind of OMAP_BO_MEM_IOMMU, except that the hardware isn't a real IOMMU. > > In addition to this, I think there's a rule that new userspace APIs need a > > userspace implementation, and not just in libdrm, but in Xorg, Weston, > > Android or a similar project. > > Yes, unfortunately that is not going to happen. I don't see how this > could be used in a generic system like Weston or X. It's meant for very > specific use cases, where you know exactly the requirements of your > application and the capabilities of the whole system, and optimize based > on that. That's a very good point. Daniel, what do you think ? > >> + return NULL; > >> + } > >> + > >> /* > >> * Tiled buffers are always shmem paged backed. When they are > >> * scanned out, they are remapped into DMM/TILER. > >> @@ -1351,7 +1359,8 @@ struct drm_gem_object *omap_gem_new(struct > >> drm_device *dev, > >> */ > >> flags &= ~(OMAP_BO_CACHED|OMAP_BO_WC|OMAP_BO_UNCACHED); > >> flags |= tiler_get_cpu_cache_flags(); > >> - } else if ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm) { > >> + } else if ((flags & OMAP_BO_MEM_CONTIG) || > >> + ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm)) { > >> /* > >> * OMAP_BO_SCANOUT hints that the buffer doesn't need to be > >> * tiled. However, to lower the pressure on memory allocation, > >> @@ -1411,12 +1420,24 @@ struct drm_gem_object *omap_gem_new(struct > >> drm_device *dev, > >> goto err_release; > >> } > >> > >> + if (flags & OMAP_BO_MEM_PIN) { > >> + dma_addr_t dummy; > >> + > >> + ret = omap_gem_get_paddr(obj, &dummy, true); > > > > Wouldn't it be better to modify omap_gem_get_paddr() to accept a NULL > > second parameter ? > > Yes, I think that makes sense.
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index 5d73dccc1383..90ae8615f6c6 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -1292,6 +1292,9 @@ void omap_gem_free_object(struct drm_gem_object *obj) list_del(&omap_obj->mm_list); spin_unlock(&priv->list_lock); + if (omap_obj->flags & OMAP_BO_MEM_PIN) + omap_gem_put_paddr_locked(obj); + /* this means the object is still pinned.. which really should * not happen. I think.. */ @@ -1338,6 +1341,11 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, return NULL; } + if (flags & OMAP_BO_MEM_CONTIG) { + dev_err(dev->dev, "Tiled buffers require TILER memory\n"); + return NULL; + } + /* * Tiled buffers are always shmem paged backed. When they are * scanned out, they are remapped into DMM/TILER. @@ -1351,7 +1359,8 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, */ flags &= ~(OMAP_BO_CACHED|OMAP_BO_WC|OMAP_BO_UNCACHED); flags |= tiler_get_cpu_cache_flags(); - } else if ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm) { + } else if ((flags & OMAP_BO_MEM_CONTIG) || + ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm)) { /* * OMAP_BO_SCANOUT hints that the buffer doesn't need to be * tiled. However, to lower the pressure on memory allocation, @@ -1411,12 +1420,24 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, goto err_release; } + if (flags & OMAP_BO_MEM_PIN) { + dma_addr_t dummy; + + ret = omap_gem_get_paddr(obj, &dummy, true); + if (ret) + goto err_free_dma; + } + spin_lock(&priv->list_lock); list_add(&omap_obj->mm_list, &priv->obj_list); spin_unlock(&priv->list_lock); return obj; +err_free_dma: + if (flags & OMAP_BO_MEM_DMA_API) + dma_free_writecombine(dev->dev, size, + omap_obj->vaddr, omap_obj->paddr); err_release: drm_gem_object_release(obj); err_free: diff --git a/include/uapi/drm/omap_drm.h b/include/uapi/drm/omap_drm.h index 7fb97863c945..a976e8682c5f 100644 --- a/include/uapi/drm/omap_drm.h +++ b/include/uapi/drm/omap_drm.h @@ -40,6 +40,9 @@ struct drm_omap_param { #define OMAP_BO_SCANOUT 0x00000001 /* scanout capable (phys contiguous) */ #define OMAP_BO_CACHE_MASK 0x00000006 /* cache type mask, see cache modes */ #define OMAP_BO_TILED_MASK 0x00000f00 /* tiled mapping mask, see tiled modes */ +#define OMAP_BO_MEM_CONTIG 0x00000008 /* only use contiguous dma mem */ +#define OMAP_BO_MEM_TILER 0x00000010 /* only use TILER mem */ +#define OMAP_BO_MEM_PIN 0x00000020 /* pin the buffer when allocating */ /* cache modes */ #define OMAP_BO_CACHED 0x00000000 /* default */
On SoCs with TILER, we have to ways to allocate buffers: normal dma_alloc or via TILER (which basically functions as an IOMMU). TILER can map 128MB at a time, and we only map the TILER buffers when they are used (i.e. not at alloc time). If TILER is present, omapdrm always uses TILER. There are use cases that require lots of big buffers that are being used at the same time by different IPs. At the moment the userspace has a hard maximum of 128MB. This patch adds three new flags that can be used by the userspace to solve the situation: OMAP_BO_MEM_CONTIG: The driver will use dma_alloc to get the memory. This can be used to avoid TILER if the userspace knows it needs more than 128M of memory at the same time. OMAP_BO_MEM_TILER: The driver will use TILER to get the memory. There's nto much use for this flag at the moment, but it's here for completeness. OMAP_BO_MEM_PIN: The driver will pin the memory at alloc time, and keep it pinned. This can be used to 1) get an error at alloc time if TILER space is full, and 2) get rid of the constant pin/unpin operations which may have some effect on performance. If none of the flags are given, the behavior is the same as currently. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> --- drivers/gpu/drm/omapdrm/omap_gem.c | 23 ++++++++++++++++++++++- include/uapi/drm/omap_drm.h | 3 +++ 2 files changed, 25 insertions(+), 1 deletion(-)