From patchwork Fri Jun 3 16:55:21 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 9153683 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 5AF8B60221 for ; Fri, 3 Jun 2016 16:56:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4A26026E82 for ; Fri, 3 Jun 2016 16:56:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3F1AA27BFA; Fri, 3 Jun 2016 16:56:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 468B3282E8 for ; Fri, 3 Jun 2016 16:56:17 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 63E996EE7E; Fri, 3 Jun 2016 16:56:15 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8927D6EE69 for ; Fri, 3 Jun 2016 16:56:07 +0000 (UTC) Received: by mail-wm0-x243.google.com with SMTP id a20so703693wma.3 for ; Fri, 03 Jun 2016 09:56:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=URh2Qb2QQyMpXkBG+5u7kr/9IfkAFO+Z3vIqR2WPNMU=; b=nxjaFLkL2EgfmgOtcDCjhysMqnvGoxtx7WcMWa9l53KgAnn5XEB30xqjuVNTIscM07 egYG8nlHwShX2I5kP8q83QV6+ikpDTlk+BkV+HfnXob8NvdJ+2Y4p4OvQSJ0Ik2NkgrA jHmAZgDWWeyu3xEYGBTAg58Duf6DSmj/NambULOS996zvWG5ATmoXJBcSR9ani2PmvBZ 0Rom8/scKaNgRlcm+1/pEnMXry5V07YP8vuzRI+VXHVHIiAifGxmyIWZuE4Hd9spqKJu 5veYoSs30wq/JtM+rppUeWFF47L31uqUx6DuplnM81dE4tyYlJ06T9qxx3psnjO6a1cG Kv1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=URh2Qb2QQyMpXkBG+5u7kr/9IfkAFO+Z3vIqR2WPNMU=; b=OuTpWMc9WV1wjSfoPtv01njNNfjg/LSfjSdUfzdxzo7sr19TFCL8s3qqMs37V7Q8Fh DKH5XB4ysfP96cPCNf9ZNdIs6XI8o5lxTBXufabTR42cyOdhwdcgR9wBfBlgb7JzSGSF sXk/KfmZBEIvhF5UCG9a0Adj4gRQx9ew/WnnLYb/tD/lTOGW9xBys91ALRt0Zguh1P3u UURw3Da6bGY0FD345v4RyKxzFi5b8Ky51mUUCgnvozvCUkBhPxgie+HpwlJUFrPnQ0Is 4cilmWeoxsy6aYu2GegslGEOOKNyqhwZY5buQnqd3ScGKsGgDEet3fqY8jcwOtniOmpY kQFA== X-Gm-Message-State: ALyK8tJRUDKhm5mFjx6JXUVSb08x+vibtPriwqOZzKZWNHFdZpbDPAU1lCTDrmsoXS4OMw== X-Received: by 10.194.239.169 with SMTP id vt9mr1607819wjc.77.1464972965671; Fri, 03 Jun 2016 09:56:05 -0700 (PDT) Received: from haswell.alporthouse.com ([78.156.65.138]) by smtp.gmail.com with ESMTPSA id g3sm6566088wjb.47.2016.06.03.09.56.04 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Jun 2016 09:56:04 -0700 (PDT) From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 3 Jun 2016 17:55:21 +0100 Message-Id: <1464972953-2726-7-git-send-email-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1464972953-2726-1-git-send-email-chris@chris-wilson.co.uk> References: <1464972953-2726-1-git-send-email-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 06/38] drm/i915: Pad GTT views of exec objects up to user specified size X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Our GPUs impose certain requirements upon buffers that depend upon how exactly they are used. Typically this is expressed as that they require a larger surface than would be naively computed by pitch * height. Normally such requirements are hidden away in the userspace driver, but when we accept pointers from strangers and later impose extra conditions on them, the original client allocator has no idea about the monstrosities in the GPU and we require the userspace driver to inform the kernel how many padding pages are required beyond the client allocation. v2: Long time, no see v3: Try an anonymous union for uapi struct compatability Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_drv.h | 6 ++- drivers/gpu/drm/i915/i915_gem.c | 82 +++++++++++++++--------------- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +++++- include/uapi/drm/i915_drm.h | 8 ++- 4 files changed, 65 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a065325580d8..9520adba33f6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2945,11 +2945,13 @@ void i915_gem_free_object(struct drm_gem_object *obj); int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, struct i915_address_space *vm, + uint64_t size, uint32_t alignment, uint64_t flags); int __must_check i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, const struct i915_ggtt_view *view, + uint64_t size, uint32_t alignment, uint64_t flags); @@ -3209,8 +3211,8 @@ i915_gem_obj_ggtt_pin(struct drm_i915_gem_object *obj, struct drm_i915_private *dev_priv = to_i915(obj->base.dev); struct i915_ggtt *ggtt = &dev_priv->ggtt; - return i915_gem_object_pin(obj, &ggtt->base, - alignment, flags | PIN_GLOBAL); + return i915_gem_object_pin(obj, &ggtt->base, 0, alignment, + flags | PIN_GLOBAL); } void i915_gem_object_ggtt_unpin_view(struct drm_i915_gem_object *obj, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 19b8d2ea7698..0f0101300b2b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1438,7 +1438,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) } /* Now pin it into the GTT if needed */ - ret = i915_gem_object_ggtt_pin(obj, &view, 0, PIN_MAPPABLE); + ret = i915_gem_object_ggtt_pin(obj, &view, 0, 0, PIN_MAPPABLE); if (ret) goto unlock; @@ -2678,21 +2678,20 @@ static struct i915_vma * i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, struct i915_address_space *vm, const struct i915_ggtt_view *ggtt_view, + uint64_t size, unsigned alignment, uint64_t flags) { struct drm_device *dev = obj->base.dev; struct drm_i915_private *dev_priv = to_i915(dev); - struct i915_ggtt *ggtt = &dev_priv->ggtt; - u32 fence_alignment, unfenced_alignment; - u32 search_flag, alloc_flag; u64 start, end; - u64 size, fence_size; + u32 search_flag, alloc_flag; struct i915_vma *vma; int ret; if (i915_is_ggtt(vm)) { - u32 view_size; + u32 fence_size, fence_alignment, unfenced_alignment; + u64 view_size; if (WARN_ON(!ggtt_view)) return ERR_PTR(-EINVAL); @@ -2710,48 +2709,39 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, view_size, obj->tiling_mode, false); - size = flags & PIN_MAPPABLE ? fence_size : view_size; + size = max(size, view_size); + if (flags & PIN_MAPPABLE) + size = max_t(u64, size, fence_size); + + if (alignment == 0) + alignment = flags & PIN_MAPPABLE ? fence_alignment : + unfenced_alignment; + if (flags & PIN_MAPPABLE && alignment & (fence_alignment - 1)) { + DRM_DEBUG("Invalid object (view type=%u) alignment requested %u\n", + ggtt_view ? ggtt_view->type : 0, + alignment); + return ERR_PTR(-EINVAL); + } } else { - fence_size = i915_gem_get_gtt_size(dev, - obj->base.size, - obj->tiling_mode); - fence_alignment = i915_gem_get_gtt_alignment(dev, - obj->base.size, - obj->tiling_mode, - true); - unfenced_alignment = - i915_gem_get_gtt_alignment(dev, - obj->base.size, - obj->tiling_mode, - false); - size = flags & PIN_MAPPABLE ? fence_size : obj->base.size; + size = max_t(u64, size, obj->base.size); + alignment = 4096; } start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0; end = vm->total; if (flags & PIN_MAPPABLE) - end = min_t(u64, end, ggtt->mappable_end); + end = min_t(u64, end, dev_priv->ggtt.mappable_end); if (flags & PIN_ZONE_4G) end = min_t(u64, end, (1ULL << 32) - PAGE_SIZE); - if (alignment == 0) - alignment = flags & PIN_MAPPABLE ? fence_alignment : - unfenced_alignment; - if (flags & PIN_MAPPABLE && alignment & (fence_alignment - 1)) { - DRM_DEBUG("Invalid object (view type=%u) alignment requested %u\n", - ggtt_view ? ggtt_view->type : 0, - alignment); - return ERR_PTR(-EINVAL); - } - /* If binding the object/GGTT view requires more space than the entire * aperture has, reject it early before evicting everything in a vain * attempt to find space. */ if (size > end) { - DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n", + DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: request=%llu [object=%zd] > %s aperture=%llu\n", ggtt_view ? ggtt_view->type : 0, - size, + size, obj->base.size, flags & PIN_MAPPABLE ? "mappable" : "total", end); return ERR_PTR(-E2BIG); @@ -3243,7 +3233,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * (e.g. libkms for the bootup splash), we have to ensure that we * always use map_and_fenceable for all scanout buffers. */ - ret = i915_gem_object_ggtt_pin(obj, view, alignment, + ret = i915_gem_object_ggtt_pin(obj, view, 0, alignment, view->type == I915_GGTT_VIEW_NORMAL ? PIN_MAPPABLE : 0); if (ret) @@ -3393,12 +3383,17 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) } static bool -i915_vma_misplaced(struct i915_vma *vma, uint32_t alignment, uint64_t flags) +i915_vma_misplaced(struct i915_vma *vma, + uint64_t size, + uint32_t alignment, + uint64_t flags) { struct drm_i915_gem_object *obj = vma->obj; - if (alignment && - vma->node.start & (alignment - 1)) + if (vma->node.size < size) + return true; + + if (alignment && vma->node.start & (alignment - 1)) return true; if (flags & PIN_MAPPABLE && !obj->map_and_fenceable) @@ -3442,6 +3437,7 @@ static int i915_gem_object_do_pin(struct drm_i915_gem_object *obj, struct i915_address_space *vm, const struct i915_ggtt_view *ggtt_view, + uint64_t size, uint32_t alignment, uint64_t flags) { @@ -3469,7 +3465,7 @@ i915_gem_object_do_pin(struct drm_i915_gem_object *obj, if (WARN_ON(vma->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) return -EBUSY; - if (i915_vma_misplaced(vma, alignment, flags)) { + if (i915_vma_misplaced(vma, size, alignment, flags)) { WARN(vma->pin_count, "bo is already pinned in %s with incorrect alignment:" " offset=%08x %08x, req.alignment=%x, req.map_and_fenceable=%d," @@ -3490,8 +3486,8 @@ i915_gem_object_do_pin(struct drm_i915_gem_object *obj, bound = vma ? vma->bound : 0; if (vma == NULL || !drm_mm_node_allocated(&vma->node)) { - vma = i915_gem_object_bind_to_vm(obj, vm, ggtt_view, alignment, - flags); + vma = i915_gem_object_bind_to_vm(obj, vm, ggtt_view, + size, alignment, flags); if (IS_ERR(vma)) return PTR_ERR(vma); } else { @@ -3513,17 +3509,19 @@ i915_gem_object_do_pin(struct drm_i915_gem_object *obj, int i915_gem_object_pin(struct drm_i915_gem_object *obj, struct i915_address_space *vm, + uint64_t size, uint32_t alignment, uint64_t flags) { return i915_gem_object_do_pin(obj, vm, i915_is_ggtt(vm) ? &i915_ggtt_view_normal : NULL, - alignment, flags); + size, alignment, flags); } int i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, const struct i915_ggtt_view *view, + uint64_t size, uint32_t alignment, uint64_t flags) { @@ -3534,7 +3532,7 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, BUG_ON(!view); return i915_gem_object_do_pin(obj, &ggtt->base, view, - alignment, flags | PIN_GLOBAL); + size, alignment, flags | PIN_GLOBAL); } void diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 40937a09855d..c1e7ee212e7e 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -652,10 +652,14 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, flags |= PIN_HIGH; } - ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags); + ret = i915_gem_object_pin(obj, vma->vm, + entry->pad_to_size, + entry->alignment, + flags); if ((ret == -ENOSPC || ret == -E2BIG) && only_mappable_for_reloc(entry->flags)) ret = i915_gem_object_pin(obj, vma->vm, + entry->pad_to_size, entry->alignment, flags & ~PIN_MAPPABLE); if (ret) @@ -718,6 +722,9 @@ eb_vma_misplaced(struct i915_vma *vma) vma->node.start & (entry->alignment - 1)) return true; + if (vma->node.size < entry->pad_to_size) + return true; + if (entry->flags & EXEC_OBJECT_PINNED && vma->node.start != entry->offset) return true; @@ -1058,6 +1065,13 @@ validate_exec_list(struct drm_device *dev, if (exec[i].alignment && !is_power_of_2(exec[i].alignment)) return -EINVAL; + /* pad_to_size was once a reserved field, so sanitize it */ + if (exec[i].flags & EXEC_OBJECT_PAD_TO_SIZE) { + if (offset_in_page(exec[i].pad_to_size)) + return -EINVAL; + } else + exec[i].pad_to_size = 0; + /* First check for malicious input causing overflow in * the worst case where we need to allocate the entire * relocation tree as a single array. diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index d6c668e58426..3b861746ba7a 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -701,10 +701,14 @@ struct drm_i915_gem_exec_object2 { #define EXEC_OBJECT_WRITE (1<<2) #define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3) #define EXEC_OBJECT_PINNED (1<<4) -#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_PINNED<<1) +#define EXEC_OBJECT_PAD_TO_SIZE (1<<5) +#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_PAD_TO_SIZE<<1) __u64 flags; - __u64 rsvd1; + union { + __u64 rsvd1; + __u64 pad_to_size; + }; __u64 rsvd2; };