diff mbox

drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers

Message ID 0ae3166a-e27a-648f-edb6-0280fd087bb3@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Hans de Goede April 19, 2018, 1:43 p.m. UTC
Hi,

On 05-04-18 15:26, Daniel Vetter wrote:
> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>> Hi,
>>
>>
>> On 05-04-18 09:14, Daniel Vetter wrote:
>>>
>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>>>>
>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>> skipping the first 4k by passing 4096 as start of the address range
>>>> passed
>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
>>>> and
>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>> fail, as the firmware framebuffer starts at address 0.
>>>>
>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>> everything >= gen8") says in its commit message: "This is confirmed to
>>>> fix
>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>> this theory)."
>>>>
>>>> Which suggests that it is safe to use the first page for a linear
>>>> framebuffer as the firmware is doing.
>>>>
>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>
>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>> use the first 4k.
>>>>
>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>> (on systems without fbcon).
>>>>
>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>
>>>
>>> I think this is worth a shot. The only explanation I can think of why the
>>> GOP could get away with this and still follow the w/a is if it doesn't
>>> have a 1:1 mapping between GGTT and stolen.
>>
>>
>> My guess is that the GOP does not care about the w/a because the w/a
>> states that the command-streamers sometimes do a spurious flush (write)
>> to the first page, and the command-streamers are not used until much
>> later, so the GOP is probably ignoring the w/a all together.
>>
>> At least that is my theory.
> 
> Atm we do not know whether the GOP actually implements the wa or not.
> That it doesn't care is just a conjecture, but we have no proof. On
> previous platforms the 1:1 mapping did hold, but there's no
> fundamental reason why it sitll does.

Right.

>>> Atm we hardcode that
>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
>>> really). And since we're not re-writing the ptes it's not noticeable.
>>>
>>> I think to decide whether this is the right approach we should
>>> double-check whether that 1:1 assumption really holds true: Either read
>>> back the ggtt ptes and check their addresses (but iirc on some platforms
>>> their write-only, readback doesn't work), or we also rewrite the ptes
>>> again for preallocated stuff, like when binding a normal object into the
>>> gtt. If either of these approaches confirms that those affected gen8+
>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
>>> patch. If not, well then we at least know what to fix: We need to read out
>>> the real stolen_offset, instead of making assumptions.
>>
>>
>> I'm happy to give this a try on one or more affected machines, I can e.g.
>> try this on both a skylake desktop and a cherry-trail based tablet.
>>
>> But I'm clueless about the whole PTE stuff, so I'm going to need someone
>> to provide me with a patch following either approach. If readback of the
>> PTEs works on skylake / cherrytrail I guess that would be the best way.
>>
>> Note to test this I'm currently booting the kernel with the machine's
>> UEFI vendor logo still being displayed when efifb kicks in. I've added:
>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
>> VC0 / VC1, so that the framebuffer contents stays intact, combined with
>> the patch we are discussing now, this makes the vendor logo stay on
>> the screen all the way till GDM loads, which looks rather nice actually :)
>>
>> And on shutdown we fall back to the original framebuffer and the vendor
>> logo is back again. I cannot see any corruption in the display at either
>> boot or shutdown time.
> 
> It shouldn't matter whether efifb takes over or not, it's still the
> GOP's framebuffer we're taking over. Different content for sure, logo
> is gone, but we only care about which pages we're using.
> 
> Wrt the code:
> - (Re)binding happens by calling i915_vma_bind, if you put a call to
> that at the end of the stolen_preallocated functions you should get
> that. Ofc needs the logo still there so you can see it jump/get
> mangled.
> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
> or gen8_ggtt_insert_entries (which takes the vma). Since we only care
> about the first entry a readq(dev_priv->ggtt->gsm) is all we really
> need. If it's all 0 then it's not working (since at least
> _PAGE_PRESENT should be set, plus the physical address of a page in
> stolen memory).
> 
> I can do some patches for each of those, but a bit swamped. Also no
> gen8 handy for testing I think to make sure it works, so would take a
> few weeks probably.

I was a bit swamped with other stuff myself too, but I'm back to working
on this now. Below is the debug patch I came up with based on what you
wrote above.

This results in the following debug output:

[    1.651141] first ggtt entry before bind: 0x0000000078c00001
[    1.651151] i915_vma_bind ret 0
[    1.651152] first ggtt entry after bind: 0x0000000078c00083

And "sudo cat /proc/iomem | grep Stolen" gives:
   78c00000-88bfffff : Graphics Stolen Memory

I see no visual changes with this patch (BIOS vendor logo still
stays in place when using fbcon=vc:2-62 to make fbcon not bind
on boot and keep the vendor logo intact).

The address of the first ggtt entry matches with the start of stolen
and the i915_vma_bind call only changes the first gtt entry's flags,
or-ing in _PAGE_RW (BIT(1)) and PPAT_CACHED (BIT(7)), which perfectly
matches what we would expect based on gen8_pte_encode()'s behavior.

So it seems that the GOP indeed does NOT implement the wa and the i915's
code assuming a linear mapping at the start of stolen for the BIOS fb
still holds true.

The debug output is from my Skylake based machine which uses an
Asrock B150M Pro4S/D3 mainboard.

I've also tested this on a Cherry Trail based device (a GPD Win)
with identical results (the flags are 0x1b after the vma_bind
on CHT, which matches with I915_CACHE_NONE).

Regards,

Hans


 From 400f2b0d722236999a5706a1f2fa0bb70bf9c73f Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede@redhat.com>
Date: Thu, 19 Apr 2018 14:44:43 +0200
Subject: [PATCH resend] i915: i915_vma_bind for pre-allocated stolen test /
  debug

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
  drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
  1 file changed, 7 insertions(+)

Comments

Daniel Vetter April 20, 2018, 7:27 a.m. UTC | #1
On Thu, Apr 19, 2018 at 03:43:19PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 05-04-18 15:26, Daniel Vetter wrote:
> > On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
> > > Hi,
> > > 
> > > 
> > > On 05-04-18 09:14, Daniel Vetter wrote:
> > > > 
> > > > On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
> > > > > 
> > > > > Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> > > > > skipping the first 4k by passing 4096 as start of the address range
> > > > > passed
> > > > > to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
> > > > > and
> > > > > reserve the firmware framebuffer so that we can inherit it would always
> > > > > fail, as the firmware framebuffer starts at address 0.
> > > > > 
> > > > > Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> > > > > everything >= gen8") says in its commit message: "This is confirmed to
> > > > > fix
> > > > > Skylake screen flickering issues (probably caused by the fact that we
> > > > > initialized a ring in the first page of stolen, but I didn't 100% confirm
> > > > > this theory)."
> > > > > 
> > > > > Which suggests that it is safe to use the first page for a linear
> > > > > framebuffer as the firmware is doing.
> > > > > 
> > > > > This commit always passes 0 as start to drm_mm_init() and works around
> > > > > WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> > > > > by insuring the start address passed by to drm_mm_insert_node_in_range()
> > > > > is always 4k or more. All entry points to i915_gem_stolen.c go through
> > > > > i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> > > > > objects such as ring-buffers will not be allocated in the first 4k.
> > > > > 
> > > > > The one exception is i915_gem_object_create_stolen_for_preallocated()
> > > > > which directly calls drm_mm_reserve_node() which now will be able to
> > > > > use the first 4k.
> > > > > 
> > > > > This fixes the i915 driver no longer being able to inherit the firmware
> > > > > framebuffer on gen8+, which fixes the video output changing from the
> > > > > vendor logo to a black screen as soon as the i915 driver is loaded
> > > > > (on systems without fbcon).
> > > > > 
> > > > > Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> > > > 
> > > > 
> > > > I think this is worth a shot. The only explanation I can think of why the
> > > > GOP could get away with this and still follow the w/a is if it doesn't
> > > > have a 1:1 mapping between GGTT and stolen.
> > > 
> > > 
> > > My guess is that the GOP does not care about the w/a because the w/a
> > > states that the command-streamers sometimes do a spurious flush (write)
> > > to the first page, and the command-streamers are not used until much
> > > later, so the GOP is probably ignoring the w/a all together.
> > > 
> > > At least that is my theory.
> > 
> > Atm we do not know whether the GOP actually implements the wa or not.
> > That it doesn't care is just a conjecture, but we have no proof. On
> > previous platforms the 1:1 mapping did hold, but there's no
> > fundamental reason why it sitll does.
> 
> Right.
> 
> > > > Atm we hardcode that
> > > > assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
> > > > both the stolen_offset and the gtt_offset (but it's only the gtt_offset
> > > > really). And since we're not re-writing the ptes it's not noticeable.
> > > > 
> > > > I think to decide whether this is the right approach we should
> > > > double-check whether that 1:1 assumption really holds true: Either read
> > > > back the ggtt ptes and check their addresses (but iirc on some platforms
> > > > their write-only, readback doesn't work), or we also rewrite the ptes
> > > > again for preallocated stuff, like when binding a normal object into the
> > > > gtt. If either of these approaches confirms that those affected gen8+
> > > > machines still use the 1:1 mapping, then I'm happy to put my r-b on this
> > > > patch. If not, well then we at least know what to fix: We need to read out
> > > > the real stolen_offset, instead of making assumptions.
> > > 
> > > 
> > > I'm happy to give this a try on one or more affected machines, I can e.g.
> > > try this on both a skylake desktop and a cherry-trail based tablet.
> > > 
> > > But I'm clueless about the whole PTE stuff, so I'm going to need someone
> > > to provide me with a patch following either approach. If readback of the
> > > PTEs works on skylake / cherrytrail I guess that would be the best way.
> > > 
> > > Note to test this I'm currently booting the kernel with the machine's
> > > UEFI vendor logo still being displayed when efifb kicks in. I've added:
> > > "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
> > > VC0 / VC1, so that the framebuffer contents stays intact, combined with
> > > the patch we are discussing now, this makes the vendor logo stay on
> > > the screen all the way till GDM loads, which looks rather nice actually :)
> > > 
> > > And on shutdown we fall back to the original framebuffer and the vendor
> > > logo is back again. I cannot see any corruption in the display at either
> > > boot or shutdown time.
> > 
> > It shouldn't matter whether efifb takes over or not, it's still the
> > GOP's framebuffer we're taking over. Different content for sure, logo
> > is gone, but we only care about which pages we're using.
> > 
> > Wrt the code:
> > - (Re)binding happens by calling i915_vma_bind, if you put a call to
> > that at the end of the stolen_preallocated functions you should get
> > that. Ofc needs the logo still there so you can see it jump/get
> > mangled.
> > - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
> > or gen8_ggtt_insert_entries (which takes the vma). Since we only care
> > about the first entry a readq(dev_priv->ggtt->gsm) is all we really
> > need. If it's all 0 then it's not working (since at least
> > _PAGE_PRESENT should be set, plus the physical address of a page in
> > stolen memory).
> > 
> > I can do some patches for each of those, but a bit swamped. Also no
> > gen8 handy for testing I think to make sure it works, so would take a
> > few weeks probably.
> 
> I was a bit swamped with other stuff myself too, but I'm back to working
> on this now. Below is the debug patch I came up with based on what you
> wrote above.
> 
> This results in the following debug output:
> 
> [    1.651141] first ggtt entry before bind: 0x0000000078c00001
> [    1.651151] i915_vma_bind ret 0
> [    1.651152] first ggtt entry after bind: 0x0000000078c00083
> 
> And "sudo cat /proc/iomem | grep Stolen" gives:
>   78c00000-88bfffff : Graphics Stolen Memory
> 
> I see no visual changes with this patch (BIOS vendor logo still
> stays in place when using fbcon=vc:2-62 to make fbcon not bind
> on boot and keep the vendor logo intact).
> 
> The address of the first ggtt entry matches with the start of stolen
> and the i915_vma_bind call only changes the first gtt entry's flags,
> or-ing in _PAGE_RW (BIT(1)) and PPAT_CACHED (BIT(7)), which perfectly
> matches what we would expect based on gen8_pte_encode()'s behavior.
> 
> So it seems that the GOP indeed does NOT implement the wa and the i915's
> code assuming a linear mapping at the start of stolen for the BIOS fb
> still holds true.
> 
> The debug output is from my Skylake based machine which uses an
> Asrock B150M Pro4S/D3 mainboard.
> 
> I've also tested this on a Cherry Trail based device (a GPD Win)
> with identical results (the flags are 0x1b after the vma_bind
> on CHT, which matches with I915_CACHE_NONE).

tbh, not what I expected, so good we've done the experiment to confirm :-)

With the above information added to the commit message (you can just paste
your entire message, including test diff to the commit message) the
original patch has my:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Thanks for doing all the digging here.

Cheers, Daniel

> 
> Regards,
> 
> Hans
> 
> 
> From 400f2b0d722236999a5706a1f2fa0bb70bf9c73f Mon Sep 17 00:00:00 2001
> From: Hans de Goede <hdegoede@redhat.com>
> Date: Thu, 19 Apr 2018 14:44:43 +0200
> Subject: [PATCH resend] i915: i915_vma_bind for pre-allocated stolen test /
>  debug
> 
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 152487d91197..7db03f7d2c51 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -648,6 +648,13 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
>  	obj->bind_count++;
>  	spin_unlock(&dev_priv->mm.obj_lock);
> 
> +	pr_err("first ggtt entry before bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
> +	ret = i915_vma_bind(vma,
> +		    HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE,
> +		    PIN_UPDATE);
> +	pr_err("i915_vma_bind ret %d\n", ret);
> +	pr_err("first ggtt entry after bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
> +
>  	return obj;
> 
>  err_pages:
> -- 
> 2.17.0
> 
> 
> 
> > 
> > Cheers, Daniel
> > > 
> > > 
> > > > 
> > > > Cheers, Daniel
> > > > > 
> > > > > ---
> > > > >    drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
> > > > >    1 file changed, 6 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > index af915d041281..ad949cc30928 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
> > > > > drm_i915_private *dev_priv,
> > > > >          if (!drm_mm_initialized(&dev_priv->mm.stolen))
> > > > >                  return -ENODEV;
> > > > >    +     /* WaSkipStolenMemoryFirstPage:bdw+ */
> > > > > +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
> > > > > +               start = 4096;
> > > > > +
> > > > >          mutex_lock(&dev_priv->mm.stolen_lock);
> > > > >          ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
> > > > >                                            size, alignment, 0,
> > > > > @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
> > > > > *dev_priv)
> > > > >    {
> > > > >          resource_size_t reserved_base, stolen_top;
> > > > >          resource_size_t reserved_total, reserved_size;
> > > > > -       resource_size_t stolen_usable_start;
> > > > >          mutex_init(&dev_priv->mm.stolen_lock);
> > > > >    @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private
> > > > > *dev_priv)
> > > > >                           (u64)resource_size(&dev_priv->dsm) >> 10,
> > > > >                           ((u64)resource_size(&dev_priv->dsm) -
> > > > > reserved_total) >> 10);
> > > > >    -     stolen_usable_start = 0;
> > > > > -       /* WaSkipStolenMemoryFirstPage:bdw+ */
> > > > > -       if (INTEL_GEN(dev_priv) >= 8)
> > > > > -               stolen_usable_start = 4096;
> > > > > -
> > > > >          dev_priv->stolen_usable_size =
> > > > > -               resource_size(&dev_priv->dsm) - reserved_total -
> > > > > stolen_usable_start;
> > > > > +               resource_size(&dev_priv->dsm) - reserved_total;
> > > > >          /* Basic memrange allocator for stolen space. */
> > > > > -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
> > > > > -                   dev_priv->stolen_usable_size);
> > > > > +       drm_mm_init(&dev_priv->mm.stolen, 0,
> > > > > dev_priv->stolen_usable_size);
> > > > >          return 0;
> > > > >    }
> > > > > --
> > > > > 2.17.0.rc2
> > > > > 
> > > > > _______________________________________________
> > > > > Intel-gfx mailing list
> > > > > Intel-gfx@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > > > 
> > > > 
> > > 
> > 
> > 
> >
Hans de Goede April 20, 2018, 10:19 a.m. UTC | #2
Hi,

On 20-04-18 09:27, Daniel Vetter wrote:
> On Thu, Apr 19, 2018 at 03:43:19PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 05-04-18 15:26, Daniel Vetter wrote:
>>> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>> Hi,
>>>>
>>>>
>>>> On 05-04-18 09:14, Daniel Vetter wrote:
>>>>>
>>>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>>>>>>
>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>>>> skipping the first 4k by passing 4096 as start of the address range
>>>>>> passed
>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
>>>>>> and
>>>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>>>> fail, as the firmware framebuffer starts at address 0.
>>>>>>
>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>>> everything >= gen8") says in its commit message: "This is confirmed to
>>>>>> fix
>>>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>>>> this theory)."
>>>>>>
>>>>>> Which suggests that it is safe to use the first page for a linear
>>>>>> framebuffer as the firmware is doing.
>>>>>>
>>>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>>>
>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>>>> use the first 4k.
>>>>>>
>>>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>>>> (on systems without fbcon).
>>>>>>
>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>>
>>>>>
>>>>> I think this is worth a shot. The only explanation I can think of why the
>>>>> GOP could get away with this and still follow the w/a is if it doesn't
>>>>> have a 1:1 mapping between GGTT and stolen.
>>>>
>>>>
>>>> My guess is that the GOP does not care about the w/a because the w/a
>>>> states that the command-streamers sometimes do a spurious flush (write)
>>>> to the first page, and the command-streamers are not used until much
>>>> later, so the GOP is probably ignoring the w/a all together.
>>>>
>>>> At least that is my theory.
>>>
>>> Atm we do not know whether the GOP actually implements the wa or not.
>>> That it doesn't care is just a conjecture, but we have no proof. On
>>> previous platforms the 1:1 mapping did hold, but there's no
>>> fundamental reason why it sitll does.
>>
>> Right.
>>
>>>>> Atm we hardcode that
>>>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
>>>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
>>>>> really). And since we're not re-writing the ptes it's not noticeable.
>>>>>
>>>>> I think to decide whether this is the right approach we should
>>>>> double-check whether that 1:1 assumption really holds true: Either read
>>>>> back the ggtt ptes and check their addresses (but iirc on some platforms
>>>>> their write-only, readback doesn't work), or we also rewrite the ptes
>>>>> again for preallocated stuff, like when binding a normal object into the
>>>>> gtt. If either of these approaches confirms that those affected gen8+
>>>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
>>>>> patch. If not, well then we at least know what to fix: We need to read out
>>>>> the real stolen_offset, instead of making assumptions.
>>>>
>>>>
>>>> I'm happy to give this a try on one or more affected machines, I can e.g.
>>>> try this on both a skylake desktop and a cherry-trail based tablet.
>>>>
>>>> But I'm clueless about the whole PTE stuff, so I'm going to need someone
>>>> to provide me with a patch following either approach. If readback of the
>>>> PTEs works on skylake / cherrytrail I guess that would be the best way.
>>>>
>>>> Note to test this I'm currently booting the kernel with the machine's
>>>> UEFI vendor logo still being displayed when efifb kicks in. I've added:
>>>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
>>>> VC0 / VC1, so that the framebuffer contents stays intact, combined with
>>>> the patch we are discussing now, this makes the vendor logo stay on
>>>> the screen all the way till GDM loads, which looks rather nice actually :)
>>>>
>>>> And on shutdown we fall back to the original framebuffer and the vendor
>>>> logo is back again. I cannot see any corruption in the display at either
>>>> boot or shutdown time.
>>>
>>> It shouldn't matter whether efifb takes over or not, it's still the
>>> GOP's framebuffer we're taking over. Different content for sure, logo
>>> is gone, but we only care about which pages we're using.
>>>
>>> Wrt the code:
>>> - (Re)binding happens by calling i915_vma_bind, if you put a call to
>>> that at the end of the stolen_preallocated functions you should get
>>> that. Ofc needs the logo still there so you can see it jump/get
>>> mangled.
>>> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
>>> or gen8_ggtt_insert_entries (which takes the vma). Since we only care
>>> about the first entry a readq(dev_priv->ggtt->gsm) is all we really
>>> need. If it's all 0 then it's not working (since at least
>>> _PAGE_PRESENT should be set, plus the physical address of a page in
>>> stolen memory).
>>>
>>> I can do some patches for each of those, but a bit swamped. Also no
>>> gen8 handy for testing I think to make sure it works, so would take a
>>> few weeks probably.
>>
>> I was a bit swamped with other stuff myself too, but I'm back to working
>> on this now. Below is the debug patch I came up with based on what you
>> wrote above.
>>
>> This results in the following debug output:
>>
>> [    1.651141] first ggtt entry before bind: 0x0000000078c00001
>> [    1.651151] i915_vma_bind ret 0
>> [    1.651152] first ggtt entry after bind: 0x0000000078c00083
>>
>> And "sudo cat /proc/iomem | grep Stolen" gives:
>>    78c00000-88bfffff : Graphics Stolen Memory
>>
>> I see no visual changes with this patch (BIOS vendor logo still
>> stays in place when using fbcon=vc:2-62 to make fbcon not bind
>> on boot and keep the vendor logo intact).
>>
>> The address of the first ggtt entry matches with the start of stolen
>> and the i915_vma_bind call only changes the first gtt entry's flags,
>> or-ing in _PAGE_RW (BIT(1)) and PPAT_CACHED (BIT(7)), which perfectly
>> matches what we would expect based on gen8_pte_encode()'s behavior.
>>
>> So it seems that the GOP indeed does NOT implement the wa and the i915's
>> code assuming a linear mapping at the start of stolen for the BIOS fb
>> still holds true.
>>
>> The debug output is from my Skylake based machine which uses an
>> Asrock B150M Pro4S/D3 mainboard.
>>
>> I've also tested this on a Cherry Trail based device (a GPD Win)
>> with identical results (the flags are 0x1b after the vma_bind
>> on CHT, which matches with I915_CACHE_NONE).
> 
> tbh, not what I expected, so good we've done the experiment to confirm :-)
> 
> With the above information added to the commit message (you can just paste
> your entire message, including test diff to the commit message) the
> original patch has my:
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Thanks, I've just send out a v2 with the amended commit message
and your r-b added. Once that has passed the CI I will push
it to dinq.

Regards,

Hans




> 
> Thanks for doing all the digging here.
> 
> Cheers, Daniel
> 
>>
>> Regards,
>>
>> Hans
>>
>>
>>  From 400f2b0d722236999a5706a1f2fa0bb70bf9c73f Mon Sep 17 00:00:00 2001
>> From: Hans de Goede <hdegoede@redhat.com>
>> Date: Thu, 19 Apr 2018 14:44:43 +0200
>> Subject: [PATCH resend] i915: i915_vma_bind for pre-allocated stolen test /
>>   debug
>>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> index 152487d91197..7db03f7d2c51 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> @@ -648,6 +648,13 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
>>   	obj->bind_count++;
>>   	spin_unlock(&dev_priv->mm.obj_lock);
>>
>> +	pr_err("first ggtt entry before bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
>> +	ret = i915_vma_bind(vma,
>> +		    HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE,
>> +		    PIN_UPDATE);
>> +	pr_err("i915_vma_bind ret %d\n", ret);
>> +	pr_err("first ggtt entry after bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
>> +
>>   	return obj;
>>
>>   err_pages:
>> -- 
>> 2.17.0
>>
>>
>>
>>>
>>> Cheers, Daniel
>>>>
>>>>
>>>>>
>>>>> Cheers, Daniel
>>>>>>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>>>>>>     1 file changed, 6 insertions(+), 9 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> index af915d041281..ad949cc30928 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
>>>>>> drm_i915_private *dev_priv,
>>>>>>           if (!drm_mm_initialized(&dev_priv->mm.stolen))
>>>>>>                   return -ENODEV;
>>>>>>     +     /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>>>> +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
>>>>>> +               start = 4096;
>>>>>> +
>>>>>>           mutex_lock(&dev_priv->mm.stolen_lock);
>>>>>>           ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>>>>>>                                             size, alignment, 0,
>>>>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>>>> *dev_priv)
>>>>>>     {
>>>>>>           resource_size_t reserved_base, stolen_top;
>>>>>>           resource_size_t reserved_total, reserved_size;
>>>>>> -       resource_size_t stolen_usable_start;
>>>>>>           mutex_init(&dev_priv->mm.stolen_lock);
>>>>>>     @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>>>> *dev_priv)
>>>>>>                            (u64)resource_size(&dev_priv->dsm) >> 10,
>>>>>>                            ((u64)resource_size(&dev_priv->dsm) -
>>>>>> reserved_total) >> 10);
>>>>>>     -     stolen_usable_start = 0;
>>>>>> -       /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>>>> -       if (INTEL_GEN(dev_priv) >= 8)
>>>>>> -               stolen_usable_start = 4096;
>>>>>> -
>>>>>>           dev_priv->stolen_usable_size =
>>>>>> -               resource_size(&dev_priv->dsm) - reserved_total -
>>>>>> stolen_usable_start;
>>>>>> +               resource_size(&dev_priv->dsm) - reserved_total;
>>>>>>           /* Basic memrange allocator for stolen space. */
>>>>>> -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
>>>>>> -                   dev_priv->stolen_usable_size);
>>>>>> +       drm_mm_init(&dev_priv->mm.stolen, 0,
>>>>>> dev_priv->stolen_usable_size);
>>>>>>           return 0;
>>>>>>     }
>>>>>> --
>>>>>> 2.17.0.rc2
>>>>>>
>>>>>> _______________________________________________
>>>>>> Intel-gfx mailing list
>>>>>> Intel-gfx@lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 152487d91197..7db03f7d2c51 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -648,6 +648,13 @@  i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
  	obj->bind_count++;
  	spin_unlock(&dev_priv->mm.obj_lock);

+	pr_err("first ggtt entry before bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
+	ret = i915_vma_bind(vma,
+		    HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE,
+		    PIN_UPDATE);
+	pr_err("i915_vma_bind ret %d\n", ret);
+	pr_err("first ggtt entry after bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
+
  	return obj;

  err_pages: