Message ID | 20180330122715.18209-1-hdegoede@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, On 30-03-18 14:44, Chris Wilson wrote: > Quoting Hans de Goede (2018-03-30 13:37:40) >> Hi, >> >> On 30-03-18 14:30, Chris Wilson wrote: >>> Quoting Hans de Goede (2018-03-30 13:27:15) >>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>> skipping the first 4k by passing 4096 as start of the address range passed >>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and >>>> reserve the firmware framebuffer so that we can inherit it would always >>>> fail, as the firmware framebuffer starts at address 0. >>>> >>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>> everything >= gen8") says in its commit message: "This is confirmed to fix >>>> Skylake screen flickering issues (probably caused by the fact that we >>>> initialized a ring in the first page of stolen, but I didn't 100% confirm >>>> this theory)." >>>> >>>> Which suggests that it is safe to use the first page for a linear >>>> framebuffer as the firmware is doing. >>>> >>>> This commit always passes 0 as start to drm_mm_init() and works around >>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>> by insuring the start address passed by to drm_mm_insert_node_in_range() >>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>> objects such as ring-buffers will not be allocated in the first 4k. >>>> >>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>> which directly calls drm_mm_reserve_node() which now will be able to >>>> use the first 4k. >>>> >>>> This fixes the i915 driver no longer being able to inherit the firmware >>>> framebuffer on gen8+, which fixes the video output changing from the >>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>> (on systems without fbcon). >>> >>> We've been told by the HW guys not to use the first page. (That's my >>> understanding from every time this gets questioned.) >> >> Yet the GOP is happily using the first page. I think we may need to make >> a difference here between the GPU not using the first page and the >> display engine/pipeline not using the first page. Note that my patch >> only influences the inheriting of the initial framebuffer as allocated >> by the GOP. It does not influence any other allocations from the >> reserved range, those will still all avoid the first page. >> >> Without this patch fastboot / flickerfree support is essentially broken >> on any gen8+ hardware given that one of the goals of atomic is to be >> able to do flickerfree transitions I think that this warrants a closer >> look then just simply saying never use the first page. > > The concern is what else (i.e. nothing that we allocated ourselves) that > may be in the first page... Given that the GOP has put its framebuffer there at least at boot there is nothing there, otherwise it would show up on the display. We have a whole bunch of code to inherit the BIOS fb in intel_display.c and AFAIK that code is there because this inheriting the BIOS fb is deemed an important feature. So I'm not happy at all with the handwavy best to not touch it answer I'm getting to this patch. Unless there are some clear answer from the hardware folks which specifically say we cannot put a framebuffer there (and then why is the GOP doing it?) then I believe that the best way forward here is to get various people to test with this patch and the best way to do that is probably to put it in next. Note I deliberately did not add a Cc stable. Regards, Hans
Hi, On 30-03-18 15:25, Hans de Goede wrote: > Hi, > > On 30-03-18 14:44, Chris Wilson wrote: >> Quoting Hans de Goede (2018-03-30 13:37:40) >>> Hi, >>> >>> On 30-03-18 14:30, Chris Wilson wrote: >>>> Quoting Hans de Goede (2018-03-30 13:27:15) >>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>>> skipping the first 4k by passing 4096 as start of the address range passed >>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and >>>>> reserve the firmware framebuffer so that we can inherit it would always >>>>> fail, as the firmware framebuffer starts at address 0. >>>>> >>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>>> everything >= gen8") says in its commit message: "This is confirmed to fix >>>>> Skylake screen flickering issues (probably caused by the fact that we >>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm >>>>> this theory)." >>>>> >>>>> Which suggests that it is safe to use the first page for a linear >>>>> framebuffer as the firmware is doing. >>>>> >>>>> This commit always passes 0 as start to drm_mm_init() and works around >>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() >>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>>> objects such as ring-buffers will not be allocated in the first 4k. >>>>> >>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>>> which directly calls drm_mm_reserve_node() which now will be able to >>>>> use the first 4k. >>>>> >>>>> This fixes the i915 driver no longer being able to inherit the firmware >>>>> framebuffer on gen8+, which fixes the video output changing from the >>>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>>> (on systems without fbcon). >>>> >>>> We've been told by the HW guys not to use the first page. (That's my >>>> understanding from every time this gets questioned.) >>> >>> Yet the GOP is happily using the first page. I think we may need to make >>> a difference here between the GPU not using the first page and the >>> display engine/pipeline not using the first page. Note that my patch >>> only influences the inheriting of the initial framebuffer as allocated >>> by the GOP. It does not influence any other allocations from the >>> reserved range, those will still all avoid the first page. >>> >>> Without this patch fastboot / flickerfree support is essentially broken >>> on any gen8+ hardware given that one of the goals of atomic is to be >>> able to do flickerfree transitions I think that this warrants a closer >>> look then just simply saying never use the first page. >> >> The concern is what else (i.e. nothing that we allocated ourselves) that >> may be in the first page... > > Given that the GOP has put its framebuffer there at least at boot there > is nothing there, otherwise it would show up on the display. > > We have a whole bunch of code to inherit the BIOS fb in intel_display.c > and AFAIK that code is there because this inheriting the BIOS fb is > deemed an important feature. So I'm not happy at all with the handwavy > best to not touch it answer I'm getting to this patch. > > Unless there are some clear answer from the hardware folks which specifically > say we cannot put a framebuffer there (and then why is the GOP doing it?) > then I believe that the best way forward here is to get various people to > test with this patch and the best way to do that is probably to put it > in next. Note I deliberately did not add a Cc stable. To elaborate on this, the excluding of the first 4k of the stolen memory region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, which in turn causes intel_find_initial_plane_obj() to call intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily completely turns off the display which leads to a very ugly flickering of the display at boot (as well as replacing the vendor logo with a black screen). I think we can all agree that this behavior is undesirable and even a regression in behavior caused by the fix to exclude the first 4k. Chris, if my patch is not an acceptable way to fix this, then how do you suggest that we fix this? Digging a bit deeper I found this: https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf Which says: "WaSkipStolenMemoryFirstPage: WA to skip the first page of stolen memory due to sporadic HW write on *CS Idle" And also about FBC: "First line of FBC getting corrupted when FBC compressed frame buffer offset is programmed to zero. Command streamers are doing flush writes to base of stolen. WA: New restriction to program FBC compressed frame buffer offset to at least 4KB." So using the first 4kB for the *framebuffer* as done by the GOP will not cause any major problems (freezes, hangs, etc.), and commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on everything >= gen8") was correct in deducing that the problem was likely that some *vital* information was being stored i the first 4k and that go overwritten. But the contents of the (first lines of) the framebuffer may become corrupted once we actually start using the command-streamers, which is still very much not wanted. In practice Xorg or Wayland will likely have setup another framebuffer by the time the command-streamers will start to get used. Alternatively we could start with inheriting the BIOS framebuffer (as my patch allows) so that we don't get the flicker and then soon afterwards atomically transit to a new framebuffer (which should contain a copy of the BIOS fb contents) at a different location. I think it might be worthwhile to first just try my patch and then if we see or receive reports of problems with the fb becoming corrupt because in some cases it ends up being used for longer then expected, we can do the atomic transition to a new fb thing. Regards, Hans p.s. Fi.CI.BAT and Fi.CI.IGT have run successfully for this patch, I realize that this does not mean that much, but it is an indication that it is not completely broken.
On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: > Hi, > > On 30-03-18 15:25, Hans de Goede wrote: > > Hi, > > > > On 30-03-18 14:44, Chris Wilson wrote: > >> Quoting Hans de Goede (2018-03-30 13:37:40) > >>> Hi, > >>> > >>> On 30-03-18 14:30, Chris Wilson wrote: > >>>> Quoting Hans de Goede (2018-03-30 13:27:15) > >>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was > >>>>> skipping the first 4k by passing 4096 as start of the address range passed > >>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and > >>>>> reserve the firmware framebuffer so that we can inherit it would always > >>>>> fail, as the firmware framebuffer starts at address 0. > >>>>> > >>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on > >>>>> everything >= gen8") says in its commit message: "This is confirmed to fix > >>>>> Skylake screen flickering issues (probably caused by the fact that we > >>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm > >>>>> this theory)." > >>>>> > >>>>> Which suggests that it is safe to use the first page for a linear > >>>>> framebuffer as the firmware is doing. > >>>>> > >>>>> This commit always passes 0 as start to drm_mm_init() and works around > >>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() > >>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() > >>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through > >>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated > >>>>> objects such as ring-buffers will not be allocated in the first 4k. > >>>>> > >>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() > >>>>> which directly calls drm_mm_reserve_node() which now will be able to > >>>>> use the first 4k. > >>>>> > >>>>> This fixes the i915 driver no longer being able to inherit the firmware > >>>>> framebuffer on gen8+, which fixes the video output changing from the > >>>>> vendor logo to a black screen as soon as the i915 driver is loaded > >>>>> (on systems without fbcon). > >>>> > >>>> We've been told by the HW guys not to use the first page. (That's my > >>>> understanding from every time this gets questioned.) > >>> > >>> Yet the GOP is happily using the first page. I think we may need to make > >>> a difference here between the GPU not using the first page and the > >>> display engine/pipeline not using the first page. Note that my patch > >>> only influences the inheriting of the initial framebuffer as allocated > >>> by the GOP. It does not influence any other allocations from the > >>> reserved range, those will still all avoid the first page. > >>> > >>> Without this patch fastboot / flickerfree support is essentially broken > >>> on any gen8+ hardware given that one of the goals of atomic is to be > >>> able to do flickerfree transitions I think that this warrants a closer > >>> look then just simply saying never use the first page. > >> > >> The concern is what else (i.e. nothing that we allocated ourselves) that > >> may be in the first page... > > > > Given that the GOP has put its framebuffer there at least at boot there > > is nothing there, otherwise it would show up on the display. > > > > We have a whole bunch of code to inherit the BIOS fb in intel_display.c > > and AFAIK that code is there because this inheriting the BIOS fb is > > deemed an important feature. So I'm not happy at all with the handwavy > > best to not touch it answer I'm getting to this patch. > > > > Unless there are some clear answer from the hardware folks which specifically > > say we cannot put a framebuffer there (and then why is the GOP doing it?) > > then I believe that the best way forward here is to get various people to > > test with this patch and the best way to do that is probably to put it > > in next. Note I deliberately did not add a Cc stable. > > To elaborate on this, the excluding of the first 4k of the stolen memory > region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, > which in turn causes intel_find_initial_plane_obj() to call > intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily > completely turns off the display which leads to a very ugly flickering > of the display at boot (as well as replacing the vendor logo with a > black screen). > > I think we can all agree that this behavior is undesirable and even a > regression in behavior caused by the fix to exclude the first 4k. > > Chris, if my patch is not an acceptable way to fix this, then how do you > suggest that we fix this? > > Digging a bit deeper I found this: > > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf > > Which says: > > "WaSkipStolenMemoryFirstPage: > > WA to skip the first page of stolen > memory due to sporadic HW write on *CS Idle" > > And also about FBC: > > "First line of FBC getting corrupted when FBC > compressed frame buffer offset is programmed to > zero. Command streamers are doing flush writes to > base of stolen. > WA: New restriction to program FBC compressed > frame buffer offset to at least 4KB." > > So using the first 4kB for the *framebuffer* as done by the GOP will > not cause any major problems (freezes, hangs, etc.), and commit > d43537610470 ("drm/i915: skip the first 4k of stolen memory on > everything >= gen8") was correct in deducing that the problem was > likely that some *vital* information was being stored i the first 4k > and that go overwritten. > > But the contents of the (first lines of) the framebuffer may become > corrupted once we actually start using the command-streamers, which > is still very much not wanted. > > In practice Xorg or Wayland will likely have setup another framebuffer > by the time the command-streamers will start to get used. > > Alternatively we could start with inheriting the BIOS framebuffer > (as my patch allows) so that we don't get the flicker and then soon > afterwards atomically transit to a new framebuffer (which should > contain a copy of the BIOS fb contents) at a different location. What I suggested long ago was to copy just the first page and adjust the sg list. But I'm not sure if our stolen gem code would be happy with an sg list with two entries instead of one.
On Wed, Apr 04, 2018 at 06:50:16PM +0300, Ville Syrjälä wrote: > On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: > > Hi, > > > > On 30-03-18 15:25, Hans de Goede wrote: > > > Hi, > > > > > > On 30-03-18 14:44, Chris Wilson wrote: > > >> Quoting Hans de Goede (2018-03-30 13:37:40) > > >>> Hi, > > >>> > > >>> On 30-03-18 14:30, Chris Wilson wrote: > > >>>> Quoting Hans de Goede (2018-03-30 13:27:15) > > >>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was > > >>>>> skipping the first 4k by passing 4096 as start of the address range passed > > >>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and > > >>>>> reserve the firmware framebuffer so that we can inherit it would always > > >>>>> fail, as the firmware framebuffer starts at address 0. > > >>>>> > > >>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on > > >>>>> everything >= gen8") says in its commit message: "This is confirmed to fix > > >>>>> Skylake screen flickering issues (probably caused by the fact that we > > >>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm > > >>>>> this theory)." > > >>>>> > > >>>>> Which suggests that it is safe to use the first page for a linear > > >>>>> framebuffer as the firmware is doing. > > >>>>> > > >>>>> This commit always passes 0 as start to drm_mm_init() and works around > > >>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() > > >>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() > > >>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through > > >>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated > > >>>>> objects such as ring-buffers will not be allocated in the first 4k. > > >>>>> > > >>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() > > >>>>> which directly calls drm_mm_reserve_node() which now will be able to > > >>>>> use the first 4k. > > >>>>> > > >>>>> This fixes the i915 driver no longer being able to inherit the firmware > > >>>>> framebuffer on gen8+, which fixes the video output changing from the > > >>>>> vendor logo to a black screen as soon as the i915 driver is loaded > > >>>>> (on systems without fbcon). > > >>>> > > >>>> We've been told by the HW guys not to use the first page. (That's my > > >>>> understanding from every time this gets questioned.) > > >>> > > >>> Yet the GOP is happily using the first page. I think we may need to make > > >>> a difference here between the GPU not using the first page and the > > >>> display engine/pipeline not using the first page. Note that my patch > > >>> only influences the inheriting of the initial framebuffer as allocated > > >>> by the GOP. It does not influence any other allocations from the > > >>> reserved range, those will still all avoid the first page. > > >>> > > >>> Without this patch fastboot / flickerfree support is essentially broken > > >>> on any gen8+ hardware given that one of the goals of atomic is to be > > >>> able to do flickerfree transitions I think that this warrants a closer > > >>> look then just simply saying never use the first page. > > >> > > >> The concern is what else (i.e. nothing that we allocated ourselves) that > > >> may be in the first page... > > > > > > Given that the GOP has put its framebuffer there at least at boot there > > > is nothing there, otherwise it would show up on the display. > > > > > > We have a whole bunch of code to inherit the BIOS fb in intel_display.c > > > and AFAIK that code is there because this inheriting the BIOS fb is > > > deemed an important feature. So I'm not happy at all with the handwavy > > > best to not touch it answer I'm getting to this patch. > > > > > > Unless there are some clear answer from the hardware folks which specifically > > > say we cannot put a framebuffer there (and then why is the GOP doing it?) > > > then I believe that the best way forward here is to get various people to > > > test with this patch and the best way to do that is probably to put it > > > in next. Note I deliberately did not add a Cc stable. > > > > To elaborate on this, the excluding of the first 4k of the stolen memory > > region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, > > which in turn causes intel_find_initial_plane_obj() to call > > intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily > > completely turns off the display which leads to a very ugly flickering > > of the display at boot (as well as replacing the vendor logo with a > > black screen). > > > > I think we can all agree that this behavior is undesirable and even a > > regression in behavior caused by the fix to exclude the first 4k. > > > > Chris, if my patch is not an acceptable way to fix this, then how do you > > suggest that we fix this? > > > > Digging a bit deeper I found this: > > > > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf > > > > Which says: > > > > "WaSkipStolenMemoryFirstPage: > > > > WA to skip the first page of stolen > > memory due to sporadic HW write on *CS Idle" > > > > And also about FBC: > > > > "First line of FBC getting corrupted when FBC > > compressed frame buffer offset is programmed to > > zero. Command streamers are doing flush writes to > > base of stolen. > > WA: New restriction to program FBC compressed > > frame buffer offset to at least 4KB." > > > > So using the first 4kB for the *framebuffer* as done by the GOP will > > not cause any major problems (freezes, hangs, etc.), and commit > > d43537610470 ("drm/i915: skip the first 4k of stolen memory on > > everything >= gen8") was correct in deducing that the problem was > > likely that some *vital* information was being stored i the first 4k > > and that go overwritten. > > > > But the contents of the (first lines of) the framebuffer may become > > corrupted once we actually start using the command-streamers, which > > is still very much not wanted. > > > > In practice Xorg or Wayland will likely have setup another framebuffer > > by the time the command-streamers will start to get used. > > > > Alternatively we could start with inheriting the BIOS framebuffer > > (as my patch allows) so that we don't get the flicker and then soon > > afterwards atomically transit to a new framebuffer (which should > > contain a copy of the BIOS fb contents) at a different location. > > What I suggested long ago was to copy just the first page and adjust the > sg list. But I'm not sure if our stolen gem code would be happy with an > sg list with two entries instead of one. Oh I vaguely remember the issue and this suggestion. But looking to Hans code and explanation I believe that makes sense, although I understand the fear of regressions :/ > > -- > Ville Syrjälä > Intel OTC
Hi, On 04-04-18 17:50, Ville Syrjälä wrote: > On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: >> Hi, >> >> On 30-03-18 15:25, Hans de Goede wrote: >>> Hi, >>> >>> On 30-03-18 14:44, Chris Wilson wrote: >>>> Quoting Hans de Goede (2018-03-30 13:37:40) >>>>> Hi, >>>>> >>>>> On 30-03-18 14:30, Chris Wilson wrote: >>>>>> Quoting Hans de Goede (2018-03-30 13:27:15) >>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>>>>> skipping the first 4k by passing 4096 as start of the address range passed >>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and >>>>>>> reserve the firmware framebuffer so that we can inherit it would always >>>>>>> fail, as the firmware framebuffer starts at address 0. >>>>>>> >>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix >>>>>>> Skylake screen flickering issues (probably caused by the fact that we >>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm >>>>>>> this theory)." >>>>>>> >>>>>>> Which suggests that it is safe to use the first page for a linear >>>>>>> framebuffer as the firmware is doing. >>>>>>> >>>>>>> This commit always passes 0 as start to drm_mm_init() and works around >>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() >>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>>>>> objects such as ring-buffers will not be allocated in the first 4k. >>>>>>> >>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>>>>> which directly calls drm_mm_reserve_node() which now will be able to >>>>>>> use the first 4k. >>>>>>> >>>>>>> This fixes the i915 driver no longer being able to inherit the firmware >>>>>>> framebuffer on gen8+, which fixes the video output changing from the >>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>>>>> (on systems without fbcon). >>>>>> >>>>>> We've been told by the HW guys not to use the first page. (That's my >>>>>> understanding from every time this gets questioned.) >>>>> >>>>> Yet the GOP is happily using the first page. I think we may need to make >>>>> a difference here between the GPU not using the first page and the >>>>> display engine/pipeline not using the first page. Note that my patch >>>>> only influences the inheriting of the initial framebuffer as allocated >>>>> by the GOP. It does not influence any other allocations from the >>>>> reserved range, those will still all avoid the first page. >>>>> >>>>> Without this patch fastboot / flickerfree support is essentially broken >>>>> on any gen8+ hardware given that one of the goals of atomic is to be >>>>> able to do flickerfree transitions I think that this warrants a closer >>>>> look then just simply saying never use the first page. >>>> >>>> The concern is what else (i.e. nothing that we allocated ourselves) that >>>> may be in the first page... >>> >>> Given that the GOP has put its framebuffer there at least at boot there >>> is nothing there, otherwise it would show up on the display. >>> >>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c >>> and AFAIK that code is there because this inheriting the BIOS fb is >>> deemed an important feature. So I'm not happy at all with the handwavy >>> best to not touch it answer I'm getting to this patch. >>> >>> Unless there are some clear answer from the hardware folks which specifically >>> say we cannot put a framebuffer there (and then why is the GOP doing it?) >>> then I believe that the best way forward here is to get various people to >>> test with this patch and the best way to do that is probably to put it >>> in next. Note I deliberately did not add a Cc stable. >> >> To elaborate on this, the excluding of the first 4k of the stolen memory >> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, >> which in turn causes intel_find_initial_plane_obj() to call >> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily >> completely turns off the display which leads to a very ugly flickering >> of the display at boot (as well as replacing the vendor logo with a >> black screen). >> >> I think we can all agree that this behavior is undesirable and even a >> regression in behavior caused by the fix to exclude the first 4k. >> >> Chris, if my patch is not an acceptable way to fix this, then how do you >> suggest that we fix this? >> >> Digging a bit deeper I found this: >> >> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf >> >> Which says: >> >> "WaSkipStolenMemoryFirstPage: >> >> WA to skip the first page of stolen >> memory due to sporadic HW write on *CS Idle" >> >> And also about FBC: >> >> "First line of FBC getting corrupted when FBC >> compressed frame buffer offset is programmed to >> zero. Command streamers are doing flush writes to >> base of stolen. >> WA: New restriction to program FBC compressed >> frame buffer offset to at least 4KB." >> >> So using the first 4kB for the *framebuffer* as done by the GOP will >> not cause any major problems (freezes, hangs, etc.), and commit >> d43537610470 ("drm/i915: skip the first 4k of stolen memory on >> everything >= gen8") was correct in deducing that the problem was >> likely that some *vital* information was being stored i the first 4k >> and that go overwritten. >> >> But the contents of the (first lines of) the framebuffer may become >> corrupted once we actually start using the command-streamers, which >> is still very much not wanted. >> >> In practice Xorg or Wayland will likely have setup another framebuffer >> by the time the command-streamers will start to get used. >> >> Alternatively we could start with inheriting the BIOS framebuffer >> (as my patch allows) so that we don't get the flicker and then soon >> afterwards atomically transit to a new framebuffer (which should >> contain a copy of the BIOS fb contents) at a different location. > > What I suggested long ago was to copy just the first page and adjust the > sg list. But I'm not sure if our stolen gem code would be happy with an > sg list with two entries instead of one. But that would still require an atomic-modeset to install the new sg-list, right? Then we might just as well simply alloc a new fb and copy the contents over, or are you worried that with say a 4k fb that takes too much time? FWIW I can see how the single memcpy this involves will take some time, but I don't take it will take so long as to be a problem. Anyways I could use some help with implementing either solution as I'm not familiar with the involved parts of the code. I will happily test a patch for this. Keep in mind that for this to work my original patch will also be necessary so that the initial takeover of the firmware fb will work. Regards, Hans
On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote: > Hi, > > On 04-04-18 17:50, Ville Syrjälä wrote: > > On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: > >> Hi, > >> > >> On 30-03-18 15:25, Hans de Goede wrote: > >>> Hi, > >>> > >>> On 30-03-18 14:44, Chris Wilson wrote: > >>>> Quoting Hans de Goede (2018-03-30 13:37:40) > >>>>> Hi, > >>>>> > >>>>> On 30-03-18 14:30, Chris Wilson wrote: > >>>>>> Quoting Hans de Goede (2018-03-30 13:27:15) > >>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was > >>>>>>> skipping the first 4k by passing 4096 as start of the address range passed > >>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and > >>>>>>> reserve the firmware framebuffer so that we can inherit it would always > >>>>>>> fail, as the firmware framebuffer starts at address 0. > >>>>>>> > >>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on > >>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix > >>>>>>> Skylake screen flickering issues (probably caused by the fact that we > >>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm > >>>>>>> this theory)." > >>>>>>> > >>>>>>> Which suggests that it is safe to use the first page for a linear > >>>>>>> framebuffer as the firmware is doing. > >>>>>>> > >>>>>>> This commit always passes 0 as start to drm_mm_init() and works around > >>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() > >>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() > >>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through > >>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated > >>>>>>> objects such as ring-buffers will not be allocated in the first 4k. > >>>>>>> > >>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() > >>>>>>> which directly calls drm_mm_reserve_node() which now will be able to > >>>>>>> use the first 4k. > >>>>>>> > >>>>>>> This fixes the i915 driver no longer being able to inherit the firmware > >>>>>>> framebuffer on gen8+, which fixes the video output changing from the > >>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded > >>>>>>> (on systems without fbcon). > >>>>>> > >>>>>> We've been told by the HW guys not to use the first page. (That's my > >>>>>> understanding from every time this gets questioned.) > >>>>> > >>>>> Yet the GOP is happily using the first page. I think we may need to make > >>>>> a difference here between the GPU not using the first page and the > >>>>> display engine/pipeline not using the first page. Note that my patch > >>>>> only influences the inheriting of the initial framebuffer as allocated > >>>>> by the GOP. It does not influence any other allocations from the > >>>>> reserved range, those will still all avoid the first page. > >>>>> > >>>>> Without this patch fastboot / flickerfree support is essentially broken > >>>>> on any gen8+ hardware given that one of the goals of atomic is to be > >>>>> able to do flickerfree transitions I think that this warrants a closer > >>>>> look then just simply saying never use the first page. > >>>> > >>>> The concern is what else (i.e. nothing that we allocated ourselves) that > >>>> may be in the first page... > >>> > >>> Given that the GOP has put its framebuffer there at least at boot there > >>> is nothing there, otherwise it would show up on the display. > >>> > >>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c > >>> and AFAIK that code is there because this inheriting the BIOS fb is > >>> deemed an important feature. So I'm not happy at all with the handwavy > >>> best to not touch it answer I'm getting to this patch. > >>> > >>> Unless there are some clear answer from the hardware folks which specifically > >>> say we cannot put a framebuffer there (and then why is the GOP doing it?) > >>> then I believe that the best way forward here is to get various people to > >>> test with this patch and the best way to do that is probably to put it > >>> in next. Note I deliberately did not add a Cc stable. > >> > >> To elaborate on this, the excluding of the first 4k of the stolen memory > >> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, > >> which in turn causes intel_find_initial_plane_obj() to call > >> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily > >> completely turns off the display which leads to a very ugly flickering > >> of the display at boot (as well as replacing the vendor logo with a > >> black screen). > >> > >> I think we can all agree that this behavior is undesirable and even a > >> regression in behavior caused by the fix to exclude the first 4k. > >> > >> Chris, if my patch is not an acceptable way to fix this, then how do you > >> suggest that we fix this? > >> > >> Digging a bit deeper I found this: > >> > >> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf > >> > >> Which says: > >> > >> "WaSkipStolenMemoryFirstPage: > >> > >> WA to skip the first page of stolen > >> memory due to sporadic HW write on *CS Idle" > >> > >> And also about FBC: > >> > >> "First line of FBC getting corrupted when FBC > >> compressed frame buffer offset is programmed to > >> zero. Command streamers are doing flush writes to > >> base of stolen. > >> WA: New restriction to program FBC compressed > >> frame buffer offset to at least 4KB." > >> > >> So using the first 4kB for the *framebuffer* as done by the GOP will > >> not cause any major problems (freezes, hangs, etc.), and commit > >> d43537610470 ("drm/i915: skip the first 4k of stolen memory on > >> everything >= gen8") was correct in deducing that the problem was > >> likely that some *vital* information was being stored i the first 4k > >> and that go overwritten. > >> > >> But the contents of the (first lines of) the framebuffer may become > >> corrupted once we actually start using the command-streamers, which > >> is still very much not wanted. > >> > >> In practice Xorg or Wayland will likely have setup another framebuffer > >> by the time the command-streamers will start to get used. > >> > >> Alternatively we could start with inheriting the BIOS framebuffer > >> (as my patch allows) so that we don't get the flicker and then soon > >> afterwards atomically transit to a new framebuffer (which should > >> contain a copy of the BIOS fb contents) at a different location. > > > > What I suggested long ago was to copy just the first page and adjust the > > sg list. But I'm not sure if our stolen gem code would be happy with an > > sg list with two entries instead of one. > > But that would still require an atomic-modeset to install the new sg-list, > right? Perhaps not. Not sure if the pte update would be atomic enough to just change it underneath the display engine without ill effects, and then do the equivalent of a page flip to invalidate the TLBs. > Then we might just as well simply alloc a new fb and copy the > contents over, or are you worried that with say a 4k fb that takes too > much time? FWIW I can see how the single memcpy this involves will take > some time, but I don't take it will take so long as to be a problem. Mainly just a question of keeping it in stolen. Assuming we want to keep things in stolen, which is a matter of some debate as FBC needs stolen and people might not be happy if it's all taken up by fbdev. > > Anyways I could use some help with implementing either solution as I'm > not familiar with the involved parts of the code. I will happily test > a patch for this. Keep in mind that for this to work my original patch > will also be necessary so that the initial takeover of the firmware > fb will work. I guess the trickiest part would be getting both the old and new location of the page mapped in the ggtt at the same time. Sadly you're not allowed to access stolen directly. So I suppose this part would involve some fairly low level frobbing of the ggtt ptes and a manual ioremap() of the matching ranges of the aperture.
On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote: > Before this commit the WaSkipStolenMemoryFirstPage workaround code was > skipping the first 4k by passing 4096 as start of the address range passed > to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and > reserve the firmware framebuffer so that we can inherit it would always > fail, as the firmware framebuffer starts at address 0. > > Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on > everything >= gen8") says in its commit message: "This is confirmed to fix > Skylake screen flickering issues (probably caused by the fact that we > initialized a ring in the first page of stolen, but I didn't 100% confirm > this theory)." > > Which suggests that it is safe to use the first page for a linear > framebuffer as the firmware is doing. > > This commit always passes 0 as start to drm_mm_init() and works around > WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() > by insuring the start address passed by to drm_mm_insert_node_in_range() > is always 4k or more. All entry points to i915_gem_stolen.c go through > i915_gem_stolen_insert_node_in_range(), so that any newly allocated > objects such as ring-buffers will not be allocated in the first 4k. > > The one exception is i915_gem_object_create_stolen_for_preallocated() > which directly calls drm_mm_reserve_node() which now will be able to > use the first 4k. > > This fixes the i915 driver no longer being able to inherit the firmware > framebuffer on gen8+, which fixes the video output changing from the > vendor logo to a black screen as soon as the i915 driver is loaded > (on systems without fbcon). > > Signed-off-by: Hans de Goede <hdegoede@redhat.com> I think this is worth a shot. The only explanation I can think of why the GOP could get away with this and still follow the w/a is if it doesn't have a 1:1 mapping between GGTT and stolen. Atm we hardcode that assumption in intel_alloc_initial_plane_obj by passing the base_aligned as both the stolen_offset and the gtt_offset (but it's only the gtt_offset really). And since we're not re-writing the ptes it's not noticeable. I think to decide whether this is the right approach we should double-check whether that 1:1 assumption really holds true: Either read back the ggtt ptes and check their addresses (but iirc on some platforms their write-only, readback doesn't work), or we also rewrite the ptes again for preallocated stuff, like when binding a normal object into the gtt. If either of these approaches confirms that those affected gen8+ machines still use the 1:1 mapping, then I'm happy to put my r-b on this patch. If not, well then we at least know what to fix: We need to read out the real stolen_offset, instead of making assumptions. Cheers, Daniel > --- > drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++--------- > 1 file changed, 6 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > index af915d041281..ad949cc30928 100644 > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv, > if (!drm_mm_initialized(&dev_priv->mm.stolen)) > return -ENODEV; > > + /* WaSkipStolenMemoryFirstPage:bdw+ */ > + if (INTEL_GEN(dev_priv) >= 8 && start < 4096) > + start = 4096; > + > mutex_lock(&dev_priv->mm.stolen_lock); > ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node, > size, alignment, 0, > @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv) > { > resource_size_t reserved_base, stolen_top; > resource_size_t reserved_total, reserved_size; > - resource_size_t stolen_usable_start; > > mutex_init(&dev_priv->mm.stolen_lock); > > @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv) > (u64)resource_size(&dev_priv->dsm) >> 10, > ((u64)resource_size(&dev_priv->dsm) - reserved_total) >> 10); > > - stolen_usable_start = 0; > - /* WaSkipStolenMemoryFirstPage:bdw+ */ > - if (INTEL_GEN(dev_priv) >= 8) > - stolen_usable_start = 4096; > - > dev_priv->stolen_usable_size = > - resource_size(&dev_priv->dsm) - reserved_total - stolen_usable_start; > + resource_size(&dev_priv->dsm) - reserved_total; > > /* Basic memrange allocator for stolen space. */ > - drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start, > - dev_priv->stolen_usable_size); > + drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->stolen_usable_size); > > return 0; > } > -- > 2.17.0.rc2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Hi, On 04-04-18 22:49, Ville Syrjälä wrote: > On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote: >> Hi, >> >> On 04-04-18 17:50, Ville Syrjälä wrote: >>> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: >>>> Hi, >>>> >>>> On 30-03-18 15:25, Hans de Goede wrote: >>>>> Hi, >>>>> >>>>> On 30-03-18 14:44, Chris Wilson wrote: >>>>>> Quoting Hans de Goede (2018-03-30 13:37:40) >>>>>>> Hi, >>>>>>> >>>>>>> On 30-03-18 14:30, Chris Wilson wrote: >>>>>>>> Quoting Hans de Goede (2018-03-30 13:27:15) >>>>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>>>>>>> skipping the first 4k by passing 4096 as start of the address range passed >>>>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and >>>>>>>>> reserve the firmware framebuffer so that we can inherit it would always >>>>>>>>> fail, as the firmware framebuffer starts at address 0. >>>>>>>>> >>>>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix >>>>>>>>> Skylake screen flickering issues (probably caused by the fact that we >>>>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm >>>>>>>>> this theory)." >>>>>>>>> >>>>>>>>> Which suggests that it is safe to use the first page for a linear >>>>>>>>> framebuffer as the firmware is doing. >>>>>>>>> >>>>>>>>> This commit always passes 0 as start to drm_mm_init() and works around >>>>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() >>>>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>>>>>>> objects such as ring-buffers will not be allocated in the first 4k. >>>>>>>>> >>>>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>>>>>>> which directly calls drm_mm_reserve_node() which now will be able to >>>>>>>>> use the first 4k. >>>>>>>>> >>>>>>>>> This fixes the i915 driver no longer being able to inherit the firmware >>>>>>>>> framebuffer on gen8+, which fixes the video output changing from the >>>>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>>>>>>> (on systems without fbcon). >>>>>>>> >>>>>>>> We've been told by the HW guys not to use the first page. (That's my >>>>>>>> understanding from every time this gets questioned.) >>>>>>> >>>>>>> Yet the GOP is happily using the first page. I think we may need to make >>>>>>> a difference here between the GPU not using the first page and the >>>>>>> display engine/pipeline not using the first page. Note that my patch >>>>>>> only influences the inheriting of the initial framebuffer as allocated >>>>>>> by the GOP. It does not influence any other allocations from the >>>>>>> reserved range, those will still all avoid the first page. >>>>>>> >>>>>>> Without this patch fastboot / flickerfree support is essentially broken >>>>>>> on any gen8+ hardware given that one of the goals of atomic is to be >>>>>>> able to do flickerfree transitions I think that this warrants a closer >>>>>>> look then just simply saying never use the first page. >>>>>> >>>>>> The concern is what else (i.e. nothing that we allocated ourselves) that >>>>>> may be in the first page... >>>>> >>>>> Given that the GOP has put its framebuffer there at least at boot there >>>>> is nothing there, otherwise it would show up on the display. >>>>> >>>>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c >>>>> and AFAIK that code is there because this inheriting the BIOS fb is >>>>> deemed an important feature. So I'm not happy at all with the handwavy >>>>> best to not touch it answer I'm getting to this patch. >>>>> >>>>> Unless there are some clear answer from the hardware folks which specifically >>>>> say we cannot put a framebuffer there (and then why is the GOP doing it?) >>>>> then I believe that the best way forward here is to get various people to >>>>> test with this patch and the best way to do that is probably to put it >>>>> in next. Note I deliberately did not add a Cc stable. >>>> >>>> To elaborate on this, the excluding of the first 4k of the stolen memory >>>> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, >>>> which in turn causes intel_find_initial_plane_obj() to call >>>> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily >>>> completely turns off the display which leads to a very ugly flickering >>>> of the display at boot (as well as replacing the vendor logo with a >>>> black screen). >>>> >>>> I think we can all agree that this behavior is undesirable and even a >>>> regression in behavior caused by the fix to exclude the first 4k. >>>> >>>> Chris, if my patch is not an acceptable way to fix this, then how do you >>>> suggest that we fix this? >>>> >>>> Digging a bit deeper I found this: >>>> >>>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf >>>> >>>> Which says: >>>> >>>> "WaSkipStolenMemoryFirstPage: >>>> >>>> WA to skip the first page of stolen >>>> memory due to sporadic HW write on *CS Idle" >>>> >>>> And also about FBC: >>>> >>>> "First line of FBC getting corrupted when FBC >>>> compressed frame buffer offset is programmed to >>>> zero. Command streamers are doing flush writes to >>>> base of stolen. >>>> WA: New restriction to program FBC compressed >>>> frame buffer offset to at least 4KB." >>>> >>>> So using the first 4kB for the *framebuffer* as done by the GOP will >>>> not cause any major problems (freezes, hangs, etc.), and commit >>>> d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>> everything >= gen8") was correct in deducing that the problem was >>>> likely that some *vital* information was being stored i the first 4k >>>> and that go overwritten. >>>> >>>> But the contents of the (first lines of) the framebuffer may become >>>> corrupted once we actually start using the command-streamers, which >>>> is still very much not wanted. >>>> >>>> In practice Xorg or Wayland will likely have setup another framebuffer >>>> by the time the command-streamers will start to get used. >>>> >>>> Alternatively we could start with inheriting the BIOS framebuffer >>>> (as my patch allows) so that we don't get the flicker and then soon >>>> afterwards atomically transit to a new framebuffer (which should >>>> contain a copy of the BIOS fb contents) at a different location. >>> >>> What I suggested long ago was to copy just the first page and adjust the >>> sg list. But I'm not sure if our stolen gem code would be happy with an >>> sg list with two entries instead of one. >> >> But that would still require an atomic-modeset to install the new sg-list, >> right? > > Perhaps not. Not sure if the pte update would be atomic enough to just > change it underneath the display engine without ill effects, and then > do the equivalent of a page flip to invalidate the TLBs. > >> Then we might just as well simply alloc a new fb and copy the >> contents over, or are you worried that with say a 4k fb that takes too >> much time? FWIW I can see how the single memcpy this involves will take >> some time, but I don't take it will take so long as to be a problem. > > Mainly just a question of keeping it in stolen. Ah I see. > Assuming we want to keep > things in stolen, which is a matter of some debate as FBC needs stolen > and people might not be happy if it's all taken up by fbdev. > >> >> Anyways I could use some help with implementing either solution as I'm >> not familiar with the involved parts of the code. I will happily test >> a patch for this. Keep in mind that for this to work my original patch >> will also be necessary so that the initial takeover of the firmware >> fb will work. > > I guess the trickiest part would be getting both the old and new > location of the page mapped in the ggtt at the same time. Sadly you're > not allowed to access stolen directly. So I suppose this part would > involve some fairly low level frobbing of the ggtt ptes and a > manual ioremap() of the matching ranges of the aperture. Hmm, you're talking about what needs to be done to copy the contents here, right? I have a feeling we really should just try only my patch first, as mentioned before the worst thing which can happen is some corruption of the first lines of the display, which I agree is not good, but also not the end of the world. Regards, Hans
Hi, On 05-04-18 09:14, Daniel Vetter wrote: > On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote: >> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >> skipping the first 4k by passing 4096 as start of the address range passed >> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and >> reserve the firmware framebuffer so that we can inherit it would always >> fail, as the firmware framebuffer starts at address 0. >> >> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >> everything >= gen8") says in its commit message: "This is confirmed to fix >> Skylake screen flickering issues (probably caused by the fact that we >> initialized a ring in the first page of stolen, but I didn't 100% confirm >> this theory)." >> >> Which suggests that it is safe to use the first page for a linear >> framebuffer as the firmware is doing. >> >> This commit always passes 0 as start to drm_mm_init() and works around >> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >> by insuring the start address passed by to drm_mm_insert_node_in_range() >> is always 4k or more. All entry points to i915_gem_stolen.c go through >> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >> objects such as ring-buffers will not be allocated in the first 4k. >> >> The one exception is i915_gem_object_create_stolen_for_preallocated() >> which directly calls drm_mm_reserve_node() which now will be able to >> use the first 4k. >> >> This fixes the i915 driver no longer being able to inherit the firmware >> framebuffer on gen8+, which fixes the video output changing from the >> vendor logo to a black screen as soon as the i915 driver is loaded >> (on systems without fbcon). >> >> Signed-off-by: Hans de Goede <hdegoede@redhat.com> > > I think this is worth a shot. The only explanation I can think of why the > GOP could get away with this and still follow the w/a is if it doesn't > have a 1:1 mapping between GGTT and stolen. My guess is that the GOP does not care about the w/a because the w/a states that the command-streamers sometimes do a spurious flush (write) to the first page, and the command-streamers are not used until much later, so the GOP is probably ignoring the w/a all together. At least that is my theory. > Atm we hardcode that > assumption in intel_alloc_initial_plane_obj by passing the base_aligned as > both the stolen_offset and the gtt_offset (but it's only the gtt_offset > really). And since we're not re-writing the ptes it's not noticeable. > > I think to decide whether this is the right approach we should > double-check whether that 1:1 assumption really holds true: Either read > back the ggtt ptes and check their addresses (but iirc on some platforms > their write-only, readback doesn't work), or we also rewrite the ptes > again for preallocated stuff, like when binding a normal object into the > gtt. If either of these approaches confirms that those affected gen8+ > machines still use the 1:1 mapping, then I'm happy to put my r-b on this > patch. If not, well then we at least know what to fix: We need to read out > the real stolen_offset, instead of making assumptions. I'm happy to give this a try on one or more affected machines, I can e.g. try this on both a skylake desktop and a cherry-trail based tablet. But I'm clueless about the whole PTE stuff, so I'm going to need someone to provide me with a patch following either approach. If readback of the PTEs works on skylake / cherrytrail I guess that would be the best way. Note to test this I'm currently booting the kernel with the machine's UEFI vendor logo still being displayed when efifb kicks in. I've added: "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to VC0 / VC1, so that the framebuffer contents stays intact, combined with the patch we are discussing now, this makes the vendor logo stay on the screen all the way till GDM loads, which looks rather nice actually :) And on shutdown we fall back to the original framebuffer and the vendor logo is back again. I cannot see any corruption in the display at either boot or shutdown time. Regards, Hans > > Cheers, Daniel >> --- >> drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++--------- >> 1 file changed, 6 insertions(+), 9 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c >> index af915d041281..ad949cc30928 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c >> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c >> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv, >> if (!drm_mm_initialized(&dev_priv->mm.stolen)) >> return -ENODEV; >> >> + /* WaSkipStolenMemoryFirstPage:bdw+ */ >> + if (INTEL_GEN(dev_priv) >= 8 && start < 4096) >> + start = 4096; >> + >> mutex_lock(&dev_priv->mm.stolen_lock); >> ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node, >> size, alignment, 0, >> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv) >> { >> resource_size_t reserved_base, stolen_top; >> resource_size_t reserved_total, reserved_size; >> - resource_size_t stolen_usable_start; >> >> mutex_init(&dev_priv->mm.stolen_lock); >> >> @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv) >> (u64)resource_size(&dev_priv->dsm) >> 10, >> ((u64)resource_size(&dev_priv->dsm) - reserved_total) >> 10); >> >> - stolen_usable_start = 0; >> - /* WaSkipStolenMemoryFirstPage:bdw+ */ >> - if (INTEL_GEN(dev_priv) >= 8) >> - stolen_usable_start = 4096; >> - >> dev_priv->stolen_usable_size = >> - resource_size(&dev_priv->dsm) - reserved_total - stolen_usable_start; >> + resource_size(&dev_priv->dsm) - reserved_total; >> >> /* Basic memrange allocator for stolen space. */ >> - drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start, >> - dev_priv->stolen_usable_size); >> + drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->stolen_usable_size); >> >> return 0; >> } >> -- >> 2.17.0.rc2 >> >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >
On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote: > Hi, > > > On 05-04-18 09:14, Daniel Vetter wrote: >> >> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote: >>> >>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>> skipping the first 4k by passing 4096 as start of the address range >>> passed >>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try >>> and >>> reserve the firmware framebuffer so that we can inherit it would always >>> fail, as the firmware framebuffer starts at address 0. >>> >>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>> everything >= gen8") says in its commit message: "This is confirmed to >>> fix >>> Skylake screen flickering issues (probably caused by the fact that we >>> initialized a ring in the first page of stolen, but I didn't 100% confirm >>> this theory)." >>> >>> Which suggests that it is safe to use the first page for a linear >>> framebuffer as the firmware is doing. >>> >>> This commit always passes 0 as start to drm_mm_init() and works around >>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>> by insuring the start address passed by to drm_mm_insert_node_in_range() >>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>> objects such as ring-buffers will not be allocated in the first 4k. >>> >>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>> which directly calls drm_mm_reserve_node() which now will be able to >>> use the first 4k. >>> >>> This fixes the i915 driver no longer being able to inherit the firmware >>> framebuffer on gen8+, which fixes the video output changing from the >>> vendor logo to a black screen as soon as the i915 driver is loaded >>> (on systems without fbcon). >>> >>> Signed-off-by: Hans de Goede <hdegoede@redhat.com> >> >> >> I think this is worth a shot. The only explanation I can think of why the >> GOP could get away with this and still follow the w/a is if it doesn't >> have a 1:1 mapping between GGTT and stolen. > > > My guess is that the GOP does not care about the w/a because the w/a > states that the command-streamers sometimes do a spurious flush (write) > to the first page, and the command-streamers are not used until much > later, so the GOP is probably ignoring the w/a all together. > > At least that is my theory. Atm we do not know whether the GOP actually implements the wa or not. That it doesn't care is just a conjecture, but we have no proof. On previous platforms the 1:1 mapping did hold, but there's no fundamental reason why it sitll does. >> Atm we hardcode that >> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as >> both the stolen_offset and the gtt_offset (but it's only the gtt_offset >> really). And since we're not re-writing the ptes it's not noticeable. >> >> I think to decide whether this is the right approach we should >> double-check whether that 1:1 assumption really holds true: Either read >> back the ggtt ptes and check their addresses (but iirc on some platforms >> their write-only, readback doesn't work), or we also rewrite the ptes >> again for preallocated stuff, like when binding a normal object into the >> gtt. If either of these approaches confirms that those affected gen8+ >> machines still use the 1:1 mapping, then I'm happy to put my r-b on this >> patch. If not, well then we at least know what to fix: We need to read out >> the real stolen_offset, instead of making assumptions. > > > I'm happy to give this a try on one or more affected machines, I can e.g. > try this on both a skylake desktop and a cherry-trail based tablet. > > But I'm clueless about the whole PTE stuff, so I'm going to need someone > to provide me with a patch following either approach. If readback of the > PTEs works on skylake / cherrytrail I guess that would be the best way. > > Note to test this I'm currently booting the kernel with the machine's > UEFI vendor logo still being displayed when efifb kicks in. I've added: > "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to > VC0 / VC1, so that the framebuffer contents stays intact, combined with > the patch we are discussing now, this makes the vendor logo stay on > the screen all the way till GDM loads, which looks rather nice actually :) > > And on shutdown we fall back to the original framebuffer and the vendor > logo is back again. I cannot see any corruption in the display at either > boot or shutdown time. It shouldn't matter whether efifb takes over or not, it's still the GOP's framebuffer we're taking over. Different content for sure, logo is gone, but we only care about which pages we're using. Wrt the code: - (Re)binding happens by calling i915_vma_bind, if you put a call to that at the end of the stolen_preallocated functions you should get that. Ofc needs the logo still there so you can see it jump/get mangled. - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page or gen8_ggtt_insert_entries (which takes the vma). Since we only care about the first entry a readq(dev_priv->ggtt->gsm) is all we really need. If it's all 0 then it's not working (since at least _PAGE_PRESENT should be set, plus the physical address of a page in stolen memory). I can do some patches for each of those, but a bit swamped. Also no gen8 handy for testing I think to make sure it works, so would take a few weeks probably. Cheers, Daniel > > >> >> Cheers, Daniel >>> >>> --- >>> drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++--------- >>> 1 file changed, 6 insertions(+), 9 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c >>> b/drivers/gpu/drm/i915/i915_gem_stolen.c >>> index af915d041281..ad949cc30928 100644 >>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c >>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c >>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct >>> drm_i915_private *dev_priv, >>> if (!drm_mm_initialized(&dev_priv->mm.stolen)) >>> return -ENODEV; >>> + /* WaSkipStolenMemoryFirstPage:bdw+ */ >>> + if (INTEL_GEN(dev_priv) >= 8 && start < 4096) >>> + start = 4096; >>> + >>> mutex_lock(&dev_priv->mm.stolen_lock); >>> ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node, >>> size, alignment, 0, >>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private >>> *dev_priv) >>> { >>> resource_size_t reserved_base, stolen_top; >>> resource_size_t reserved_total, reserved_size; >>> - resource_size_t stolen_usable_start; >>> mutex_init(&dev_priv->mm.stolen_lock); >>> @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private >>> *dev_priv) >>> (u64)resource_size(&dev_priv->dsm) >> 10, >>> ((u64)resource_size(&dev_priv->dsm) - >>> reserved_total) >> 10); >>> - stolen_usable_start = 0; >>> - /* WaSkipStolenMemoryFirstPage:bdw+ */ >>> - if (INTEL_GEN(dev_priv) >= 8) >>> - stolen_usable_start = 4096; >>> - >>> dev_priv->stolen_usable_size = >>> - resource_size(&dev_priv->dsm) - reserved_total - >>> stolen_usable_start; >>> + resource_size(&dev_priv->dsm) - reserved_total; >>> /* Basic memrange allocator for stolen space. */ >>> - drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start, >>> - dev_priv->stolen_usable_size); >>> + drm_mm_init(&dev_priv->mm.stolen, 0, >>> dev_priv->stolen_usable_size); >>> return 0; >>> } >>> -- >>> 2.17.0.rc2 >>> >>> _______________________________________________ >>> Intel-gfx mailing list >>> Intel-gfx@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >> >> >
Hi, On 05-04-18 15:26, Daniel Vetter wrote: > On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote: >> Hi, >> >> >> On 05-04-18 09:14, Daniel Vetter wrote: >>> >>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote: >>>> >>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>> skipping the first 4k by passing 4096 as start of the address range >>>> passed >>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try >>>> and >>>> reserve the firmware framebuffer so that we can inherit it would always >>>> fail, as the firmware framebuffer starts at address 0. >>>> >>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>> everything >= gen8") says in its commit message: "This is confirmed to >>>> fix >>>> Skylake screen flickering issues (probably caused by the fact that we >>>> initialized a ring in the first page of stolen, but I didn't 100% confirm >>>> this theory)." >>>> >>>> Which suggests that it is safe to use the first page for a linear >>>> framebuffer as the firmware is doing. >>>> >>>> This commit always passes 0 as start to drm_mm_init() and works around >>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>> by insuring the start address passed by to drm_mm_insert_node_in_range() >>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>> objects such as ring-buffers will not be allocated in the first 4k. >>>> >>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>> which directly calls drm_mm_reserve_node() which now will be able to >>>> use the first 4k. >>>> >>>> This fixes the i915 driver no longer being able to inherit the firmware >>>> framebuffer on gen8+, which fixes the video output changing from the >>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>> (on systems without fbcon). >>>> >>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com> >>> >>> >>> I think this is worth a shot. The only explanation I can think of why the >>> GOP could get away with this and still follow the w/a is if it doesn't >>> have a 1:1 mapping between GGTT and stolen. >> >> >> My guess is that the GOP does not care about the w/a because the w/a >> states that the command-streamers sometimes do a spurious flush (write) >> to the first page, and the command-streamers are not used until much >> later, so the GOP is probably ignoring the w/a all together. >> >> At least that is my theory. > > Atm we do not know whether the GOP actually implements the wa or not. > That it doesn't care is just a conjecture, but we have no proof. On > previous platforms the 1:1 mapping did hold, but there's no > fundamental reason why it sitll does. > >>> Atm we hardcode that >>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as >>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset >>> really). And since we're not re-writing the ptes it's not noticeable. >>> >>> I think to decide whether this is the right approach we should >>> double-check whether that 1:1 assumption really holds true: Either read >>> back the ggtt ptes and check their addresses (but iirc on some platforms >>> their write-only, readback doesn't work), or we also rewrite the ptes >>> again for preallocated stuff, like when binding a normal object into the >>> gtt. If either of these approaches confirms that those affected gen8+ >>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this >>> patch. If not, well then we at least know what to fix: We need to read out >>> the real stolen_offset, instead of making assumptions. >> >> >> I'm happy to give this a try on one or more affected machines, I can e.g. >> try this on both a skylake desktop and a cherry-trail based tablet. >> >> But I'm clueless about the whole PTE stuff, so I'm going to need someone >> to provide me with a patch following either approach. If readback of the >> PTEs works on skylake / cherrytrail I guess that would be the best way. >> >> Note to test this I'm currently booting the kernel with the machine's >> UEFI vendor logo still being displayed when efifb kicks in. I've added: >> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to >> VC0 / VC1, so that the framebuffer contents stays intact, combined with >> the patch we are discussing now, this makes the vendor logo stay on >> the screen all the way till GDM loads, which looks rather nice actually :) >> >> And on shutdown we fall back to the original framebuffer and the vendor >> logo is back again. I cannot see any corruption in the display at either >> boot or shutdown time. > > It shouldn't matter whether efifb takes over or not, it's still the > GOP's framebuffer we're taking over. Different content for sure, logo > is gone, but we only care about which pages we're using. Right, I only mentioned this to explain that I'm not seeing any (visible) corruption. > Wrt the code: > - (Re)binding happens by calling i915_vma_bind, if you put a call to > that at the end of the stolen_preallocated functions you should get > that. Ofc needs the logo still there so you can see it jump/get > mangled. > - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page > or gen8_ggtt_insert_entries (which takes the vma). Since we only care > about the first entry a readq(dev_priv->ggtt->gsm) is all we really > need. If it's all 0 then it's not working (since at least > _PAGE_PRESENT should be set, plus the physical address of a page in > stolen memory). Ok, I will try to give this a shot, probably not before coming Monday though. > I can do some patches for each of those, but a bit swamped. Also no > gen8 handy for testing I think to make sure it works, so would take a > few weeks probably. I'll give this a shot myself first and attach the patch when I reply with the testing results, so that you can verify that the patch works as expected. Regards, Hans > > Cheers, Daniel >> >> >>> >>> Cheers, Daniel >>>> >>>> --- >>>> drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++--------- >>>> 1 file changed, 6 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c >>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c >>>> index af915d041281..ad949cc30928 100644 >>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c >>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c >>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct >>>> drm_i915_private *dev_priv, >>>> if (!drm_mm_initialized(&dev_priv->mm.stolen)) >>>> return -ENODEV; >>>> + /* WaSkipStolenMemoryFirstPage:bdw+ */ >>>> + if (INTEL_GEN(dev_priv) >= 8 && start < 4096) >>>> + start = 4096; >>>> + >>>> mutex_lock(&dev_priv->mm.stolen_lock); >>>> ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node, >>>> size, alignment, 0, >>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private >>>> *dev_priv) >>>> { >>>> resource_size_t reserved_base, stolen_top; >>>> resource_size_t reserved_total, reserved_size; >>>> - resource_size_t stolen_usable_start; >>>> mutex_init(&dev_priv->mm.stolen_lock); >>>> @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private >>>> *dev_priv) >>>> (u64)resource_size(&dev_priv->dsm) >> 10, >>>> ((u64)resource_size(&dev_priv->dsm) - >>>> reserved_total) >> 10); >>>> - stolen_usable_start = 0; >>>> - /* WaSkipStolenMemoryFirstPage:bdw+ */ >>>> - if (INTEL_GEN(dev_priv) >= 8) >>>> - stolen_usable_start = 4096; >>>> - >>>> dev_priv->stolen_usable_size = >>>> - resource_size(&dev_priv->dsm) - reserved_total - >>>> stolen_usable_start; >>>> + resource_size(&dev_priv->dsm) - reserved_total; >>>> /* Basic memrange allocator for stolen space. */ >>>> - drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start, >>>> - dev_priv->stolen_usable_size); >>>> + drm_mm_init(&dev_priv->mm.stolen, 0, >>>> dev_priv->stolen_usable_size); >>>> return 0; >>>> } >>>> -- >>>> 2.17.0.rc2 >>>> >>>> _______________________________________________ >>>> Intel-gfx mailing list >>>> Intel-gfx@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >>> >>> >> > > >
On Thu, Apr 5, 2018 at 3:31 PM, Hans de Goede <hdegoede@redhat.com> wrote: > Hi, > > > On 05-04-18 15:26, Daniel Vetter wrote: >> >> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote: >>> >>> Hi, >>> >>> >>> On 05-04-18 09:14, Daniel Vetter wrote: >>>> >>>> >>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote: >>>>> >>>>> >>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>>> skipping the first 4k by passing 4096 as start of the address range >>>>> passed >>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try >>>>> and >>>>> reserve the firmware framebuffer so that we can inherit it would always >>>>> fail, as the firmware framebuffer starts at address 0. >>>>> >>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>>> everything >= gen8") says in its commit message: "This is confirmed to >>>>> fix >>>>> Skylake screen flickering issues (probably caused by the fact that we >>>>> initialized a ring in the first page of stolen, but I didn't 100% >>>>> confirm >>>>> this theory)." >>>>> >>>>> Which suggests that it is safe to use the first page for a linear >>>>> framebuffer as the firmware is doing. >>>>> >>>>> This commit always passes 0 as start to drm_mm_init() and works around >>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>>> by insuring the start address passed by to >>>>> drm_mm_insert_node_in_range() >>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>>> objects such as ring-buffers will not be allocated in the first 4k. >>>>> >>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>>> which directly calls drm_mm_reserve_node() which now will be able to >>>>> use the first 4k. >>>>> >>>>> This fixes the i915 driver no longer being able to inherit the firmware >>>>> framebuffer on gen8+, which fixes the video output changing from the >>>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>>> (on systems without fbcon). >>>>> >>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com> >>>> >>>> >>>> >>>> I think this is worth a shot. The only explanation I can think of why >>>> the >>>> GOP could get away with this and still follow the w/a is if it doesn't >>>> have a 1:1 mapping between GGTT and stolen. >>> >>> >>> >>> My guess is that the GOP does not care about the w/a because the w/a >>> states that the command-streamers sometimes do a spurious flush (write) >>> to the first page, and the command-streamers are not used until much >>> later, so the GOP is probably ignoring the w/a all together. >>> >>> At least that is my theory. >> >> >> Atm we do not know whether the GOP actually implements the wa or not. >> That it doesn't care is just a conjecture, but we have no proof. On >> previous platforms the 1:1 mapping did hold, but there's no >> fundamental reason why it sitll does. >> >>>> Atm we hardcode that >>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned >>>> as >>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset >>>> really). And since we're not re-writing the ptes it's not noticeable. >>>> >>>> I think to decide whether this is the right approach we should >>>> double-check whether that 1:1 assumption really holds true: Either read >>>> back the ggtt ptes and check their addresses (but iirc on some platforms >>>> their write-only, readback doesn't work), or we also rewrite the ptes >>>> again for preallocated stuff, like when binding a normal object into the >>>> gtt. If either of these approaches confirms that those affected gen8+ >>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this >>>> patch. If not, well then we at least know what to fix: We need to read >>>> out >>>> the real stolen_offset, instead of making assumptions. >>> >>> >>> >>> I'm happy to give this a try on one or more affected machines, I can e.g. >>> try this on both a skylake desktop and a cherry-trail based tablet. >>> >>> But I'm clueless about the whole PTE stuff, so I'm going to need someone >>> to provide me with a patch following either approach. If readback of the >>> PTEs works on skylake / cherrytrail I guess that would be the best way. >>> >>> Note to test this I'm currently booting the kernel with the machine's >>> UEFI vendor logo still being displayed when efifb kicks in. I've added: >>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to >>> VC0 / VC1, so that the framebuffer contents stays intact, combined with >>> the patch we are discussing now, this makes the vendor logo stay on >>> the screen all the way till GDM loads, which looks rather nice actually >>> :) >>> >>> And on shutdown we fall back to the original framebuffer and the vendor >>> logo is back again. I cannot see any corruption in the display at either >>> boot or shutdown time. >> >> >> It shouldn't matter whether efifb takes over or not, it's still the >> GOP's framebuffer we're taking over. Different content for sure, logo >> is gone, but we only care about which pages we're using. > > > Right, I only mentioned this to explain that I'm not seeing any > (visible) corruption. > >> Wrt the code: >> - (Re)binding happens by calling i915_vma_bind, if you put a call to >> that at the end of the stolen_preallocated functions you should get >> that. Ofc needs the logo still there so you can see it jump/get >> mangled. >> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page >> or gen8_ggtt_insert_entries (which takes the vma). Since we only care >> about the first entry a readq(dev_priv->ggtt->gsm) is all we really >> need. If it's all 0 then it's not working (since at least >> _PAGE_PRESENT should be set, plus the physical address of a page in >> stolen memory). > > > Ok, I will try to give this a shot, probably not before coming > Monday though. For the 2nd expirement we ofc also need the base physical address of stolen. /proc/iomem should dump that already, but if it's not there pls add a printk for that too. It's dev_priv->dsm.start, see i915_pages_create_for_stolen. >> I can do some patches for each of those, but a bit swamped. Also no >> gen8 handy for testing I think to make sure it works, so would take a >> few weeks probably. > > > I'll give this a shot myself first and attach the patch when I reply > with the testing results, so that you can verify that the patch works > as expected. Thanks a lot. I'll be travelling a bit later on next week, but I'll try to take a look. Cheers, Daniel > > Regards, > > Hans > > > >> >> Cheers, Daniel >>> >>> >>> >>>> >>>> Cheers, Daniel >>>>> >>>>> >>>>> --- >>>>> drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++--------- >>>>> 1 file changed, 6 insertions(+), 9 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> index af915d041281..ad949cc30928 100644 >>>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c >>>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct >>>>> drm_i915_private *dev_priv, >>>>> if (!drm_mm_initialized(&dev_priv->mm.stolen)) >>>>> return -ENODEV; >>>>> + /* WaSkipStolenMemoryFirstPage:bdw+ */ >>>>> + if (INTEL_GEN(dev_priv) >= 8 && start < 4096) >>>>> + start = 4096; >>>>> + >>>>> mutex_lock(&dev_priv->mm.stolen_lock); >>>>> ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node, >>>>> size, alignment, 0, >>>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private >>>>> *dev_priv) >>>>> { >>>>> resource_size_t reserved_base, stolen_top; >>>>> resource_size_t reserved_total, reserved_size; >>>>> - resource_size_t stolen_usable_start; >>>>> mutex_init(&dev_priv->mm.stolen_lock); >>>>> @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct >>>>> drm_i915_private >>>>> *dev_priv) >>>>> (u64)resource_size(&dev_priv->dsm) >> 10, >>>>> ((u64)resource_size(&dev_priv->dsm) - >>>>> reserved_total) >> 10); >>>>> - stolen_usable_start = 0; >>>>> - /* WaSkipStolenMemoryFirstPage:bdw+ */ >>>>> - if (INTEL_GEN(dev_priv) >= 8) >>>>> - stolen_usable_start = 4096; >>>>> - >>>>> dev_priv->stolen_usable_size = >>>>> - resource_size(&dev_priv->dsm) - reserved_total - >>>>> stolen_usable_start; >>>>> + resource_size(&dev_priv->dsm) - reserved_total; >>>>> /* Basic memrange allocator for stolen space. */ >>>>> - drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start, >>>>> - dev_priv->stolen_usable_size); >>>>> + drm_mm_init(&dev_priv->mm.stolen, 0, >>>>> dev_priv->stolen_usable_size); >>>>> return 0; >>>>> } >>>>> -- >>>>> 2.17.0.rc2 >>>>> >>>>> _______________________________________________ >>>>> Intel-gfx mailing list >>>>> Intel-gfx@lists.freedesktop.org >>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >>>> >>>> >>>> >>> >> >> >> >
On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote: > Hi, > > On 04-04-18 22:49, Ville Syrjälä wrote: > > On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote: > >> Hi, > >> > >> On 04-04-18 17:50, Ville Syrjälä wrote: > >>> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: > >>>> Hi, > >>>> > >>>> On 30-03-18 15:25, Hans de Goede wrote: > >>>>> Hi, > >>>>> > >>>>> On 30-03-18 14:44, Chris Wilson wrote: > >>>>>> Quoting Hans de Goede (2018-03-30 13:37:40) > >>>>>>> Hi, > >>>>>>> > >>>>>>> On 30-03-18 14:30, Chris Wilson wrote: > >>>>>>>> Quoting Hans de Goede (2018-03-30 13:27:15) > >>>>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was > >>>>>>>>> skipping the first 4k by passing 4096 as start of the address range passed > >>>>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and > >>>>>>>>> reserve the firmware framebuffer so that we can inherit it would always > >>>>>>>>> fail, as the firmware framebuffer starts at address 0. > >>>>>>>>> > >>>>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on > >>>>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix > >>>>>>>>> Skylake screen flickering issues (probably caused by the fact that we > >>>>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm > >>>>>>>>> this theory)." > >>>>>>>>> > >>>>>>>>> Which suggests that it is safe to use the first page for a linear > >>>>>>>>> framebuffer as the firmware is doing. > >>>>>>>>> > >>>>>>>>> This commit always passes 0 as start to drm_mm_init() and works around > >>>>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() > >>>>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() > >>>>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through > >>>>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated > >>>>>>>>> objects such as ring-buffers will not be allocated in the first 4k. > >>>>>>>>> > >>>>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() > >>>>>>>>> which directly calls drm_mm_reserve_node() which now will be able to > >>>>>>>>> use the first 4k. > >>>>>>>>> > >>>>>>>>> This fixes the i915 driver no longer being able to inherit the firmware > >>>>>>>>> framebuffer on gen8+, which fixes the video output changing from the > >>>>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded > >>>>>>>>> (on systems without fbcon). > >>>>>>>> > >>>>>>>> We've been told by the HW guys not to use the first page. (That's my > >>>>>>>> understanding from every time this gets questioned.) > >>>>>>> > >>>>>>> Yet the GOP is happily using the first page. I think we may need to make > >>>>>>> a difference here between the GPU not using the first page and the > >>>>>>> display engine/pipeline not using the first page. Note that my patch > >>>>>>> only influences the inheriting of the initial framebuffer as allocated > >>>>>>> by the GOP. It does not influence any other allocations from the > >>>>>>> reserved range, those will still all avoid the first page. > >>>>>>> > >>>>>>> Without this patch fastboot / flickerfree support is essentially broken > >>>>>>> on any gen8+ hardware given that one of the goals of atomic is to be > >>>>>>> able to do flickerfree transitions I think that this warrants a closer > >>>>>>> look then just simply saying never use the first page. > >>>>>> > >>>>>> The concern is what else (i.e. nothing that we allocated ourselves) that > >>>>>> may be in the first page... > >>>>> > >>>>> Given that the GOP has put its framebuffer there at least at boot there > >>>>> is nothing there, otherwise it would show up on the display. > >>>>> > >>>>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c > >>>>> and AFAIK that code is there because this inheriting the BIOS fb is > >>>>> deemed an important feature. So I'm not happy at all with the handwavy > >>>>> best to not touch it answer I'm getting to this patch. > >>>>> > >>>>> Unless there are some clear answer from the hardware folks which specifically > >>>>> say we cannot put a framebuffer there (and then why is the GOP doing it?) > >>>>> then I believe that the best way forward here is to get various people to > >>>>> test with this patch and the best way to do that is probably to put it > >>>>> in next. Note I deliberately did not add a Cc stable. > >>>> > >>>> To elaborate on this, the excluding of the first 4k of the stolen memory > >>>> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, > >>>> which in turn causes intel_find_initial_plane_obj() to call > >>>> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily > >>>> completely turns off the display which leads to a very ugly flickering > >>>> of the display at boot (as well as replacing the vendor logo with a > >>>> black screen). > >>>> > >>>> I think we can all agree that this behavior is undesirable and even a > >>>> regression in behavior caused by the fix to exclude the first 4k. > >>>> > >>>> Chris, if my patch is not an acceptable way to fix this, then how do you > >>>> suggest that we fix this? > >>>> > >>>> Digging a bit deeper I found this: > >>>> > >>>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf > >>>> > >>>> Which says: > >>>> > >>>> "WaSkipStolenMemoryFirstPage: > >>>> > >>>> WA to skip the first page of stolen > >>>> memory due to sporadic HW write on *CS Idle" > >>>> > >>>> And also about FBC: > >>>> > >>>> "First line of FBC getting corrupted when FBC > >>>> compressed frame buffer offset is programmed to > >>>> zero. Command streamers are doing flush writes to > >>>> base of stolen. > >>>> WA: New restriction to program FBC compressed > >>>> frame buffer offset to at least 4KB." > >>>> > >>>> So using the first 4kB for the *framebuffer* as done by the GOP will > >>>> not cause any major problems (freezes, hangs, etc.), and commit > >>>> d43537610470 ("drm/i915: skip the first 4k of stolen memory on > >>>> everything >= gen8") was correct in deducing that the problem was > >>>> likely that some *vital* information was being stored i the first 4k > >>>> and that go overwritten. > >>>> > >>>> But the contents of the (first lines of) the framebuffer may become > >>>> corrupted once we actually start using the command-streamers, which > >>>> is still very much not wanted. > >>>> > >>>> In practice Xorg or Wayland will likely have setup another framebuffer > >>>> by the time the command-streamers will start to get used. > >>>> > >>>> Alternatively we could start with inheriting the BIOS framebuffer > >>>> (as my patch allows) so that we don't get the flicker and then soon > >>>> afterwards atomically transit to a new framebuffer (which should > >>>> contain a copy of the BIOS fb contents) at a different location. > >>> > >>> What I suggested long ago was to copy just the first page and adjust the > >>> sg list. But I'm not sure if our stolen gem code would be happy with an > >>> sg list with two entries instead of one. > >> > >> But that would still require an atomic-modeset to install the new sg-list, > >> right? > > > > Perhaps not. Not sure if the pte update would be atomic enough to just > > change it underneath the display engine without ill effects, and then > > do the equivalent of a page flip to invalidate the TLBs. > > > >> Then we might just as well simply alloc a new fb and copy the > >> contents over, or are you worried that with say a 4k fb that takes too > >> much time? FWIW I can see how the single memcpy this involves will take > >> some time, but I don't take it will take so long as to be a problem. > > > > Mainly just a question of keeping it in stolen. > > Ah I see. > > > Assuming we want to keep > > things in stolen, which is a matter of some debate as FBC needs stolen > > and people might not be happy if it's all taken up by fbdev. > > > >> > >> Anyways I could use some help with implementing either solution as I'm > >> not familiar with the involved parts of the code. I will happily test > >> a patch for this. Keep in mind that for this to work my original patch > >> will also be necessary so that the initial takeover of the firmware > >> fb will work. > > > > I guess the trickiest part would be getting both the old and new > > location of the page mapped in the ggtt at the same time. Sadly you're > > not allowed to access stolen directly. So I suppose this part would > > involve some fairly low level frobbing of the ggtt ptes and a > > manual ioremap() of the matching ranges of the aperture. > > Hmm, you're talking about what needs to be done to copy the contents here, > right? Yeah. > I have a feeling we really should just try only my patch first, as > mentioned before the worst thing which can happen is some corruption > of the first lines of the display, which I agree is not good, but also > not the end of the world. One can hope :)
Hi, On 06-04-18 18:06, Ville Syrjälä wrote: > On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote: >> Hi, >> >> On 04-04-18 22:49, Ville Syrjälä wrote: >>> On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote: >>>> Hi, >>>> >>>> On 04-04-18 17:50, Ville Syrjälä wrote: >>>>> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: >>>>>> Hi, >>>>>> >>>>>> On 30-03-18 15:25, Hans de Goede wrote: >>>>>>> Hi, >>>>>>> >>>>>>> On 30-03-18 14:44, Chris Wilson wrote: >>>>>>>> Quoting Hans de Goede (2018-03-30 13:37:40) >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On 30-03-18 14:30, Chris Wilson wrote: >>>>>>>>>> Quoting Hans de Goede (2018-03-30 13:27:15) >>>>>>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was >>>>>>>>>>> skipping the first 4k by passing 4096 as start of the address range passed >>>>>>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and >>>>>>>>>>> reserve the firmware framebuffer so that we can inherit it would always >>>>>>>>>>> fail, as the firmware framebuffer starts at address 0. >>>>>>>>>>> >>>>>>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>>>>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix >>>>>>>>>>> Skylake screen flickering issues (probably caused by the fact that we >>>>>>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm >>>>>>>>>>> this theory)." >>>>>>>>>>> >>>>>>>>>>> Which suggests that it is safe to use the first page for a linear >>>>>>>>>>> framebuffer as the firmware is doing. >>>>>>>>>>> >>>>>>>>>>> This commit always passes 0 as start to drm_mm_init() and works around >>>>>>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() >>>>>>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range() >>>>>>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through >>>>>>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated >>>>>>>>>>> objects such as ring-buffers will not be allocated in the first 4k. >>>>>>>>>>> >>>>>>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated() >>>>>>>>>>> which directly calls drm_mm_reserve_node() which now will be able to >>>>>>>>>>> use the first 4k. >>>>>>>>>>> >>>>>>>>>>> This fixes the i915 driver no longer being able to inherit the firmware >>>>>>>>>>> framebuffer on gen8+, which fixes the video output changing from the >>>>>>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded >>>>>>>>>>> (on systems without fbcon). >>>>>>>>>> >>>>>>>>>> We've been told by the HW guys not to use the first page. (That's my >>>>>>>>>> understanding from every time this gets questioned.) >>>>>>>>> >>>>>>>>> Yet the GOP is happily using the first page. I think we may need to make >>>>>>>>> a difference here between the GPU not using the first page and the >>>>>>>>> display engine/pipeline not using the first page. Note that my patch >>>>>>>>> only influences the inheriting of the initial framebuffer as allocated >>>>>>>>> by the GOP. It does not influence any other allocations from the >>>>>>>>> reserved range, those will still all avoid the first page. >>>>>>>>> >>>>>>>>> Without this patch fastboot / flickerfree support is essentially broken >>>>>>>>> on any gen8+ hardware given that one of the goals of atomic is to be >>>>>>>>> able to do flickerfree transitions I think that this warrants a closer >>>>>>>>> look then just simply saying never use the first page. >>>>>>>> >>>>>>>> The concern is what else (i.e. nothing that we allocated ourselves) that >>>>>>>> may be in the first page... >>>>>>> >>>>>>> Given that the GOP has put its framebuffer there at least at boot there >>>>>>> is nothing there, otherwise it would show up on the display. >>>>>>> >>>>>>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c >>>>>>> and AFAIK that code is there because this inheriting the BIOS fb is >>>>>>> deemed an important feature. So I'm not happy at all with the handwavy >>>>>>> best to not touch it answer I'm getting to this patch. >>>>>>> >>>>>>> Unless there are some clear answer from the hardware folks which specifically >>>>>>> say we cannot put a framebuffer there (and then why is the GOP doing it?) >>>>>>> then I believe that the best way forward here is to get various people to >>>>>>> test with this patch and the best way to do that is probably to put it >>>>>>> in next. Note I deliberately did not add a Cc stable. >>>>>> >>>>>> To elaborate on this, the excluding of the first 4k of the stolen memory >>>>>> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, >>>>>> which in turn causes intel_find_initial_plane_obj() to call >>>>>> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily >>>>>> completely turns off the display which leads to a very ugly flickering >>>>>> of the display at boot (as well as replacing the vendor logo with a >>>>>> black screen). >>>>>> >>>>>> I think we can all agree that this behavior is undesirable and even a >>>>>> regression in behavior caused by the fix to exclude the first 4k. >>>>>> >>>>>> Chris, if my patch is not an acceptable way to fix this, then how do you >>>>>> suggest that we fix this? >>>>>> >>>>>> Digging a bit deeper I found this: >>>>>> >>>>>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf >>>>>> >>>>>> Which says: >>>>>> >>>>>> "WaSkipStolenMemoryFirstPage: >>>>>> >>>>>> WA to skip the first page of stolen >>>>>> memory due to sporadic HW write on *CS Idle" >>>>>> >>>>>> And also about FBC: >>>>>> >>>>>> "First line of FBC getting corrupted when FBC >>>>>> compressed frame buffer offset is programmed to >>>>>> zero. Command streamers are doing flush writes to >>>>>> base of stolen. >>>>>> WA: New restriction to program FBC compressed >>>>>> frame buffer offset to at least 4KB." >>>>>> >>>>>> So using the first 4kB for the *framebuffer* as done by the GOP will >>>>>> not cause any major problems (freezes, hangs, etc.), and commit >>>>>> d43537610470 ("drm/i915: skip the first 4k of stolen memory on >>>>>> everything >= gen8") was correct in deducing that the problem was >>>>>> likely that some *vital* information was being stored i the first 4k >>>>>> and that go overwritten. >>>>>> >>>>>> But the contents of the (first lines of) the framebuffer may become >>>>>> corrupted once we actually start using the command-streamers, which >>>>>> is still very much not wanted. >>>>>> >>>>>> In practice Xorg or Wayland will likely have setup another framebuffer >>>>>> by the time the command-streamers will start to get used. >>>>>> >>>>>> Alternatively we could start with inheriting the BIOS framebuffer >>>>>> (as my patch allows) so that we don't get the flicker and then soon >>>>>> afterwards atomically transit to a new framebuffer (which should >>>>>> contain a copy of the BIOS fb contents) at a different location. >>>>> >>>>> What I suggested long ago was to copy just the first page and adjust the >>>>> sg list. But I'm not sure if our stolen gem code would be happy with an >>>>> sg list with two entries instead of one. >>>> >>>> But that would still require an atomic-modeset to install the new sg-list, >>>> right? >>> >>> Perhaps not. Not sure if the pte update would be atomic enough to just >>> change it underneath the display engine without ill effects, and then >>> do the equivalent of a page flip to invalidate the TLBs. >>> >>>> Then we might just as well simply alloc a new fb and copy the >>>> contents over, or are you worried that with say a 4k fb that takes too >>>> much time? FWIW I can see how the single memcpy this involves will take >>>> some time, but I don't take it will take so long as to be a problem. >>> >>> Mainly just a question of keeping it in stolen. >> >> Ah I see. >> >>> Assuming we want to keep >>> things in stolen, which is a matter of some debate as FBC needs stolen >>> and people might not be happy if it's all taken up by fbdev. >>> >>>> >>>> Anyways I could use some help with implementing either solution as I'm >>>> not familiar with the involved parts of the code. I will happily test >>>> a patch for this. Keep in mind that for this to work my original patch >>>> will also be necessary so that the initial takeover of the firmware >>>> fb will work. >>> >>> I guess the trickiest part would be getting both the old and new >>> location of the page mapped in the ggtt at the same time. Sadly you're >>> not allowed to access stolen directly. So I suppose this part would >>> involve some fairly low level frobbing of the ggtt ptes and a >>> manual ioremap() of the matching ranges of the aperture. >> >> Hmm, you're talking about what needs to be done to copy the contents here, >> right? > > Yeah. So thinking more about this, for the old / BIOS framebuffer there already is a mapping setup for efifb use and for the new one we should also already set up a mapping when we create it, so I think we can really just do a memcpy here after creating a new framebuffer? Anyways I will run the tests Daniel asked me to run first. Regards, Hans
On Sat, Apr 07, 2018 at 11:34:57AM +0200, Hans de Goede wrote: > Hi, > > On 06-04-18 18:06, Ville Syrjälä wrote: > > On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote: > > > Hi, > > > > > > On 04-04-18 22:49, Ville Syrjälä wrote: > > > > On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote: > > > > > Hi, > > > > > > > > > > On 04-04-18 17:50, Ville Syrjälä wrote: > > > > > > On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote: > > > > > > > Hi, > > > > > > > > > > > > > > On 30-03-18 15:25, Hans de Goede wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > On 30-03-18 14:44, Chris Wilson wrote: > > > > > > > > > Quoting Hans de Goede (2018-03-30 13:37:40) > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > On 30-03-18 14:30, Chris Wilson wrote: > > > > > > > > > > > Quoting Hans de Goede (2018-03-30 13:27:15) > > > > > > > > > > > > Before this commit the WaSkipStolenMemoryFirstPage workaround code was > > > > > > > > > > > > skipping the first 4k by passing 4096 as start of the address range passed > > > > > > > > > > > > to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and > > > > > > > > > > > > reserve the firmware framebuffer so that we can inherit it would always > > > > > > > > > > > > fail, as the firmware framebuffer starts at address 0. > > > > > > > > > > > > > > > > > > > > > > > > Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on > > > > > > > > > > > > everything >= gen8") says in its commit message: "This is confirmed to fix > > > > > > > > > > > > Skylake screen flickering issues (probably caused by the fact that we > > > > > > > > > > > > initialized a ring in the first page of stolen, but I didn't 100% confirm > > > > > > > > > > > > this theory)." > > > > > > > > > > > > > > > > > > > > > > > > Which suggests that it is safe to use the first page for a linear > > > > > > > > > > > > framebuffer as the firmware is doing. > > > > > > > > > > > > > > > > > > > > > > > > This commit always passes 0 as start to drm_mm_init() and works around > > > > > > > > > > > > WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() > > > > > > > > > > > > by insuring the start address passed by to drm_mm_insert_node_in_range() > > > > > > > > > > > > is always 4k or more. All entry points to i915_gem_stolen.c go through > > > > > > > > > > > > i915_gem_stolen_insert_node_in_range(), so that any newly allocated > > > > > > > > > > > > objects such as ring-buffers will not be allocated in the first 4k. > > > > > > > > > > > > > > > > > > > > > > > > The one exception is i915_gem_object_create_stolen_for_preallocated() > > > > > > > > > > > > which directly calls drm_mm_reserve_node() which now will be able to > > > > > > > > > > > > use the first 4k. > > > > > > > > > > > > > > > > > > > > > > > > This fixes the i915 driver no longer being able to inherit the firmware > > > > > > > > > > > > framebuffer on gen8+, which fixes the video output changing from the > > > > > > > > > > > > vendor logo to a black screen as soon as the i915 driver is loaded > > > > > > > > > > > > (on systems without fbcon). > > > > > > > > > > > > > > > > > > > > > > We've been told by the HW guys not to use the first page. (That's my > > > > > > > > > > > understanding from every time this gets questioned.) > > > > > > > > > > > > > > > > > > > > Yet the GOP is happily using the first page. I think we may need to make > > > > > > > > > > a difference here between the GPU not using the first page and the > > > > > > > > > > display engine/pipeline not using the first page. Note that my patch > > > > > > > > > > only influences the inheriting of the initial framebuffer as allocated > > > > > > > > > > by the GOP. It does not influence any other allocations from the > > > > > > > > > > reserved range, those will still all avoid the first page. > > > > > > > > > > > > > > > > > > > > Without this patch fastboot / flickerfree support is essentially broken > > > > > > > > > > on any gen8+ hardware given that one of the goals of atomic is to be > > > > > > > > > > able to do flickerfree transitions I think that this warrants a closer > > > > > > > > > > look then just simply saying never use the first page. > > > > > > > > > > > > > > > > > > The concern is what else (i.e. nothing that we allocated ourselves) that > > > > > > > > > may be in the first page... > > > > > > > > > > > > > > > > Given that the GOP has put its framebuffer there at least at boot there > > > > > > > > is nothing there, otherwise it would show up on the display. > > > > > > > > > > > > > > > > We have a whole bunch of code to inherit the BIOS fb in intel_display.c > > > > > > > > and AFAIK that code is there because this inheriting the BIOS fb is > > > > > > > > deemed an important feature. So I'm not happy at all with the handwavy > > > > > > > > best to not touch it answer I'm getting to this patch. > > > > > > > > > > > > > > > > Unless there are some clear answer from the hardware folks which specifically > > > > > > > > say we cannot put a framebuffer there (and then why is the GOP doing it?) > > > > > > > > then I believe that the best way forward here is to get various people to > > > > > > > > test with this patch and the best way to do that is probably to put it > > > > > > > > in next. Note I deliberately did not add a Cc stable. > > > > > > > > > > > > > > To elaborate on this, the excluding of the first 4k of the stolen memory > > > > > > > region causes intel_alloc_initial_plane_obj() from intel_display.c to fail, > > > > > > > which in turn causes intel_find_initial_plane_obj() to call > > > > > > > intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily > > > > > > > completely turns off the display which leads to a very ugly flickering > > > > > > > of the display at boot (as well as replacing the vendor logo with a > > > > > > > black screen). > > > > > > > > > > > > > > I think we can all agree that this behavior is undesirable and even a > > > > > > > regression in behavior caused by the fix to exclude the first 4k. > > > > > > > > > > > > > > Chris, if my patch is not an acceptable way to fix this, then how do you > > > > > > > suggest that we fix this? > > > > > > > > > > > > > > Digging a bit deeper I found this: > > > > > > > > > > > > > > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf > > > > > > > > > > > > > > Which says: > > > > > > > > > > > > > > "WaSkipStolenMemoryFirstPage: > > > > > > > > > > > > > > WA to skip the first page of stolen > > > > > > > memory due to sporadic HW write on *CS Idle" > > > > > > > > > > > > > > And also about FBC: > > > > > > > > > > > > > > "First line of FBC getting corrupted when FBC > > > > > > > compressed frame buffer offset is programmed to > > > > > > > zero. Command streamers are doing flush writes to > > > > > > > base of stolen. > > > > > > > WA: New restriction to program FBC compressed > > > > > > > frame buffer offset to at least 4KB." > > > > > > > > > > > > > > So using the first 4kB for the *framebuffer* as done by the GOP will > > > > > > > not cause any major problems (freezes, hangs, etc.), and commit > > > > > > > d43537610470 ("drm/i915: skip the first 4k of stolen memory on > > > > > > > everything >= gen8") was correct in deducing that the problem was > > > > > > > likely that some *vital* information was being stored i the first 4k > > > > > > > and that go overwritten. > > > > > > > > > > > > > > But the contents of the (first lines of) the framebuffer may become > > > > > > > corrupted once we actually start using the command-streamers, which > > > > > > > is still very much not wanted. > > > > > > > > > > > > > > In practice Xorg or Wayland will likely have setup another framebuffer > > > > > > > by the time the command-streamers will start to get used. > > > > > > > > > > > > > > Alternatively we could start with inheriting the BIOS framebuffer > > > > > > > (as my patch allows) so that we don't get the flicker and then soon > > > > > > > afterwards atomically transit to a new framebuffer (which should > > > > > > > contain a copy of the BIOS fb contents) at a different location. > > > > > > > > > > > > What I suggested long ago was to copy just the first page and adjust the > > > > > > sg list. But I'm not sure if our stolen gem code would be happy with an > > > > > > sg list with two entries instead of one. > > > > > > > > > > But that would still require an atomic-modeset to install the new sg-list, > > > > > right? > > > > > > > > Perhaps not. Not sure if the pte update would be atomic enough to just > > > > change it underneath the display engine without ill effects, and then > > > > do the equivalent of a page flip to invalidate the TLBs. > > > > > > > > > Then we might just as well simply alloc a new fb and copy the > > > > > contents over, or are you worried that with say a 4k fb that takes too > > > > > much time? FWIW I can see how the single memcpy this involves will take > > > > > some time, but I don't take it will take so long as to be a problem. > > > > > > > > Mainly just a question of keeping it in stolen. > > > > > > Ah I see. > > > > > > > Assuming we want to keep > > > > things in stolen, which is a matter of some debate as FBC needs stolen > > > > and people might not be happy if it's all taken up by fbdev. > > > > > > > > > > > > > > Anyways I could use some help with implementing either solution as I'm > > > > > not familiar with the involved parts of the code. I will happily test > > > > > a patch for this. Keep in mind that for this to work my original patch > > > > > will also be necessary so that the initial takeover of the firmware > > > > > fb will work. > > > > > > > > I guess the trickiest part would be getting both the old and new > > > > location of the page mapped in the ggtt at the same time. Sadly you're > > > > not allowed to access stolen directly. So I suppose this part would > > > > involve some fairly low level frobbing of the ggtt ptes and a > > > > manual ioremap() of the matching ranges of the aperture. > > > > > > Hmm, you're talking about what needs to be done to copy the contents here, > > > right? > > > > Yeah. > > So thinking more about this, for the old / BIOS framebuffer there > already is a mapping setup for efifb use and for the new one we > should also already set up a mapping when we create it, so I think > we can really just do a memcpy here after creating a new framebuffer? > > Anyways I will run the tests Daniel asked me to run first. Yeah let's not invent solutions for problems we might not even have :-) -Daniel
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c index af915d041281..ad949cc30928 100644 --- a/drivers/gpu/drm/i915/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv, if (!drm_mm_initialized(&dev_priv->mm.stolen)) return -ENODEV; + /* WaSkipStolenMemoryFirstPage:bdw+ */ + if (INTEL_GEN(dev_priv) >= 8 && start < 4096) + start = 4096; + mutex_lock(&dev_priv->mm.stolen_lock); ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node, size, alignment, 0, @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv) { resource_size_t reserved_base, stolen_top; resource_size_t reserved_total, reserved_size; - resource_size_t stolen_usable_start; mutex_init(&dev_priv->mm.stolen_lock); @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv) (u64)resource_size(&dev_priv->dsm) >> 10, ((u64)resource_size(&dev_priv->dsm) - reserved_total) >> 10); - stolen_usable_start = 0; - /* WaSkipStolenMemoryFirstPage:bdw+ */ - if (INTEL_GEN(dev_priv) >= 8) - stolen_usable_start = 4096; - dev_priv->stolen_usable_size = - resource_size(&dev_priv->dsm) - reserved_total - stolen_usable_start; + resource_size(&dev_priv->dsm) - reserved_total; /* Basic memrange allocator for stolen space. */ - drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start, - dev_priv->stolen_usable_size); + drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->stolen_usable_size); return 0; }
Before this commit the WaSkipStolenMemoryFirstPage workaround code was skipping the first 4k by passing 4096 as start of the address range passed to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and reserve the firmware framebuffer so that we can inherit it would always fail, as the firmware framebuffer starts at address 0. Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on everything >= gen8") says in its commit message: "This is confirmed to fix Skylake screen flickering issues (probably caused by the fact that we initialized a ring in the first page of stolen, but I didn't 100% confirm this theory)." Which suggests that it is safe to use the first page for a linear framebuffer as the firmware is doing. This commit always passes 0 as start to drm_mm_init() and works around WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range() by insuring the start address passed by to drm_mm_insert_node_in_range() is always 4k or more. All entry points to i915_gem_stolen.c go through i915_gem_stolen_insert_node_in_range(), so that any newly allocated objects such as ring-buffers will not be allocated in the first 4k. The one exception is i915_gem_object_create_stolen_for_preallocated() which directly calls drm_mm_reserve_node() which now will be able to use the first 4k. This fixes the i915 driver no longer being able to inherit the firmware framebuffer on gen8+, which fixes the video output changing from the vendor logo to a black screen as soon as the i915 driver is loaded (on systems without fbcon). Signed-off-by: Hans de Goede <hdegoede@redhat.com> --- drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)