mbox series

[0/4] drm/amd/display: Base changes for isolating FPU operation in a single place

Message ID 20210331122502.1031073-1-Rodrigo.Siqueira@amd.com (mailing list archive)
Headers show
Series drm/amd/display: Base changes for isolating FPU operation in a single place | expand

Message

Rodrigo Siqueira Jordao March 31, 2021, 12:24 p.m. UTC
Hi,

In the display core, we utilize floats and doubles units for calculating
modesetting parameters. One side effect of our approach to use double-precision
is the fact that we spread multiple FPU access across our driver, which means
that we can accidentally clobber user space FPU state.

# Challenges

1. Keep in mind that this FPU code is ingrained in our display driver and
performs several crucial tasks. Additionally, we already have multiple
architectures available in the kernel and a large set of users; in other words,
we prefer to avoid a radical approach that might break our user's system.

2. We share our display code with Windows; thus, we need to maintain the
interoperability between these two systems.

3. We need a mechanism for identifying which function uses FPU registers;
fortunately, Peter Zijlstra wrote a series a couple of months ago where he
introduced an FPU check for objtool. I used the following command for
identifying the potential FPU usage:

 ./tools/objtool/objtool check -Ffa "drivers/gpu/drm/amd/display/dc/ANY_FILE.o"

4. Since our code heavily relies on FPU and the fact that we spread
kernel_fpu_begin/end across multiple functions, we can have some complex
scenarios that will require code refactoring. However, we want to avoid
complicated changes since this is a formula to introduce regressions; we want
something that allows us to fix it in small, safe, and reliable steps.

# Our approach

For trying to solve this problem, we came up with the following strategy:

1. Keep in mind that we are using kernel_fpu_begin/end spread in various areas
and sometimes across multiple functions. If we try to move some of the
functions to an isolated place, we can generate a situation where we can call
the FPU protection more than once, causing multiple warnings. We can deal with
this problem by adding a thin management layer around the kernel_fpu_begin/end
used inside the display.

2. We will need a trace mechanism for this FPU management inside our display
code.

3. After we get the thin layer that manages FPU, we can start to move each
function that uses FPU to the centralized place. Our DQE runs multiple tests in
different ASICs every week; we can take advantage of this to ensure that our
FPU patches work does not introduce any regression. The idea is to work on a
specific part of the code every week (e.g.,  week 1: DCN2, week 1: DCN2.1,
etc.).

4. Finally, after we can isolate the FPU operations in a single place, we can
altogether remove the FPU flags from other files and eliminate an unnecessary
code introduced to deal with this problem.

# This series

To maintain the interoperability between multiple OSes, we already have a
define named DC_FP_START/END, which is a straightforward wrapper to
kernel_fpu_begin/end in the Linux side. In this series, I decided to expand the
scope of this DC_FP_* wrapper to trace FPU entrance and exit in the display
code, but I also add a mechanism for managing the entrance and exit of
kernel_fpu_begin/end. You can see the details on how I did that in the last two
patches.

I also isolate a simple function that requires FPU access to demonstrate my
strategy for isolating this FPU access in a single place. If this series gets
accepted, the following steps consist of moving all FPU functions weekly until
we isolate everything in the fpu_operation folder.

Best Regards
Rodrigo Siqueira


Rodrigo Siqueira (4):
  drm/amd/display: Introduce FPU directory inside DC
  drm/amd/display: Add FPU event trace
  drm/amd/display: Add ref count control for FPU utilization
  drm/amd/display: Add DC_FP helper to check FPU state

 .../gpu/drm/amd/display/amdgpu_dm/Makefile    |   3 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_trace.h   |  24 ++++
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c    | 111 ++++++++++++++++++
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h    |  34 ++++++
 drivers/gpu/drm/amd/display/dc/Makefile       |   1 +
 drivers/gpu/drm/amd/display/dc/dc_trace.h     |   3 +
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |  41 +------
 .../drm/amd/display/dc/dcn20/dcn20_resource.h |   2 -
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |   2 +
 .../amd/display/dc/fpu_operations/Makefile    |  58 +++++++++
 .../drm/amd/display/dc/fpu_operations/dcn2x.c | 106 +++++++++++++++++
 .../drm/amd/display/dc/fpu_operations/dcn2x.h |  33 ++++++
 drivers/gpu/drm/amd/display/dc/os_types.h     |   6 +-
 13 files changed, 381 insertions(+), 43 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
 create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/Makefile
 create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.h

Comments

Christian König March 31, 2021, 12:49 p.m. UTC | #1
Hi Rodrigo,

I'm not so happy about the whole recursion thing, but I think that is 
something which can be worked on later on.

Apart from that the approach sounds solid to me.

Regards,
Christian.

Am 31.03.21 um 14:24 schrieb Rodrigo Siqueira:
> Hi,
>
> In the display core, we utilize floats and doubles units for calculating
> modesetting parameters. One side effect of our approach to use double-precision
> is the fact that we spread multiple FPU access across our driver, which means
> that we can accidentally clobber user space FPU state.
>
> # Challenges
>
> 1. Keep in mind that this FPU code is ingrained in our display driver and
> performs several crucial tasks. Additionally, we already have multiple
> architectures available in the kernel and a large set of users; in other words,
> we prefer to avoid a radical approach that might break our user's system.
>
> 2. We share our display code with Windows; thus, we need to maintain the
> interoperability between these two systems.
>
> 3. We need a mechanism for identifying which function uses FPU registers;
> fortunately, Peter Zijlstra wrote a series a couple of months ago where he
> introduced an FPU check for objtool. I used the following command for
> identifying the potential FPU usage:
>
>   ./tools/objtool/objtool check -Ffa "drivers/gpu/drm/amd/display/dc/ANY_FILE.o"
>
> 4. Since our code heavily relies on FPU and the fact that we spread
> kernel_fpu_begin/end across multiple functions, we can have some complex
> scenarios that will require code refactoring. However, we want to avoid
> complicated changes since this is a formula to introduce regressions; we want
> something that allows us to fix it in small, safe, and reliable steps.
>
> # Our approach
>
> For trying to solve this problem, we came up with the following strategy:
>
> 1. Keep in mind that we are using kernel_fpu_begin/end spread in various areas
> and sometimes across multiple functions. If we try to move some of the
> functions to an isolated place, we can generate a situation where we can call
> the FPU protection more than once, causing multiple warnings. We can deal with
> this problem by adding a thin management layer around the kernel_fpu_begin/end
> used inside the display.
>
> 2. We will need a trace mechanism for this FPU management inside our display
> code.
>
> 3. After we get the thin layer that manages FPU, we can start to move each
> function that uses FPU to the centralized place. Our DQE runs multiple tests in
> different ASICs every week; we can take advantage of this to ensure that our
> FPU patches work does not introduce any regression. The idea is to work on a
> specific part of the code every week (e.g.,  week 1: DCN2, week 1: DCN2.1,
> etc.).
>
> 4. Finally, after we can isolate the FPU operations in a single place, we can
> altogether remove the FPU flags from other files and eliminate an unnecessary
> code introduced to deal with this problem.
>
> # This series
>
> To maintain the interoperability between multiple OSes, we already have a
> define named DC_FP_START/END, which is a straightforward wrapper to
> kernel_fpu_begin/end in the Linux side. In this series, I decided to expand the
> scope of this DC_FP_* wrapper to trace FPU entrance and exit in the display
> code, but I also add a mechanism for managing the entrance and exit of
> kernel_fpu_begin/end. You can see the details on how I did that in the last two
> patches.
>
> I also isolate a simple function that requires FPU access to demonstrate my
> strategy for isolating this FPU access in a single place. If this series gets
> accepted, the following steps consist of moving all FPU functions weekly until
> we isolate everything in the fpu_operation folder.
>
> Best Regards
> Rodrigo Siqueira
>
>
> Rodrigo Siqueira (4):
>    drm/amd/display: Introduce FPU directory inside DC
>    drm/amd/display: Add FPU event trace
>    drm/amd/display: Add ref count control for FPU utilization
>    drm/amd/display: Add DC_FP helper to check FPU state
>
>   .../gpu/drm/amd/display/amdgpu_dm/Makefile    |   3 +-
>   .../amd/display/amdgpu_dm/amdgpu_dm_trace.h   |  24 ++++
>   .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c    | 111 ++++++++++++++++++
>   .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h    |  34 ++++++
>   drivers/gpu/drm/amd/display/dc/Makefile       |   1 +
>   drivers/gpu/drm/amd/display/dc/dc_trace.h     |   3 +
>   .../drm/amd/display/dc/dcn20/dcn20_resource.c |  41 +------
>   .../drm/amd/display/dc/dcn20/dcn20_resource.h |   2 -
>   .../drm/amd/display/dc/dcn21/dcn21_resource.c |   2 +
>   .../amd/display/dc/fpu_operations/Makefile    |  58 +++++++++
>   .../drm/amd/display/dc/fpu_operations/dcn2x.c | 106 +++++++++++++++++
>   .../drm/amd/display/dc/fpu_operations/dcn2x.h |  33 ++++++
>   drivers/gpu/drm/amd/display/dc/os_types.h     |   6 +-
>   13 files changed, 381 insertions(+), 43 deletions(-)
>   create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
>   create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/Makefile
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.h
>