[00/10] drm/vkms: rework crc worker

Message ID	20190606222751.32567-1-daniel.vetter@ffwll.ch (mailing list archive)
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> From: Daniel Vetter <daniel.vetter@ffwll.ch> To: DRI Development <dri-devel@lists.freedesktop.org> Subject: [PATCH 00/10] drm/vkms: rework crc worker Date: Fri, 7 Jun 2019 00:27:41 +0200 Message-Id: <20190606222751.32567-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 X-Mailman-Original-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=2jYTwmO+POOJhTxQ1RaidZH+aZqgLhTZnC2sZyUydQc=; b=Seg676epj0Ntn6V+ieat08HJl92LXuNQ0VKDdRjIjTifR1zrLNIrs6iKuZtOEgaeK5 4600K+yf9gd5ZdSIWXkn+twnbOGzgGnNSg/x6vIWCgvdg8kBxt1SbI97rpLTfcRV4m+N KClw7K5R5KMJC8yjW77VKMburlb1Z/7EQTYSQ= Precedence: list Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	drm/vkms: rework crc worker \| expand [00/10] drm/vkms: rework crc worker [01/10] drm/vkms: Fix crc worker races [02/10] drm/vkms: Use spin_lock_irq in process context [03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock [04/10] drm/vkms: Move format arrays to vkms_plane.c [05/10] drm/vkms: Add our own commit_tail [06/10] drm/vkms: flush crc workers earlier in commit flow [07/10] drm/vkms: Dont flush crc worker when we change crc status [08/10] drm/vkms: No _irqsave within spin_lock_irq needed [09/10] drm/vkms: totally reworked crc data tracking [10/10] drm/vkms: No need for ->pages_lock in crc work anymore

Daniel Vetter June 6, 2019, 10:27 p.m. UTC

Hi all,

This here is the first part of a rework for the vkms crc worker. I think
this should fix all the locking/races/use-after-free issues I spotted in
the code. There's more work we can do in the future as a follow-up:

- directly access vkms_plane_state->base in the crc worker, with this
  approach in this series here that should be safe now. Follow-up patches
  could switch and remove all the separate crc_data infrastructure.

- I think some kerneldoc for vkms structures would be nice. Documentation
  the various functions is probably overkill.

- Implementing a more generic blending engine, as prep for adding more
  pixel formats, more planes, and more features in general.

Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
things worse, but I didn't do a full run.

Cheers, Daniel

Daniel Vetter (10):
  drm/vkms: Fix crc worker races
  drm/vkms: Use spin_lock_irq in process context
  drm/vkms: Rename vkms_output.state_lock to crc_lock
  drm/vkms: Move format arrays to vkms_plane.c
  drm/vkms: Add our own commit_tail
  drm/vkms: flush crc workers earlier in commit flow
  drm/vkms: Dont flush crc worker when we change crc status
  drm/vkms: No _irqsave within spin_lock_irq needed
  drm/vkms: totally reworked crc data tracking
  drm/vkms: No need for ->pages_lock in crc work anymore

 drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
 drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
 drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
 drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
 drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
 5 files changed, 146 insertions(+), 75 deletions(-)

Rodrigo Siqueira June 12, 2019, 1:28 p.m. UTC | #1

Hi Daniel,

First of all, thank you very much for your patchset.

I tried to make a detailed review of your series, and you can see my
comments in each patch. You’ll notice that I asked many things related
to the DRM subsystem with the hope that I could learn a little bit
more about DRM from your comments.

Before you go through the review, I would like to start a discussion
about the vkms race conditions. First, I have to admit that I did not
understand the race conditions that you described before because I
couldn’t reproduce them. Now, I'm suspecting that I could not
experience the problem because I'm using QEMU with KVM; with this idea
in mind, I suppose that we have two scenarios for using vkms in a
virtual machine:

* Scenario 1: The user has hardware virtualization support; in this
case, it is a little bit harder to experience race conditions with
vkms.

* Scenario 2: Without hardware virtualization support, it is much
easier to experience race conditions.

With these two scenarios in mind, I conducted a bunch of experiments
for trying to shed light on this issue. I did:

1. Enabled lockdep

2. Defined two different environments for testing both using QEMU with
and without kvm. See below the QEMU hardware options:

* env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
* env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4

3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
turn off the vm.

4. From the lockdep_stat, I just highlighted the row related to vkms
and the columns holdtime-total and holdtime-avg

I would like to highlight that the following data does not have any
statistical approach, and its intention is solely to assist our
discussion. See below the summary of the collected data:

Summary of the experiment results:

+----------------+----------------+----------------+
|                |     env_kvm    |   env_no_kvm   |
+                +----------------+----------------+
| Test           | Before | After | Before | After |
+----------------+--------+-------+--------+-------+
| kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
+----------------+--------+-------+--------+-------+

* Before: before apply this patchset
* After: after apply this patchset

-----------------------------------------+------------------+-----------
S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
-----------------------------------------+----------------+-------------
&(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
(work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
(wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
&(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
(work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
-----------------------------------------+----------------+-------------
S2: With this patchset and with kvm      |                |
-----------------------------------------+----------------+-------------
&(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
(work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
&(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
(wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
(work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
-----------------------------------------+----------------+-------------
M1: Without this patchset and without kvm|                |
-----------------------------------------+----------------+-------------
&(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
&(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
(work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
(wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
(work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
-----------------------------------------+----------------+-------------
M2: With this patchset and without kvm   |                |
-----------------------------------------+----------------+-------------
(wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
&(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
(work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
&(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
(work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31

First, I analyzed the scenarios with KVM (S1 and S2); more
specifically, I focused on these two classes:

1. (work_completion)(&vkms_state->crc_wo
2. (work_completion)(&vkms_state->crc_#2

After taking a look at the data, it looks like that this patchset
greatly reduces the hold time average for crc_wo. On the other hand,
it increases the hold time average for crc_#2. I didn’t understand
well the reason for the difference. Could you help me here?

When I looked for the second set of scenarios (M1 and M2, both without
KVM), the results look much more distant; basically, this patchset
increased the hold time average. Again, could you help me understand a
little bit better this issue?

Finally, after these tests, I have some questions:

1. VKMS is a software-only driver; because of this, how about defining
a minimal system resource for using it?

2. During my experiments, I noticed that running tests with a VM that
uses KVM had consistent results. For example, kms_flip never fails
with QEMU+KVM; however, without KVM, two or three tests failed (one of
them looks random). If we use vkms for test DRM stuff, should we
recommend the use of KVM?

Best Regards

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Hi all,
>
> This here is the first part of a rework for the vkms crc worker. I think
> this should fix all the locking/races/use-after-free issues I spotted in
> the code. There's more work we can do in the future as a follow-up:
>
> - directly access vkms_plane_state->base in the crc worker, with this
>   approach in this series here that should be safe now. Follow-up patches
>   could switch and remove all the separate crc_data infrastructure.
>
> - I think some kerneldoc for vkms structures would be nice. Documentation
>   the various functions is probably overkill.
>
> - Implementing a more generic blending engine, as prep for adding more
>   pixel formats, more planes, and more features in general.
>
> Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> things worse, but I didn't do a full run.
>
> Cheers, Daniel
>
> Daniel Vetter (10):
>   drm/vkms: Fix crc worker races
>   drm/vkms: Use spin_lock_irq in process context
>   drm/vkms: Rename vkms_output.state_lock to crc_lock
>   drm/vkms: Move format arrays to vkms_plane.c
>   drm/vkms: Add our own commit_tail
>   drm/vkms: flush crc workers earlier in commit flow
>   drm/vkms: Dont flush crc worker when we change crc status
>   drm/vkms: No _irqsave within spin_lock_irq needed
>   drm/vkms: totally reworked crc data tracking
>   drm/vkms: No need for ->pages_lock in crc work anymore
>
>  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
>  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
>  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
>  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
>  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
>  5 files changed, 146 insertions(+), 75 deletions(-)
>
> --
> 2.20.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Daniel Vetter June 12, 2019, 2:42 p.m. UTC | #2

On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> Hi Daniel,
> 
> First of all, thank you very much for your patchset.
> 
> I tried to make a detailed review of your series, and you can see my
> comments in each patch. You’ll notice that I asked many things related
> to the DRM subsystem with the hope that I could learn a little bit
> more about DRM from your comments.
> 
> Before you go through the review, I would like to start a discussion
> about the vkms race conditions. First, I have to admit that I did not
> understand the race conditions that you described before because I
> couldn’t reproduce them. Now, I'm suspecting that I could not
> experience the problem because I'm using QEMU with KVM; with this idea
> in mind, I suppose that we have two scenarios for using vkms in a
> virtual machine:
> 
> * Scenario 1: The user has hardware virtualization support; in this
> case, it is a little bit harder to experience race conditions with
> vkms.
> 
> * Scenario 2: Without hardware virtualization support, it is much
> easier to experience race conditions.
> 
> With these two scenarios in mind, I conducted a bunch of experiments
> for trying to shed light on this issue. I did:
> 
> 1. Enabled lockdep
> 
> 2. Defined two different environments for testing both using QEMU with
> and without kvm. See below the QEMU hardware options:
> 
> * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> 
> 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> turn off the vm.
> 
> 4. From the lockdep_stat, I just highlighted the row related to vkms
> and the columns holdtime-total and holdtime-avg
> 
> I would like to highlight that the following data does not have any
> statistical approach, and its intention is solely to assist our
> discussion. See below the summary of the collected data:
> 
> Summary of the experiment results:
> 
> +----------------+----------------+----------------+
> |                |     env_kvm    |   env_no_kvm   |
> +                +----------------+----------------+
> | Test           | Before | After | Before | After |
> +----------------+--------+-------+--------+-------+
> | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> +----------------+--------+-------+--------+-------+
> 
> * Before: before apply this patchset
> * After: after apply this patchset
> 
> -----------------------------------------+------------------+-----------
> S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> -----------------------------------------+----------------+-------------
> &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> -----------------------------------------+----------------+-------------
> S2: With this patchset and with kvm      |                |
> -----------------------------------------+----------------+-------------
> &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> -----------------------------------------+----------------+-------------
> M1: Without this patchset and without kvm|                |
> -----------------------------------------+----------------+-------------
> &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> -----------------------------------------+----------------+-------------
> M2: With this patchset and without kvm   |                |
> -----------------------------------------+----------------+-------------
> (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> 
> First, I analyzed the scenarios with KVM (S1 and S2); more
> specifically, I focused on these two classes:
> 
> 1. (work_completion)(&vkms_state->crc_wo
> 2. (work_completion)(&vkms_state->crc_#2
> 
> After taking a look at the data, it looks like that this patchset
> greatly reduces the hold time average for crc_wo. On the other hand,
> it increases the hold time average for crc_#2. I didn’t understand
> well the reason for the difference. Could you help me here?

So there's two real locks here from our code, the ones you can see as
spinlocks:

&(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
&(&vkms_out->lock)->rlock:               |  247190.04     |  39.39

All the others are fake locks that the workqueue adds, which only exist in
lockdep. They are used to catch special kinds of deadlocks like the below:

thread A:
1. mutex_lock(mutex_A)
2. flush_work(work_A)

thread B
1. running work_A:
2. mutex_lock(mutex_A)

thread B can't continue becuase mutex_A is already held by thread A.
thread A can't continue because thread B is blocked and the work never
finishes.
-> deadlock

I haven't checked which is which, but essentially what you measure with
the hold times of these fake locks is how long a work execution takes on
average.

Since my patches are supposed to fix races where the worker can't keep up
with the vblank hrtimer, then the average worker will (probably) do more,
so that going up is expected. I think.

I'm honestly not sure what's going on here, never looking into this in
detail.

> When I looked for the second set of scenarios (M1 and M2, both without
> KVM), the results look much more distant; basically, this patchset
> increased the hold time average. Again, could you help me understand a
> little bit better this issue?
> 
> Finally, after these tests, I have some questions:
> 
> 1. VKMS is a software-only driver; because of this, how about defining
> a minimal system resource for using it?

No idea, in reality it's probably "if it fails too often, buy faster cpu".
I do think we should make the code robust against a slow cpu, since atm
that's needed even on pretty fast machines (because our blending code is
really, really slow and inefficient).

> 2. During my experiments, I noticed that running tests with a VM that
> uses KVM had consistent results. For example, kms_flip never fails
> with QEMU+KVM; however, without KVM, two or three tests failed (one of
> them looks random). If we use vkms for test DRM stuff, should we
> recommend the use of KVM?

What do you mean without kvm? In general running without kvm shouldn't be
slower, so I'm a bit confused ... I'm running vgem directly on the
machine, by booting into new kernels (and controlling the machine over the
network).
-Daniel

> Best Regards
> 
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > Hi all,
> >
> > This here is the first part of a rework for the vkms crc worker. I think
> > this should fix all the locking/races/use-after-free issues I spotted in
> > the code. There's more work we can do in the future as a follow-up:
> >
> > - directly access vkms_plane_state->base in the crc worker, with this
> >   approach in this series here that should be safe now. Follow-up patches
> >   could switch and remove all the separate crc_data infrastructure.
> >
> > - I think some kerneldoc for vkms structures would be nice. Documentation
> >   the various functions is probably overkill.
> >
> > - Implementing a more generic blending engine, as prep for adding more
> >   pixel formats, more planes, and more features in general.
> >
> > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > things worse, but I didn't do a full run.
> >
> > Cheers, Daniel
> >
> > Daniel Vetter (10):
> >   drm/vkms: Fix crc worker races
> >   drm/vkms: Use spin_lock_irq in process context
> >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> >   drm/vkms: Move format arrays to vkms_plane.c
> >   drm/vkms: Add our own commit_tail
> >   drm/vkms: flush crc workers earlier in commit flow
> >   drm/vkms: Dont flush crc worker when we change crc status
> >   drm/vkms: No _irqsave within spin_lock_irq needed
> >   drm/vkms: totally reworked crc data tracking
> >   drm/vkms: No need for ->pages_lock in crc work anymore
> >
> >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> >  5 files changed, 146 insertions(+), 75 deletions(-)
> >
> > --
> > 2.20.1
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

Rodrigo Siqueira June 18, 2019, 2:49 a.m. UTC | #3

On 06/12, Daniel Vetter wrote:
> On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > Hi Daniel,
> > 
> > First of all, thank you very much for your patchset.
> > 
> > I tried to make a detailed review of your series, and you can see my
> > comments in each patch. You’ll notice that I asked many things related
> > to the DRM subsystem with the hope that I could learn a little bit
> > more about DRM from your comments.
> > 
> > Before you go through the review, I would like to start a discussion
> > about the vkms race conditions. First, I have to admit that I did not
> > understand the race conditions that you described before because I
> > couldn’t reproduce them. Now, I'm suspecting that I could not
> > experience the problem because I'm using QEMU with KVM; with this idea
> > in mind, I suppose that we have two scenarios for using vkms in a
> > virtual machine:
> > 
> > * Scenario 1: The user has hardware virtualization support; in this
> > case, it is a little bit harder to experience race conditions with
> > vkms.
> > 
> > * Scenario 2: Without hardware virtualization support, it is much
> > easier to experience race conditions.
> > 
> > With these two scenarios in mind, I conducted a bunch of experiments
> > for trying to shed light on this issue. I did:
> > 
> > 1. Enabled lockdep
> > 
> > 2. Defined two different environments for testing both using QEMU with
> > and without kvm. See below the QEMU hardware options:
> > 
> > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > 
> > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > turn off the vm.
> > 
> > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > and the columns holdtime-total and holdtime-avg
> > 
> > I would like to highlight that the following data does not have any
> > statistical approach, and its intention is solely to assist our
> > discussion. See below the summary of the collected data:
> > 
> > Summary of the experiment results:
> > 
> > +----------------+----------------+----------------+
> > |                |     env_kvm    |   env_no_kvm   |
> > +                +----------------+----------------+
> > | Test           | Before | After | Before | After |
> > +----------------+--------+-------+--------+-------+
> > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > +----------------+--------+-------+--------+-------+
> > 
> > * Before: before apply this patchset
> > * After: after apply this patchset
> > 
> > -----------------------------------------+------------------+-----------
> > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > -----------------------------------------+----------------+-------------
> > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > -----------------------------------------+----------------+-------------
> > S2: With this patchset and with kvm      |                |
> > -----------------------------------------+----------------+-------------
> > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > -----------------------------------------+----------------+-------------
> > M1: Without this patchset and without kvm|                |
> > -----------------------------------------+----------------+-------------
> > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > -----------------------------------------+----------------+-------------
> > M2: With this patchset and without kvm   |                |
> > -----------------------------------------+----------------+-------------
> > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > 
> > First, I analyzed the scenarios with KVM (S1 and S2); more
> > specifically, I focused on these two classes:
> > 
> > 1. (work_completion)(&vkms_state->crc_wo
> > 2. (work_completion)(&vkms_state->crc_#2
> > 
> > After taking a look at the data, it looks like that this patchset
> > greatly reduces the hold time average for crc_wo. On the other hand,
> > it increases the hold time average for crc_#2. I didn’t understand
> > well the reason for the difference. Could you help me here?
> 
> So there's two real locks here from our code, the ones you can see as
> spinlocks:
> 
> &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> 
> All the others are fake locks that the workqueue adds, which only exist in
> lockdep. They are used to catch special kinds of deadlocks like the below:
> 
> thread A:
> 1. mutex_lock(mutex_A)
> 2. flush_work(work_A)
> 
> thread B
> 1. running work_A:
> 2. mutex_lock(mutex_A)
> 
> thread B can't continue becuase mutex_A is already held by thread A.
> thread A can't continue because thread B is blocked and the work never
> finishes.
> -> deadlock
> 
> I haven't checked which is which, but essentially what you measure with
> the hold times of these fake locks is how long a work execution takes on
> average.
> 
> Since my patches are supposed to fix races where the worker can't keep up
> with the vblank hrtimer, then the average worker will (probably) do more,
> so that going up is expected. I think.
> 
> I'm honestly not sure what's going on here, never looking into this in
> detail.
> 
> > When I looked for the second set of scenarios (M1 and M2, both without
> > KVM), the results look much more distant; basically, this patchset
> > increased the hold time average. Again, could you help me understand a
> > little bit better this issue?
> > 
> > Finally, after these tests, I have some questions:
> > 
> > 1. VKMS is a software-only driver; because of this, how about defining
> > a minimal system resource for using it?
> 
> No idea, in reality it's probably "if it fails too often, buy faster cpu".
> I do think we should make the code robust against a slow cpu, since atm
> that's needed even on pretty fast machines (because our blending code is
> really, really slow and inefficient).
> 
> > 2. During my experiments, I noticed that running tests with a VM that
> > uses KVM had consistent results. For example, kms_flip never fails
> > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > them looks random). If we use vkms for test DRM stuff, should we
> > recommend the use of KVM?
> 
> What do you mean without kvm? In general running without kvm shouldn't be
> slower, so I'm a bit confused ... I'm running vgem directly on the
> machine, by booting into new kernels (and controlling the machine over the
> network).

Sorry, I should have detailed my point.

Basically, I do all my testing with vkms in QEMU VM. In that sense, I
did some experiments which I enabled and disabled the KVM (i.e., flag
'-enable-kvm') to check the vkms behaviour in these two scenarios. I
noticed that the tests are consistent when I use the '-enable-kvm' flag,
in that context it seemed a good idea to recommend the use of KVM for
those users who want to test vkms with igt. Anyway, don't worry about
that I'll try to add more documentation for vkms in the future and in
that time we discuss about this again.

Anyway, from my side, I think we should merge this series. Do you want
to prepare a V2 with the fixes and maybe update the commit messages by
using some of your explanations? Or, if you want, I can fix the tiny
details and merge it.

> -Daniel
> 
> > Best Regards
> > 
> > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > Hi all,
> > >
> > > This here is the first part of a rework for the vkms crc worker. I think
> > > this should fix all the locking/races/use-after-free issues I spotted in
> > > the code. There's more work we can do in the future as a follow-up:
> > >
> > > - directly access vkms_plane_state->base in the crc worker, with this
> > >   approach in this series here that should be safe now. Follow-up patches
> > >   could switch and remove all the separate crc_data infrastructure.
> > >
> > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > >   the various functions is probably overkill.
> > >
> > > - Implementing a more generic blending engine, as prep for adding more
> > >   pixel formats, more planes, and more features in general.
> > >
> > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > things worse, but I didn't do a full run.
> > >
> > > Cheers, Daniel
> > >
> > > Daniel Vetter (10):
> > >   drm/vkms: Fix crc worker races
> > >   drm/vkms: Use spin_lock_irq in process context
> > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > >   drm/vkms: Move format arrays to vkms_plane.c
> > >   drm/vkms: Add our own commit_tail
> > >   drm/vkms: flush crc workers earlier in commit flow
> > >   drm/vkms: Dont flush crc worker when we change crc status
> > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > >   drm/vkms: totally reworked crc data tracking
> > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > >
> > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > >
> > > --
> > > 2.20.1
> > >
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 
> > 
> > 
> > -- 
> > 
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Daniel Vetter June 18, 2019, 8:56 a.m. UTC | #4

On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> On 06/12, Daniel Vetter wrote:
> > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > Hi Daniel,
> > > 
> > > First of all, thank you very much for your patchset.
> > > 
> > > I tried to make a detailed review of your series, and you can see my
> > > comments in each patch. You’ll notice that I asked many things related
> > > to the DRM subsystem with the hope that I could learn a little bit
> > > more about DRM from your comments.
> > > 
> > > Before you go through the review, I would like to start a discussion
> > > about the vkms race conditions. First, I have to admit that I did not
> > > understand the race conditions that you described before because I
> > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > experience the problem because I'm using QEMU with KVM; with this idea
> > > in mind, I suppose that we have two scenarios for using vkms in a
> > > virtual machine:
> > > 
> > > * Scenario 1: The user has hardware virtualization support; in this
> > > case, it is a little bit harder to experience race conditions with
> > > vkms.
> > > 
> > > * Scenario 2: Without hardware virtualization support, it is much
> > > easier to experience race conditions.
> > > 
> > > With these two scenarios in mind, I conducted a bunch of experiments
> > > for trying to shed light on this issue. I did:
> > > 
> > > 1. Enabled lockdep
> > > 
> > > 2. Defined two different environments for testing both using QEMU with
> > > and without kvm. See below the QEMU hardware options:
> > > 
> > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > 
> > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > turn off the vm.
> > > 
> > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > and the columns holdtime-total and holdtime-avg
> > > 
> > > I would like to highlight that the following data does not have any
> > > statistical approach, and its intention is solely to assist our
> > > discussion. See below the summary of the collected data:
> > > 
> > > Summary of the experiment results:
> > > 
> > > +----------------+----------------+----------------+
> > > |                |     env_kvm    |   env_no_kvm   |
> > > +                +----------------+----------------+
> > > | Test           | Before | After | Before | After |
> > > +----------------+--------+-------+--------+-------+
> > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > +----------------+--------+-------+--------+-------+
> > > 
> > > * Before: before apply this patchset
> > > * After: after apply this patchset
> > > 
> > > -----------------------------------------+------------------+-----------
> > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > -----------------------------------------+----------------+-------------
> > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > -----------------------------------------+----------------+-------------
> > > S2: With this patchset and with kvm      |                |
> > > -----------------------------------------+----------------+-------------
> > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > -----------------------------------------+----------------+-------------
> > > M1: Without this patchset and without kvm|                |
> > > -----------------------------------------+----------------+-------------
> > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > -----------------------------------------+----------------+-------------
> > > M2: With this patchset and without kvm   |                |
> > > -----------------------------------------+----------------+-------------
> > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > 
> > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > specifically, I focused on these two classes:
> > > 
> > > 1. (work_completion)(&vkms_state->crc_wo
> > > 2. (work_completion)(&vkms_state->crc_#2
> > > 
> > > After taking a look at the data, it looks like that this patchset
> > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > it increases the hold time average for crc_#2. I didn’t understand
> > > well the reason for the difference. Could you help me here?
> > 
> > So there's two real locks here from our code, the ones you can see as
> > spinlocks:
> > 
> > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > 
> > All the others are fake locks that the workqueue adds, which only exist in
> > lockdep. They are used to catch special kinds of deadlocks like the below:
> > 
> > thread A:
> > 1. mutex_lock(mutex_A)
> > 2. flush_work(work_A)
> > 
> > thread B
> > 1. running work_A:
> > 2. mutex_lock(mutex_A)
> > 
> > thread B can't continue becuase mutex_A is already held by thread A.
> > thread A can't continue because thread B is blocked and the work never
> > finishes.
> > -> deadlock
> > 
> > I haven't checked which is which, but essentially what you measure with
> > the hold times of these fake locks is how long a work execution takes on
> > average.
> > 
> > Since my patches are supposed to fix races where the worker can't keep up
> > with the vblank hrtimer, then the average worker will (probably) do more,
> > so that going up is expected. I think.
> > 
> > I'm honestly not sure what's going on here, never looking into this in
> > detail.
> > 
> > > When I looked for the second set of scenarios (M1 and M2, both without
> > > KVM), the results look much more distant; basically, this patchset
> > > increased the hold time average. Again, could you help me understand a
> > > little bit better this issue?
> > > 
> > > Finally, after these tests, I have some questions:
> > > 
> > > 1. VKMS is a software-only driver; because of this, how about defining
> > > a minimal system resource for using it?
> > 
> > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > I do think we should make the code robust against a slow cpu, since atm
> > that's needed even on pretty fast machines (because our blending code is
> > really, really slow and inefficient).
> > 
> > > 2. During my experiments, I noticed that running tests with a VM that
> > > uses KVM had consistent results. For example, kms_flip never fails
> > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > them looks random). If we use vkms for test DRM stuff, should we
> > > recommend the use of KVM?
> > 
> > What do you mean without kvm? In general running without kvm shouldn't be
> > slower, so I'm a bit confused ... I'm running vgem directly on the
> > machine, by booting into new kernels (and controlling the machine over the
> > network).
> 
> Sorry, I should have detailed my point.
> 
> Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> did some experiments which I enabled and disabled the KVM (i.e., flag
> '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> noticed that the tests are consistent when I use the '-enable-kvm' flag,
> in that context it seemed a good idea to recommend the use of KVM for
> those users who want to test vkms with igt. Anyway, don't worry about
> that I'll try to add more documentation for vkms in the future and in
> that time we discuss about this again.

Ah, qemu without kvm is going to use software emulation for a lot of the
kernel stuff. That's going to be terribly slow indeed.

> Anyway, from my side, I think we should merge this series. Do you want
> to prepare a V2 with the fixes and maybe update the commit messages by
> using some of your explanations? Or, if you want, I can fix the tiny
> details and merge it.

I'm deeply burried in my prime cleanup/doc series right now, so will take
a few days until I get around to this again. If you want, please go ahead
with merging.

btw if you edit a patch when merging, please add a comment about that to
the commit message. And ime it's best to only augment the commit message
and maybe delete an unused variable or so that got forgotten. For
everything more better to do the edits and resubmit.

Thanks, Daniel

> 
> > -Daniel
> > 
> > > Best Regards
> > > 
> > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > the code. There's more work we can do in the future as a follow-up:
> > > >
> > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > >   approach in this series here that should be safe now. Follow-up patches
> > > >   could switch and remove all the separate crc_data infrastructure.
> > > >
> > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > >   the various functions is probably overkill.
> > > >
> > > > - Implementing a more generic blending engine, as prep for adding more
> > > >   pixel formats, more planes, and more features in general.
> > > >
> > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > things worse, but I didn't do a full run.
> > > >
> > > > Cheers, Daniel
> > > >
> > > > Daniel Vetter (10):
> > > >   drm/vkms: Fix crc worker races
> > > >   drm/vkms: Use spin_lock_irq in process context
> > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > >   drm/vkms: Add our own commit_tail
> > > >   drm/vkms: flush crc workers earlier in commit flow
> > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > >   drm/vkms: totally reworked crc data tracking
> > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > >
> > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > >
> > > > --
> > > > 2.20.1
> > > >
> > > > _______________________________________________
> > > > dri-devel mailing list
> > > > dri-devel@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > 
> > > 
> > > 
> > > -- 
> > > 
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> -- 
> Rodrigo Siqueira
> https://siqueira.tech

Rodrigo Siqueira June 18, 2019, 9:54 p.m. UTC | #5

On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > On 06/12, Daniel Vetter wrote:
> > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > Hi Daniel,
> > > >
> > > > First of all, thank you very much for your patchset.
> > > >
> > > > I tried to make a detailed review of your series, and you can see my
> > > > comments in each patch. You’ll notice that I asked many things related
> > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > more about DRM from your comments.
> > > >
> > > > Before you go through the review, I would like to start a discussion
> > > > about the vkms race conditions. First, I have to admit that I did not
> > > > understand the race conditions that you described before because I
> > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > virtual machine:
> > > >
> > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > case, it is a little bit harder to experience race conditions with
> > > > vkms.
> > > >
> > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > easier to experience race conditions.
> > > >
> > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > for trying to shed light on this issue. I did:
> > > >
> > > > 1. Enabled lockdep
> > > >
> > > > 2. Defined two different environments for testing both using QEMU with
> > > > and without kvm. See below the QEMU hardware options:
> > > >
> > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > >
> > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > turn off the vm.
> > > >
> > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > and the columns holdtime-total and holdtime-avg
> > > >
> > > > I would like to highlight that the following data does not have any
> > > > statistical approach, and its intention is solely to assist our
> > > > discussion. See below the summary of the collected data:
> > > >
> > > > Summary of the experiment results:
> > > >
> > > > +----------------+----------------+----------------+
> > > > |                |     env_kvm    |   env_no_kvm   |
> > > > +                +----------------+----------------+
> > > > | Test           | Before | After | Before | After |
> > > > +----------------+--------+-------+--------+-------+
> > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > +----------------+--------+-------+--------+-------+
> > > >
> > > > * Before: before apply this patchset
> > > > * After: after apply this patchset
> > > >
> > > > -----------------------------------------+------------------+-----------
> > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > -----------------------------------------+----------------+-------------
> > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > -----------------------------------------+----------------+-------------
> > > > S2: With this patchset and with kvm      |                |
> > > > -----------------------------------------+----------------+-------------
> > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > -----------------------------------------+----------------+-------------
> > > > M1: Without this patchset and without kvm|                |
> > > > -----------------------------------------+----------------+-------------
> > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > -----------------------------------------+----------------+-------------
> > > > M2: With this patchset and without kvm   |                |
> > > > -----------------------------------------+----------------+-------------
> > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > >
> > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > specifically, I focused on these two classes:
> > > >
> > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > 2. (work_completion)(&vkms_state->crc_#2
> > > >
> > > > After taking a look at the data, it looks like that this patchset
> > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > well the reason for the difference. Could you help me here?
> > >
> > > So there's two real locks here from our code, the ones you can see as
> > > spinlocks:
> > >
> > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > >
> > > All the others are fake locks that the workqueue adds, which only exist in
> > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > >
> > > thread A:
> > > 1. mutex_lock(mutex_A)
> > > 2. flush_work(work_A)
> > >
> > > thread B
> > > 1. running work_A:
> > > 2. mutex_lock(mutex_A)
> > >
> > > thread B can't continue becuase mutex_A is already held by thread A.
> > > thread A can't continue because thread B is blocked and the work never
> > > finishes.
> > > -> deadlock
> > >
> > > I haven't checked which is which, but essentially what you measure with
> > > the hold times of these fake locks is how long a work execution takes on
> > > average.
> > >
> > > Since my patches are supposed to fix races where the worker can't keep up
> > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > so that going up is expected. I think.
> > >
> > > I'm honestly not sure what's going on here, never looking into this in
> > > detail.
> > >
> > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > KVM), the results look much more distant; basically, this patchset
> > > > increased the hold time average. Again, could you help me understand a
> > > > little bit better this issue?
> > > >
> > > > Finally, after these tests, I have some questions:
> > > >
> > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > a minimal system resource for using it?
> > >
> > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > I do think we should make the code robust against a slow cpu, since atm
> > > that's needed even on pretty fast machines (because our blending code is
> > > really, really slow and inefficient).
> > >
> > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > recommend the use of KVM?
> > >
> > > What do you mean without kvm? In general running without kvm shouldn't be
> > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > machine, by booting into new kernels (and controlling the machine over the
> > > network).
> >
> > Sorry, I should have detailed my point.
> >
> > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > did some experiments which I enabled and disabled the KVM (i.e., flag
> > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > in that context it seemed a good idea to recommend the use of KVM for
> > those users who want to test vkms with igt. Anyway, don't worry about
> > that I'll try to add more documentation for vkms in the future and in
> > that time we discuss about this again.
>
> Ah, qemu without kvm is going to use software emulation for a lot of the
> kernel stuff. That's going to be terribly slow indeed.
>
> > Anyway, from my side, I think we should merge this series. Do you want
> > to prepare a V2 with the fixes and maybe update the commit messages by
> > using some of your explanations? Or, if you want, I can fix the tiny
> > details and merge it.
>
> I'm deeply burried in my prime cleanup/doc series right now, so will take
> a few days until I get around to this again. If you want, please go ahead
> with merging.
>
> btw if you edit a patch when merging, please add a comment about that to
> the commit message. And ime it's best to only augment the commit message
> and maybe delete an unused variable or so that got forgotten. For
> everything more better to do the edits and resubmit.

First of all, thank you very much for all your reviews and
explanation. I’ll try the exercise that you proposed on Patch 1.

I’ll merge patch [4] and [7] since they’re not related to these
series. For the other patches, I’ll merge them after I finish the new
version of writeback series. I’ll ping you later.

4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1

Finally, not related with this patchset, can I apply the patch
“drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).

1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4

Thanks

> Thanks, Daniel
>
> >
> > > -Daniel
> > >
> > > > Best Regards
> > > >
> > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > the code. There's more work we can do in the future as a follow-up:
> > > > >
> > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > >
> > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > >   the various functions is probably overkill.
> > > > >
> > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > >   pixel formats, more planes, and more features in general.
> > > > >
> > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > things worse, but I didn't do a full run.
> > > > >
> > > > > Cheers, Daniel
> > > > >
> > > > > Daniel Vetter (10):
> > > > >   drm/vkms: Fix crc worker races
> > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > >   drm/vkms: Add our own commit_tail
> > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > >   drm/vkms: totally reworked crc data tracking
> > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > >
> > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > >
> > > > > --
> > > > > 2.20.1
> > > > >
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> >
> > --
> > Rodrigo Siqueira
> > https://siqueira.tech
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Daniel Vetter June 18, 2019, 10:06 p.m. UTC | #6

On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
<rodrigosiqueiramelo@gmail.com> wrote:
>
> On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > On 06/12, Daniel Vetter wrote:
> > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > Hi Daniel,
> > > > >
> > > > > First of all, thank you very much for your patchset.
> > > > >
> > > > > I tried to make a detailed review of your series, and you can see my
> > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > more about DRM from your comments.
> > > > >
> > > > > Before you go through the review, I would like to start a discussion
> > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > understand the race conditions that you described before because I
> > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > virtual machine:
> > > > >
> > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > case, it is a little bit harder to experience race conditions with
> > > > > vkms.
> > > > >
> > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > easier to experience race conditions.
> > > > >
> > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > for trying to shed light on this issue. I did:
> > > > >
> > > > > 1. Enabled lockdep
> > > > >
> > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > and without kvm. See below the QEMU hardware options:
> > > > >
> > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > >
> > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > > turn off the vm.
> > > > >
> > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > and the columns holdtime-total and holdtime-avg
> > > > >
> > > > > I would like to highlight that the following data does not have any
> > > > > statistical approach, and its intention is solely to assist our
> > > > > discussion. See below the summary of the collected data:
> > > > >
> > > > > Summary of the experiment results:
> > > > >
> > > > > +----------------+----------------+----------------+
> > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > +                +----------------+----------------+
> > > > > | Test           | Before | After | Before | After |
> > > > > +----------------+--------+-------+--------+-------+
> > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > +----------------+--------+-------+--------+-------+
> > > > >
> > > > > * Before: before apply this patchset
> > > > > * After: after apply this patchset
> > > > >
> > > > > -----------------------------------------+------------------+-----------
> > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > -----------------------------------------+----------------+-------------
> > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > -----------------------------------------+----------------+-------------
> > > > > S2: With this patchset and with kvm      |                |
> > > > > -----------------------------------------+----------------+-------------
> > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > -----------------------------------------+----------------+-------------
> > > > > M1: Without this patchset and without kvm|                |
> > > > > -----------------------------------------+----------------+-------------
> > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > -----------------------------------------+----------------+-------------
> > > > > M2: With this patchset and without kvm   |                |
> > > > > -----------------------------------------+----------------+-------------
> > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > >
> > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > specifically, I focused on these two classes:
> > > > >
> > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > >
> > > > > After taking a look at the data, it looks like that this patchset
> > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > > well the reason for the difference. Could you help me here?
> > > >
> > > > So there's two real locks here from our code, the ones you can see as
> > > > spinlocks:
> > > >
> > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > >
> > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > >
> > > > thread A:
> > > > 1. mutex_lock(mutex_A)
> > > > 2. flush_work(work_A)
> > > >
> > > > thread B
> > > > 1. running work_A:
> > > > 2. mutex_lock(mutex_A)
> > > >
> > > > thread B can't continue becuase mutex_A is already held by thread A.
> > > > thread A can't continue because thread B is blocked and the work never
> > > > finishes.
> > > > -> deadlock
> > > >
> > > > I haven't checked which is which, but essentially what you measure with
> > > > the hold times of these fake locks is how long a work execution takes on
> > > > average.
> > > >
> > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > > so that going up is expected. I think.
> > > >
> > > > I'm honestly not sure what's going on here, never looking into this in
> > > > detail.
> > > >
> > > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > > KVM), the results look much more distant; basically, this patchset
> > > > > increased the hold time average. Again, could you help me understand a
> > > > > little bit better this issue?
> > > > >
> > > > > Finally, after these tests, I have some questions:
> > > > >
> > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > a minimal system resource for using it?
> > > >
> > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > I do think we should make the code robust against a slow cpu, since atm
> > > > that's needed even on pretty fast machines (because our blending code is
> > > > really, really slow and inefficient).
> > > >
> > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > recommend the use of KVM?
> > > >
> > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > machine, by booting into new kernels (and controlling the machine over the
> > > > network).
> > >
> > > Sorry, I should have detailed my point.
> > >
> > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > > did some experiments which I enabled and disabled the KVM (i.e., flag
> > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > in that context it seemed a good idea to recommend the use of KVM for
> > > those users who want to test vkms with igt. Anyway, don't worry about
> > > that I'll try to add more documentation for vkms in the future and in
> > > that time we discuss about this again.
> >
> > Ah, qemu without kvm is going to use software emulation for a lot of the
> > kernel stuff. That's going to be terribly slow indeed.
> >
> > > Anyway, from my side, I think we should merge this series. Do you want
> > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > using some of your explanations? Or, if you want, I can fix the tiny
> > > details and merge it.
> >
> > I'm deeply burried in my prime cleanup/doc series right now, so will take
> > a few days until I get around to this again. If you want, please go ahead
> > with merging.
> >
> > btw if you edit a patch when merging, please add a comment about that to
> > the commit message. And ime it's best to only augment the commit message
> > and maybe delete an unused variable or so that got forgotten. For
> > everything more better to do the edits and resubmit.
>
> First of all, thank you very much for all your reviews and
> explanation. I’ll try the exercise that you proposed on Patch 1.
>
> I’ll merge patch [4] and [7] since they’re not related to these
> series. For the other patches, I’ll merge them after I finish the new
> version of writeback series. I’ll ping you later.
>
> 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$

Can you merge them quicker? I have another 3 vkms patches here
touching that area with some follow-up stuff ...

> Finally, not related with this patchset, can I apply the patch
> “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
>
> 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4

If you want get some acks from igt maintainers (those patches landed
now, right), but this is good enough.
-Daniel


> Thanks
>
> > Thanks, Daniel
> >
> > >
> > > > -Daniel
> > > >
> > > > > Best Regards
> > > > >
> > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > >
> > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > >
> > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > >   the various functions is probably overkill.
> > > > > >
> > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > >   pixel formats, more planes, and more features in general.
> > > > > >
> > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > things worse, but I didn't do a full run.
> > > > > >
> > > > > > Cheers, Daniel
> > > > > >
> > > > > > Daniel Vetter (10):
> > > > > >   drm/vkms: Fix crc worker races
> > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > >   drm/vkms: Add our own commit_tail
> > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > >
> > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.20.1
> > > > > >
> > > > > > _______________________________________________
> > > > > > dri-devel mailing list
> > > > > > dri-devel@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Rodrigo Siqueira
> > > > > https://siqueira.tech
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > >
> > > --
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
>
>
> --
>
> Rodrigo Siqueira
> https://siqueira.tech

Daniel Vetter June 18, 2019, 10:07 p.m. UTC | #7

On Wed, Jun 19, 2019 at 12:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > On 06/12, Daniel Vetter wrote:
> > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > Hi Daniel,
> > > > > >
> > > > > > First of all, thank you very much for your patchset.
> > > > > >
> > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > more about DRM from your comments.
> > > > > >
> > > > > > Before you go through the review, I would like to start a discussion
> > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > understand the race conditions that you described before because I
> > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > virtual machine:
> > > > > >
> > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > vkms.
> > > > > >
> > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > easier to experience race conditions.
> > > > > >
> > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > for trying to shed light on this issue. I did:
> > > > > >
> > > > > > 1. Enabled lockdep
> > > > > >
> > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > and without kvm. See below the QEMU hardware options:
> > > > > >
> > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > >
> > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > > > turn off the vm.
> > > > > >
> > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > > and the columns holdtime-total and holdtime-avg
> > > > > >
> > > > > > I would like to highlight that the following data does not have any
> > > > > > statistical approach, and its intention is solely to assist our
> > > > > > discussion. See below the summary of the collected data:
> > > > > >
> > > > > > Summary of the experiment results:
> > > > > >
> > > > > > +----------------+----------------+----------------+
> > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > +                +----------------+----------------+
> > > > > > | Test           | Before | After | Before | After |
> > > > > > +----------------+--------+-------+--------+-------+
> > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > +----------------+--------+-------+--------+-------+
> > > > > >
> > > > > > * Before: before apply this patchset
> > > > > > * After: after apply this patchset
> > > > > >
> > > > > > -----------------------------------------+------------------+-----------
> > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > S2: With this patchset and with kvm      |                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > M1: Without this patchset and without kvm|                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > M2: With this patchset and without kvm   |                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > >
> > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > specifically, I focused on these two classes:
> > > > > >
> > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > >
> > > > > > After taking a look at the data, it looks like that this patchset
> > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > > > well the reason for the difference. Could you help me here?
> > > > >
> > > > > So there's two real locks here from our code, the ones you can see as
> > > > > spinlocks:
> > > > >
> > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > >
> > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > >
> > > > > thread A:
> > > > > 1. mutex_lock(mutex_A)
> > > > > 2. flush_work(work_A)
> > > > >
> > > > > thread B
> > > > > 1. running work_A:
> > > > > 2. mutex_lock(mutex_A)
> > > > >
> > > > > thread B can't continue becuase mutex_A is already held by thread A.
> > > > > thread A can't continue because thread B is blocked and the work never
> > > > > finishes.
> > > > > -> deadlock
> > > > >
> > > > > I haven't checked which is which, but essentially what you measure with
> > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > average.
> > > > >
> > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > > > so that going up is expected. I think.
> > > > >
> > > > > I'm honestly not sure what's going on here, never looking into this in
> > > > > detail.
> > > > >
> > > > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > increased the hold time average. Again, could you help me understand a
> > > > > > little bit better this issue?
> > > > > >
> > > > > > Finally, after these tests, I have some questions:
> > > > > >
> > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > a minimal system resource for using it?
> > > > >
> > > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > really, really slow and inefficient).
> > > > >
> > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > > recommend the use of KVM?
> > > > >
> > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > network).
> > > >
> > > > Sorry, I should have detailed my point.
> > > >
> > > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > > > did some experiments which I enabled and disabled the KVM (i.e., flag
> > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > that I'll try to add more documentation for vkms in the future and in
> > > > that time we discuss about this again.
> > >
> > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > kernel stuff. That's going to be terribly slow indeed.
> > >
> > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > details and merge it.
> > >
> > > I'm deeply burried in my prime cleanup/doc series right now, so will take
> > > a few days until I get around to this again. If you want, please go ahead
> > > with merging.
> > >
> > > btw if you edit a patch when merging, please add a comment about that to
> > > the commit message. And ime it's best to only augment the commit message
> > > and maybe delete an unused variable or so that got forgotten. For
> > > everything more better to do the edits and resubmit.
> >
> > First of all, thank you very much for all your reviews and
> > explanation. I’ll try the exercise that you proposed on Patch 1.
> >
> > I’ll merge patch [4] and [7] since they’re not related to these
> > series. For the other patches, I’ll merge them after I finish the new
> > version of writeback series. I’ll ping you later.
> >
> > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$
>
> Can you merge them quicker? I have another 3 vkms patches here
> touching that area with some follow-up stuff ...
>
> > Finally, not related with this patchset, can I apply the patch
> > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> >
> > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
>
> If you want get some acks from igt maintainers (those patches landed
> now, right), but this is good enough.

Oh wait correction: My review is conditional on you changing that one
thing. So needs another version. Since this is a functional change imo
too much to fix up while applying.
-Daniel

> -Daniel
>
>
> > Thanks
> >
> > > Thanks, Daniel
> > >
> > > >
> > > > > -Daniel
> > > > >
> > > > > > Best Regards
> > > > > >
> > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > >
> > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > >
> > > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > > >   the various functions is probably overkill.
> > > > > > >
> > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > >
> > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > things worse, but I didn't do a full run.
> > > > > > >
> > > > > > > Cheers, Daniel
> > > > > > >
> > > > > > > Daniel Vetter (10):
> > > > > > >   drm/vkms: Fix crc worker races
> > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > >
> > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > >
> > > > > > > --
> > > > > > > 2.20.1
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > dri-devel mailing list
> > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Rodrigo Siqueira
> > > > > > https://siqueira.tech
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > > --
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> >
> >
> >
> > --
> >
> > Rodrigo Siqueira
> > https://siqueira.tech
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Rodrigo Siqueira June 18, 2019, 10:25 p.m. UTC | #8

On Tue, Jun 18, 2019 at 7:08 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Jun 19, 2019 at 12:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > <rodrigosiqueiramelo@gmail.com> wrote:
> > >
> > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > > On 06/12, Daniel Vetter wrote:
> > > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > > Hi Daniel,
> > > > > > >
> > > > > > > First of all, thank you very much for your patchset.
> > > > > > >
> > > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > > more about DRM from your comments.
> > > > > > >
> > > > > > > Before you go through the review, I would like to start a discussion
> > > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > > understand the race conditions that you described before because I
> > > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > > virtual machine:
> > > > > > >
> > > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > > vkms.
> > > > > > >
> > > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > > easier to experience race conditions.
> > > > > > >
> > > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > > for trying to shed light on this issue. I did:
> > > > > > >
> > > > > > > 1. Enabled lockdep
> > > > > > >
> > > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > > and without kvm. See below the QEMU hardware options:
> > > > > > >
> > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > > >
> > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > > > > turn off the vm.
> > > > > > >
> > > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > > > and the columns holdtime-total and holdtime-avg
> > > > > > >
> > > > > > > I would like to highlight that the following data does not have any
> > > > > > > statistical approach, and its intention is solely to assist our
> > > > > > > discussion. See below the summary of the collected data:
> > > > > > >
> > > > > > > Summary of the experiment results:
> > > > > > >
> > > > > > > +----------------+----------------+----------------+
> > > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > > +                +----------------+----------------+
> > > > > > > | Test           | Before | After | Before | After |
> > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > >
> > > > > > > * Before: before apply this patchset
> > > > > > > * After: after apply this patchset
> > > > > > >
> > > > > > > -----------------------------------------+------------------+-----------
> > > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > S2: With this patchset and with kvm      |                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > M1: Without this patchset and without kvm|                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > M2: With this patchset and without kvm   |                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > > >
> > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > > specifically, I focused on these two classes:
> > > > > > >
> > > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > > >
> > > > > > > After taking a look at the data, it looks like that this patchset
> > > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > > > > well the reason for the difference. Could you help me here?
> > > > > >
> > > > > > So there's two real locks here from our code, the ones you can see as
> > > > > > spinlocks:
> > > > > >
> > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > >
> > > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > > >
> > > > > > thread A:
> > > > > > 1. mutex_lock(mutex_A)
> > > > > > 2. flush_work(work_A)
> > > > > >
> > > > > > thread B
> > > > > > 1. running work_A:
> > > > > > 2. mutex_lock(mutex_A)
> > > > > >
> > > > > > thread B can't continue becuase mutex_A is already held by thread A.
> > > > > > thread A can't continue because thread B is blocked and the work never
> > > > > > finishes.
> > > > > > -> deadlock
> > > > > >
> > > > > > I haven't checked which is which, but essentially what you measure with
> > > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > > average.
> > > > > >
> > > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > > > > so that going up is expected. I think.
> > > > > >
> > > > > > I'm honestly not sure what's going on here, never looking into this in
> > > > > > detail.
> > > > > >
> > > > > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > > increased the hold time average. Again, could you help me understand a
> > > > > > > little bit better this issue?
> > > > > > >
> > > > > > > Finally, after these tests, I have some questions:
> > > > > > >
> > > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > > a minimal system resource for using it?
> > > > > >
> > > > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > > really, really slow and inefficient).
> > > > > >
> > > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > > > recommend the use of KVM?
> > > > > >
> > > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > > network).
> > > > >
> > > > > Sorry, I should have detailed my point.
> > > > >
> > > > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > > > > did some experiments which I enabled and disabled the KVM (i.e., flag
> > > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > > that I'll try to add more documentation for vkms in the future and in
> > > > > that time we discuss about this again.
> > > >
> > > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > > kernel stuff. That's going to be terribly slow indeed.
> > > >
> > > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > > details and merge it.
> > > >
> > > > I'm deeply burried in my prime cleanup/doc series right now, so will take
> > > > a few days until I get around to this again. If you want, please go ahead
> > > > with merging.
> > > >
> > > > btw if you edit a patch when merging, please add a comment about that to
> > > > the commit message. And ime it's best to only augment the commit message
> > > > and maybe delete an unused variable or so that got forgotten. For
> > > > everything more better to do the edits and resubmit.
> > >
> > > First of all, thank you very much for all your reviews and
> > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > >
> > > I’ll merge patch [4] and [7] since they’re not related to these
> > > series. For the other patches, I’ll merge them after I finish the new
> > > version of writeback series. I’ll ping you later.
> > >
> > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$
> >
> > Can you merge them quicker? I have another 3 vkms patches here
> > touching that area with some follow-up stuff ...

Do you mean patch 4 and 7 right? I cannot merge it right now, but I
can merge it tonight; however, I'm fine if you want to merge it.

> > > Finally, not related with this patchset, can I apply the patch
> > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> > >
> > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> >
> > If you want get some acks from igt maintainers (those patches landed
> > now, right), but this is good enough.
>
> Oh wait correction: My review is conditional on you changing that one
> thing. So needs another version. Since this is a functional change imo
> too much to fix up while applying.

In your comment you said:

  >   if (vblwait->request.type & _DRM_VBLANK_SIGNAL)
  > - return -EINVAL;
  > + return -EOPNOTSUPP;

  Not sure we want EINVAL here, that's kinda a "parameters are wrong"
  version too. With that changed:

I think I did not got your point here, sorry for that... so, do you
want that I change EOPNOTSUPP by EINVAL in the above code?

> -Daniel
>
> > -Daniel
> >
> >
> > > Thanks
> > >
> > > > Thanks, Daniel
> > > >
> > > > >
> > > > > > -Daniel
> > > > > >
> > > > > > > Best Regards
> > > > > > >
> > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > > >
> > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > > >
> > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > > > >   the various functions is probably overkill.
> > > > > > > >
> > > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > > >
> > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > > things worse, but I didn't do a full run.
> > > > > > > >
> > > > > > > > Cheers, Daniel
> > > > > > > >
> > > > > > > > Daniel Vetter (10):
> > > > > > > >   drm/vkms: Fix crc worker races
> > > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > > >
> > > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.20.1
> > > > > > > >
> > > > > > > > _______________________________________________
> > > > > > > > dri-devel mailing list
> > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Rodrigo Siqueira
> > > > > > > https://siqueira.tech
> > > > > >
> > > > > > --
> > > > > > Daniel Vetter
> > > > > > Software Engineer, Intel Corporation
> > > > > > http://blog.ffwll.ch
> > > > >
> > > > > --
> > > > > Rodrigo Siqueira
> > > > > https://siqueira.tech
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > >
> > >
> > >
> > > --
> > >
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Daniel Vetter June 18, 2019, 10:39 p.m. UTC | #9

On Wed, Jun 19, 2019 at 12:25 AM Rodrigo Siqueira
<rodrigosiqueiramelo@gmail.com> wrote:
>
> On Tue, Jun 18, 2019 at 7:08 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Wed, Jun 19, 2019 at 12:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > > <rodrigosiqueiramelo@gmail.com> wrote:
> > > > Finally, not related with this patchset, can I apply the patch
> > > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> > > >
> > > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> > >
> > > If you want get some acks from igt maintainers (those patches landed
> > > now, right), but this is good enough.
> >
> > Oh wait correction: My review is conditional on you changing that one
> > thing. So needs another version. Since this is a functional change imo
> > too much to fix up while applying.
>
> In your comment you said:
>
>   >   if (vblwait->request.type & _DRM_VBLANK_SIGNAL)
>   > - return -EINVAL;
>   > + return -EOPNOTSUPP;
>
>   Not sure we want EINVAL here, that's kinda a "parameters are wrong"
>   version too. With that changed:
>
> I think I did not got your point here, sorry for that... so, do you
> want that I change EOPNOTSUPP by EINVAL in the above code?

Oops, that was wrong. I meant to say that I don't see why we should
use EOPNOTSUPP here, the EINVAL indicating a wrong argument seems more
fitting to me. It's been pretty much forever (if we every supported
this) that vblank signal worked on linux. Ok, did a quick check, it
died in 2009. That's before the kms stuff landed, there's definitely
no userspace around anymore that ever expected this to work :-) Hence
why I think EINVAL is more fitting ...
-Daniel

Rodrigo Siqueira June 26, 2019, 1:44 a.m. UTC | #10

On 06/19, Daniel Vetter wrote:
> On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > On 06/12, Daniel Vetter wrote:
> > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > Hi Daniel,
> > > > > >
> > > > > > First of all, thank you very much for your patchset.
> > > > > >
> > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > more about DRM from your comments.
> > > > > >
> > > > > > Before you go through the review, I would like to start a discussion
> > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > understand the race conditions that you described before because I
> > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > virtual machine:
> > > > > >
> > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > vkms.
> > > > > >
> > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > easier to experience race conditions.
> > > > > >
> > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > for trying to shed light on this issue. I did:
> > > > > >
> > > > > > 1. Enabled lockdep
> > > > > >
> > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > and without kvm. See below the QEMU hardware options:
> > > > > >
> > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > >
> > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > > > turn off the vm.
> > > > > >
> > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > > and the columns holdtime-total and holdtime-avg
> > > > > >
> > > > > > I would like to highlight that the following data does not have any
> > > > > > statistical approach, and its intention is solely to assist our
> > > > > > discussion. See below the summary of the collected data:
> > > > > >
> > > > > > Summary of the experiment results:
> > > > > >
> > > > > > +----------------+----------------+----------------+
> > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > +                +----------------+----------------+
> > > > > > | Test           | Before | After | Before | After |
> > > > > > +----------------+--------+-------+--------+-------+
> > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > +----------------+--------+-------+--------+-------+
> > > > > >
> > > > > > * Before: before apply this patchset
> > > > > > * After: after apply this patchset
> > > > > >
> > > > > > -----------------------------------------+------------------+-----------
> > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > S2: With this patchset and with kvm      |                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > M1: Without this patchset and without kvm|                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > M2: With this patchset and without kvm   |                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > >
> > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > specifically, I focused on these two classes:
> > > > > >
> > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > >
> > > > > > After taking a look at the data, it looks like that this patchset
> > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > > > well the reason for the difference. Could you help me here?
> > > > >
> > > > > So there's two real locks here from our code, the ones you can see as
> > > > > spinlocks:
> > > > >
> > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > >
> > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > >
> > > > > thread A:
> > > > > 1. mutex_lock(mutex_A)
> > > > > 2. flush_work(work_A)
> > > > >
> > > > > thread B
> > > > > 1. running work_A:
> > > > > 2. mutex_lock(mutex_A)
> > > > >
> > > > > thread B can't continue becuase mutex_A is already held by thread A.
> > > > > thread A can't continue because thread B is blocked and the work never
> > > > > finishes.
> > > > > -> deadlock
> > > > >
> > > > > I haven't checked which is which, but essentially what you measure with
> > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > average.
> > > > >
> > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > > > so that going up is expected. I think.
> > > > >
> > > > > I'm honestly not sure what's going on here, never looking into this in
> > > > > detail.
> > > > >
> > > > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > increased the hold time average. Again, could you help me understand a
> > > > > > little bit better this issue?
> > > > > >
> > > > > > Finally, after these tests, I have some questions:
> > > > > >
> > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > a minimal system resource for using it?
> > > > >
> > > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > really, really slow and inefficient).
> > > > >
> > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > > recommend the use of KVM?
> > > > >
> > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > network).
> > > >
> > > > Sorry, I should have detailed my point.
> > > >
> > > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > > > did some experiments which I enabled and disabled the KVM (i.e., flag
> > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > that I'll try to add more documentation for vkms in the future and in
> > > > that time we discuss about this again.
> > >
> > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > kernel stuff. That's going to be terribly slow indeed.
> > >
> > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > details and merge it.
> > >
> > > I'm deeply burried in my prime cleanup/doc series right now, so will take
> > > a few days until I get around to this again. If you want, please go ahead
> > > with merging.
> > >
> > > btw if you edit a patch when merging, please add a comment about that to
> > > the commit message. And ime it's best to only augment the commit message
> > > and maybe delete an unused variable or so that got forgotten. For
> > > everything more better to do the edits and resubmit.
> >
> > First of all, thank you very much for all your reviews and
> > explanation. I’ll try the exercise that you proposed on Patch 1.
> >
> > I’ll merge patch [4] and [7] since they’re not related to these
> > series. For the other patches, I’ll merge them after I finish the new
> > version of writeback series. I’ll ping you later.
> >

Hi,

I already sent the new version of the writeback patchset. So, how do you
want to proceed with this series? Do you prefer to send a V2 or should I
apply the patchset and make the tiny fixes?

Thanks

> > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$
> 
> Can you merge them quicker? I have another 3 vkms patches here
> touching that area with some follow-up stuff ...
> 
> > Finally, not related with this patchset, can I apply the patch
> > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> >
> > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> 
> If you want get some acks from igt maintainers (those patches landed
> now, right), but this is good enough.
> -Daniel
> 
> 
> > Thanks
> >
> > > Thanks, Daniel
> > >
> > > >
> > > > > -Daniel
> > > > >
> > > > > > Best Regards
> > > > > >
> > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > >
> > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > >
> > > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > > >   the various functions is probably overkill.
> > > > > > >
> > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > >
> > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > things worse, but I didn't do a full run.
> > > > > > >
> > > > > > > Cheers, Daniel
> > > > > > >
> > > > > > > Daniel Vetter (10):
> > > > > > >   drm/vkms: Fix crc worker races
> > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > >
> > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > >
> > > > > > > --
> > > > > > > 2.20.1
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > dri-devel mailing list
> > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Rodrigo Siqueira
> > > > > > https://siqueira.tech
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > > --
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> >
> >
> >
> > --
> >
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Daniel Vetter June 26, 2019, 7:54 a.m. UTC | #11

On Wed, Jun 26, 2019 at 3:44 AM Rodrigo Siqueira
<rodrigosiqueiramelo@gmail.com> wrote:
>
> On 06/19, Daniel Vetter wrote:
> > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > <rodrigosiqueiramelo@gmail.com> wrote:
> > >
> > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > > On 06/12, Daniel Vetter wrote:
> > > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > > Hi Daniel,
> > > > > > >
> > > > > > > First of all, thank you very much for your patchset.
> > > > > > >
> > > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > > more about DRM from your comments.
> > > > > > >
> > > > > > > Before you go through the review, I would like to start a discussion
> > > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > > understand the race conditions that you described before because I
> > > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > > virtual machine:
> > > > > > >
> > > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > > vkms.
> > > > > > >
> > > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > > easier to experience race conditions.
> > > > > > >
> > > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > > for trying to shed light on this issue. I did:
> > > > > > >
> > > > > > > 1. Enabled lockdep
> > > > > > >
> > > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > > and without kvm. See below the QEMU hardware options:
> > > > > > >
> > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > > >
> > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > > > > turn off the vm.
> > > > > > >
> > > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > > > and the columns holdtime-total and holdtime-avg
> > > > > > >
> > > > > > > I would like to highlight that the following data does not have any
> > > > > > > statistical approach, and its intention is solely to assist our
> > > > > > > discussion. See below the summary of the collected data:
> > > > > > >
> > > > > > > Summary of the experiment results:
> > > > > > >
> > > > > > > +----------------+----------------+----------------+
> > > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > > +                +----------------+----------------+
> > > > > > > | Test           | Before | After | Before | After |
> > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > >
> > > > > > > * Before: before apply this patchset
> > > > > > > * After: after apply this patchset
> > > > > > >
> > > > > > > -----------------------------------------+------------------+-----------
> > > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > S2: With this patchset and with kvm      |                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > M1: Without this patchset and without kvm|                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > M2: With this patchset and without kvm   |                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > > >
> > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > > specifically, I focused on these two classes:
> > > > > > >
> > > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > > >
> > > > > > > After taking a look at the data, it looks like that this patchset
> > > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > > > > well the reason for the difference. Could you help me here?
> > > > > >
> > > > > > So there's two real locks here from our code, the ones you can see as
> > > > > > spinlocks:
> > > > > >
> > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > >
> > > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > > >
> > > > > > thread A:
> > > > > > 1. mutex_lock(mutex_A)
> > > > > > 2. flush_work(work_A)
> > > > > >
> > > > > > thread B
> > > > > > 1. running work_A:
> > > > > > 2. mutex_lock(mutex_A)
> > > > > >
> > > > > > thread B can't continue becuase mutex_A is already held by thread A.
> > > > > > thread A can't continue because thread B is blocked and the work never
> > > > > > finishes.
> > > > > > -> deadlock
> > > > > >
> > > > > > I haven't checked which is which, but essentially what you measure with
> > > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > > average.
> > > > > >
> > > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > > > > so that going up is expected. I think.
> > > > > >
> > > > > > I'm honestly not sure what's going on here, never looking into this in
> > > > > > detail.
> > > > > >
> > > > > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > > increased the hold time average. Again, could you help me understand a
> > > > > > > little bit better this issue?
> > > > > > >
> > > > > > > Finally, after these tests, I have some questions:
> > > > > > >
> > > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > > a minimal system resource for using it?
> > > > > >
> > > > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > > really, really slow and inefficient).
> > > > > >
> > > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > > > recommend the use of KVM?
> > > > > >
> > > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > > network).
> > > > >
> > > > > Sorry, I should have detailed my point.
> > > > >
> > > > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > > > > did some experiments which I enabled and disabled the KVM (i.e., flag
> > > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > > that I'll try to add more documentation for vkms in the future and in
> > > > > that time we discuss about this again.
> > > >
> > > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > > kernel stuff. That's going to be terribly slow indeed.
> > > >
> > > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > > details and merge it.
> > > >
> > > > I'm deeply burried in my prime cleanup/doc series right now, so will take
> > > > a few days until I get around to this again. If you want, please go ahead
> > > > with merging.
> > > >
> > > > btw if you edit a patch when merging, please add a comment about that to
> > > > the commit message. And ime it's best to only augment the commit message
> > > > and maybe delete an unused variable or so that got forgotten. For
> > > > everything more better to do the edits and resubmit.
> > >
> > > First of all, thank you very much for all your reviews and
> > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > >
> > > I’ll merge patch [4] and [7] since they’re not related to these
> > > series. For the other patches, I’ll merge them after I finish the new
> > > version of writeback series. I’ll ping you later.
> > >
>
> Hi,
>
> I already sent the new version of the writeback patchset. So, how do you
> want to proceed with this series? Do you prefer to send a V2 or should I
> apply the patchset and make the tiny fixes?

I don't have r-b tags from you (usually if it's just nits then you can
do "r-b with those tiny details addressed"), so I can't merge. I
thought you wanted to merge them, which I've been waiting for you to
do ... so that's why they're stuck right now. Just decide and we move
them, up to you really.
-Daniel

>
> Thanks
>
> > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$
> >
> > Can you merge them quicker? I have another 3 vkms patches here
> > touching that area with some follow-up stuff ...
> >
> > > Finally, not related with this patchset, can I apply the patch
> > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> > >
> > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> >
> > If you want get some acks from igt maintainers (those patches landed
> > now, right), but this is good enough.
> > -Daniel
> >
> >
> > > Thanks
> > >
> > > > Thanks, Daniel
> > > >
> > > > >
> > > > > > -Daniel
> > > > > >
> > > > > > > Best Regards
> > > > > > >
> > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > > >
> > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > > >
> > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > > > >   the various functions is probably overkill.
> > > > > > > >
> > > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > > >
> > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > > things worse, but I didn't do a full run.
> > > > > > > >
> > > > > > > > Cheers, Daniel
> > > > > > > >
> > > > > > > > Daniel Vetter (10):
> > > > > > > >   drm/vkms: Fix crc worker races
> > > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > > >
> > > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.20.1
> > > > > > > >
> > > > > > > > _______________________________________________
> > > > > > > > dri-devel mailing list
> > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Rodrigo Siqueira
> > > > > > > https://siqueira.tech
> > > > > >
> > > > > > --
> > > > > > Daniel Vetter
> > > > > > Software Engineer, Intel Corporation
> > > > > > http://blog.ffwll.ch
> > > > >
> > > > > --
> > > > > Rodrigo Siqueira
> > > > > https://siqueira.tech
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > >
> > >
> > >
> > > --
> > >
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
> --
> Rodrigo Siqueira
> https://siqueira.tech

Rodrigo Siqueira June 26, 2019, 1:46 p.m. UTC | #12

On Wed, Jun 26, 2019 at 4:55 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Jun 26, 2019 at 3:44 AM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > On 06/19, Daniel Vetter wrote:
> > > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > > <rodrigosiqueiramelo@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > > > On 06/12, Daniel Vetter wrote:
> > > > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > > > Hi Daniel,
> > > > > > > >
> > > > > > > > First of all, thank you very much for your patchset.
> > > > > > > >
> > > > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > > > more about DRM from your comments.
> > > > > > > >
> > > > > > > > Before you go through the review, I would like to start a discussion
> > > > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > > > understand the race conditions that you described before because I
> > > > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > > > virtual machine:
> > > > > > > >
> > > > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > > > vkms.
> > > > > > > >
> > > > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > > > easier to experience race conditions.
> > > > > > > >
> > > > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > > > for trying to shed light on this issue. I did:
> > > > > > > >
> > > > > > > > 1. Enabled lockdep
> > > > > > > >
> > > > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > > > and without kvm. See below the QEMU hardware options:
> > > > > > > >
> > > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > > > >
> > > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > > > > > turn off the vm.
> > > > > > > >
> > > > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > > > > and the columns holdtime-total and holdtime-avg
> > > > > > > >
> > > > > > > > I would like to highlight that the following data does not have any
> > > > > > > > statistical approach, and its intention is solely to assist our
> > > > > > > > discussion. See below the summary of the collected data:
> > > > > > > >
> > > > > > > > Summary of the experiment results:
> > > > > > > >
> > > > > > > > +----------------+----------------+----------------+
> > > > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > > > +                +----------------+----------------+
> > > > > > > > | Test           | Before | After | Before | After |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > >
> > > > > > > > * Before: before apply this patchset
> > > > > > > > * After: after apply this patchset
> > > > > > > >
> > > > > > > > -----------------------------------------+------------------+-----------
> > > > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > S2: With this patchset and with kvm      |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M1: Without this patchset and without kvm|                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M2: With this patchset and without kvm   |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > > > >
> > > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > > > specifically, I focused on these two classes:
> > > > > > > >
> > > > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > > > >
> > > > > > > > After taking a look at the data, it looks like that this patchset
> > > > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > > > > > well the reason for the difference. Could you help me here?
> > > > > > >
> > > > > > > So there's two real locks here from our code, the ones you can see as
> > > > > > > spinlocks:
> > > > > > >
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > >
> > > > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > > > >
> > > > > > > thread A:
> > > > > > > 1. mutex_lock(mutex_A)
> > > > > > > 2. flush_work(work_A)
> > > > > > >
> > > > > > > thread B
> > > > > > > 1. running work_A:
> > > > > > > 2. mutex_lock(mutex_A)
> > > > > > >
> > > > > > > thread B can't continue becuase mutex_A is already held by thread A.
> > > > > > > thread A can't continue because thread B is blocked and the work never
> > > > > > > finishes.
> > > > > > > -> deadlock
> > > > > > >
> > > > > > > I haven't checked which is which, but essentially what you measure with
> > > > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > > > average.
> > > > > > >
> > > > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > > > > > so that going up is expected. I think.
> > > > > > >
> > > > > > > I'm honestly not sure what's going on here, never looking into this in
> > > > > > > detail.
> > > > > > >
> > > > > > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > > > increased the hold time average. Again, could you help me understand a
> > > > > > > > little bit better this issue?
> > > > > > > >
> > > > > > > > Finally, after these tests, I have some questions:
> > > > > > > >
> > > > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > > > a minimal system resource for using it?
> > > > > > >
> > > > > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > > > really, really slow and inefficient).
> > > > > > >
> > > > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > > > > recommend the use of KVM?
> > > > > > >
> > > > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > > > network).
> > > > > >
> > > > > > Sorry, I should have detailed my point.
> > > > > >
> > > > > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > > > > > did some experiments which I enabled and disabled the KVM (i.e., flag
> > > > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > > > that I'll try to add more documentation for vkms in the future and in
> > > > > > that time we discuss about this again.
> > > > >
> > > > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > > > kernel stuff. That's going to be terribly slow indeed.
> > > > >
> > > > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > > > details and merge it.
> > > > >
> > > > > I'm deeply burried in my prime cleanup/doc series right now, so will take
> > > > > a few days until I get around to this again. If you want, please go ahead
> > > > > with merging.
> > > > >
> > > > > btw if you edit a patch when merging, please add a comment about that to
> > > > > the commit message. And ime it's best to only augment the commit message
> > > > > and maybe delete an unused variable or so that got forgotten. For
> > > > > everything more better to do the edits and resubmit.
> > > >
> > > > First of all, thank you very much for all your reviews and
> > > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > > >
> > > > I’ll merge patch [4] and [7] since they’re not related to these
> > > > series. For the other patches, I’ll merge them after I finish the new
> > > > version of writeback series. I’ll ping you later.
> > > >
> >
> > Hi,
> >
> > I already sent the new version of the writeback patchset. So, how do you
> > want to proceed with this series? Do you prefer to send a V2 or should I
> > apply the patchset and make the tiny fixes?
>
> I don't have r-b tags from you (usually if it's just nits then you can
> do "r-b with those tiny details addressed"), so I can't merge. I
> thought you wanted to merge them, which I've been waiting for you to
> do ... so that's why they're stuck right now. Just decide and we move
> them, up to you really.
> -Daniel

Sorry for forgotten to add my the r-b and t-b, I missed it due to my
questions. Also, sorry for not merge it yet... I was focused on fixing
the writeback implementation in vkms.

Anyway, I'll merge the patchset tonight and make the tiny fixes in the
code. I'll try to be online on irc for checking some small details
with you.

> >
> > Thanks
> >
> > > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$
> > >
> > > Can you merge them quicker? I have another 3 vkms patches here
> > > touching that area with some follow-up stuff ...
> > >
> > > > Finally, not related with this patchset, can I apply the patch
> > > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> > > >
> > > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> > >
> > > If you want get some acks from igt maintainers (those patches landed
> > > now, right), but this is good enough.
> > > -Daniel
> > >
> > >
> > > > Thanks
> > > >
> > > > > Thanks, Daniel
> > > > >
> > > > > >
> > > > > > > -Daniel
> > > > > > >
> > > > > > > > Best Regards
> > > > > > > >
> > > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > > > >
> > > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > > > >
> > > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > > > > >   the various functions is probably overkill.
> > > > > > > > >
> > > > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > > > >
> > > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > > > things worse, but I didn't do a full run.
> > > > > > > > >
> > > > > > > > > Cheers, Daniel
> > > > > > > > >
> > > > > > > > > Daniel Vetter (10):
> > > > > > > > >   drm/vkms: Fix crc worker races
> > > > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > > > >
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.20.1
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > dri-devel mailing list
> > > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Rodrigo Siqueira
> > > > > > > > https://siqueira.tech
> > > > > > >
> > > > > > > --
> > > > > > > Daniel Vetter
> > > > > > > Software Engineer, Intel Corporation
> > > > > > > http://blog.ffwll.ch
> > > > > >
> > > > > > --
> > > > > > Rodrigo Siqueira
> > > > > > https://siqueira.tech
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >
> > --
> > Rodrigo Siqueira
> > https://siqueira.tech
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Rodrigo Siqueira July 1, 2019, 3:30 a.m. UTC | #13

On 06/26, Daniel Vetter wrote:
> On Wed, Jun 26, 2019 at 3:44 AM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > On 06/19, Daniel Vetter wrote:
> > > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > > <rodrigosiqueiramelo@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > > > On 06/12, Daniel Vetter wrote:
> > > > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > > > Hi Daniel,
> > > > > > > >
> > > > > > > > First of all, thank you very much for your patchset.
> > > > > > > >
> > > > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > > > more about DRM from your comments.
> > > > > > > >
> > > > > > > > Before you go through the review, I would like to start a discussion
> > > > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > > > understand the race conditions that you described before because I
> > > > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > > > virtual machine:
> > > > > > > >
> > > > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > > > vkms.
> > > > > > > >
> > > > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > > > easier to experience race conditions.
> > > > > > > >
> > > > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > > > for trying to shed light on this issue. I did:
> > > > > > > >
> > > > > > > > 1. Enabled lockdep
> > > > > > > >
> > > > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > > > and without kvm. See below the QEMU hardware options:
> > > > > > > >
> > > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > > > >
> > > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV)
> > > > > > > > turn off the vm.
> > > > > > > >
> > > > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > > > > and the columns holdtime-total and holdtime-avg
> > > > > > > >
> > > > > > > > I would like to highlight that the following data does not have any
> > > > > > > > statistical approach, and its intention is solely to assist our
> > > > > > > > discussion. See below the summary of the collected data:
> > > > > > > >
> > > > > > > > Summary of the experiment results:
> > > > > > > >
> > > > > > > > +----------------+----------------+----------------+
> > > > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > > > +                +----------------+----------------+
> > > > > > > > | Test           | Before | After | Before | After |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > >
> > > > > > > > * Before: before apply this patchset
> > > > > > > > * After: after apply this patchset
> > > > > > > >
> > > > > > > > -----------------------------------------+------------------+-----------
> > > > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > S2: With this patchset and with kvm      |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M1: Without this patchset and without kvm|                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M2: With this patchset and without kvm   |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > > > >
> > > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > > > specifically, I focused on these two classes:
> > > > > > > >
> > > > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > > > >
> > > > > > > > After taking a look at the data, it looks like that this patchset
> > > > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > > > it increases the hold time average for crc_#2. I didn’t understand
> > > > > > > > well the reason for the difference. Could you help me here?
> > > > > > >
> > > > > > > So there's two real locks here from our code, the ones you can see as
> > > > > > > spinlocks:
> > > > > > >
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > >
> > > > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > > > >
> > > > > > > thread A:
> > > > > > > 1. mutex_lock(mutex_A)
> > > > > > > 2. flush_work(work_A)
> > > > > > >
> > > > > > > thread B
> > > > > > > 1. running work_A:
> > > > > > > 2. mutex_lock(mutex_A)
> > > > > > >
> > > > > > > thread B can't continue becuase mutex_A is already held by thread A.
> > > > > > > thread A can't continue because thread B is blocked and the work never
> > > > > > > finishes.
> > > > > > > -> deadlock
> > > > > > >
> > > > > > > I haven't checked which is which, but essentially what you measure with
> > > > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > > > average.
> > > > > > >
> > > > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > > > with the vblank hrtimer, then the average worker will (probably) do more,
> > > > > > > so that going up is expected. I think.
> > > > > > >
> > > > > > > I'm honestly not sure what's going on here, never looking into this in
> > > > > > > detail.
> > > > > > >
> > > > > > > > When I looked for the second set of scenarios (M1 and M2, both without
> > > > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > > > increased the hold time average. Again, could you help me understand a
> > > > > > > > little bit better this issue?
> > > > > > > >
> > > > > > > > Finally, after these tests, I have some questions:
> > > > > > > >
> > > > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > > > a minimal system resource for using it?
> > > > > > >
> > > > > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > > > really, really slow and inefficient).
> > > > > > >
> > > > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > > > > recommend the use of KVM?
> > > > > > >
> > > > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > > > network).
> > > > > >
> > > > > > Sorry, I should have detailed my point.
> > > > > >
> > > > > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I
> > > > > > did some experiments which I enabled and disabled the KVM (i.e., flag
> > > > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > > > that I'll try to add more documentation for vkms in the future and in
> > > > > > that time we discuss about this again.
> > > > >
> > > > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > > > kernel stuff. That's going to be terribly slow indeed.
> > > > >
> > > > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > > > details and merge it.
> > > > >
> > > > > I'm deeply burried in my prime cleanup/doc series right now, so will take
> > > > > a few days until I get around to this again. If you want, please go ahead
> > > > > with merging.
> > > > >
> > > > > btw if you edit a patch when merging, please add a comment about that to
> > > > > the commit message. And ime it's best to only augment the commit message
> > > > > and maybe delete an unused variable or so that got forgotten. For
> > > > > everything more better to do the edits and resubmit.
> > > >
> > > > First of all, thank you very much for all your reviews and
> > > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > > >
> > > > I’ll merge patch [4] and [7] since they’re not related to these
> > > > series. For the other patches, I’ll merge them after I finish the new
> > > > version of writeback series. I’ll ping you later.
> > > >
> >
> > Hi,
> >
> > I already sent the new version of the writeback patchset. So, how do you
> > want to proceed with this series? Do you prefer to send a V2 or should I
> > apply the patchset and make the tiny fixes?
> 
> I don't have r-b tags from you (usually if it's just nits then you can
> do "r-b with those tiny details addressed"), so I can't merge. I
> thought you wanted to merge them, which I've been waiting for you to
> do ... so that's why they're stuck right now. Just decide and we move
> them, up to you really.
> -Daniel

Hi Daniel,

I think that I forgot to tell you that I already applied this series.

If you have time, could you take a look in the following two series
built on top of this patchset?

1. Writeback: https://patchwork.freedesktop.org/series/61738/
2. Configfs: https://patchwork.freedesktop.org/series/63010/

Thanks
 
> >
> > Thanks
> >
> > > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$
> > >
> > > Can you merge them quicker? I have another 3 vkms patches here
> > > touching that area with some follow-up stuff ...
> > >
> > > > Finally, not related with this patchset, can I apply the patch
> > > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> > > >
> > > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> > >
> > > If you want get some acks from igt maintainers (those patches landed
> > > now, right), but this is good enough.
> > > -Daniel
> > >
> > >
> > > > Thanks
> > > >
> > > > > Thanks, Daniel
> > > > >
> > > > > >
> > > > > > > -Daniel
> > > > > > >
> > > > > > > > Best Regards
> > > > > > > >
> > > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > > > >
> > > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > > > >
> > > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > > > > >   the various functions is probably overkill.
> > > > > > > > >
> > > > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > > > >
> > > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > > > things worse, but I didn't do a full run.
> > > > > > > > >
> > > > > > > > > Cheers, Daniel
> > > > > > > > >
> > > > > > > > > Daniel Vetter (10):
> > > > > > > > >   drm/vkms: Fix crc worker races
> > > > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > > > >
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.20.1
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > dri-devel mailing list
> > > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Rodrigo Siqueira
> > > > > > > > https://siqueira.tech
> > > > > > >
> > > > > > > --
> > > > > > > Daniel Vetter
> > > > > > > Software Engineer, Intel Corporation
> > > > > > > http://blog.ffwll.ch
> > > > > >
> > > > > > --
> > > > > > Rodrigo Siqueira
> > > > > > https://siqueira.tech
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >
> > --
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

[00/10] drm/vkms: rework crc worker

Message

Comments