
[3/4] drm/scheduler: add new function to get least loaded sched v2

Message ID 20180801082002.20696-3-nayan26deshmukh@gmail.com (mailing list archive)
State New, archived
Series [1/4] drm/scheduler: add a list of run queues to the entity

Commit Message

Nayan Deshmukh Aug. 1, 2018, 8:20 a.m. UTC
The function selects the run queue from the rq_list with the
least load. The load is decided by the number of jobs in a
scheduler.

v2: avoid using atomic read twice consecutively, instead store
    it locally

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
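The selection logic in the patch can be exercised in isolation. The following is a minimal standalone sketch: `mock_sched` and `mock_rq` are simplified stand-ins for the kernel's `drm_gpu_scheduler` and `drm_sched_rq` (not the real structures), but the loop mirrors the shape of `drm_sched_entity_get_free_sched()`, including the v2 change of reading the atomic counter once per iteration and caching it locally:

```c
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical):
 * each run queue points at a scheduler with an atomic job counter. */
struct mock_sched { atomic_uint num_jobs; };
struct mock_rq { struct mock_sched *sched; };

/* Mirrors the shape of drm_sched_entity_get_free_sched(): walk the
 * rq_list and pick the rq whose scheduler has the fewest jobs. The
 * atomic counter is read once per iteration and cached in num_jobs,
 * which is exactly what the v2 note in the commit message refers to. */
static struct mock_rq *
pick_least_loaded(struct mock_rq **rq_list, unsigned int num_rq)
{
	struct mock_rq *rq = NULL;
	unsigned int min_jobs = UINT_MAX, num_jobs;
	unsigned int i;

	for (i = 0; i < num_rq; ++i) {
		num_jobs = atomic_load(&rq_list[i]->sched->num_jobs);
		if (num_jobs < min_jobs) {
			min_jobs = num_jobs;
			rq = rq_list[i];
		}
	}
	return rq;
}
```

Note that ties keep the earliest run queue in the list, and an empty list yields NULL, matching the patch's behavior.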

Comments

Andrey Grodzovsky Aug. 1, 2018, 3:35 p.m. UTC | #1
Clarification question: if the run queues belong to different
schedulers, they effectively point to different rings,

which means we allow moving (rescheduling) a drm_sched_entity from one
ring to another. I assume that was the idea in the first place: that

you have a set of HW rings and you can utilize any of them for your jobs
(like compute rings). Correct?

Andrey


On 08/01/2018 04:20 AM, Nayan Deshmukh wrote:
> The function selects the run queue from the rq_list with the
> least load. The load is decided by the number of jobs in a
> scheduler.
>
> v2: avoid using atomic read twice consecutively, instead store
>      it locally
>
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
> ---
>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 375f6f7f6a93..fb4e542660b0 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -255,6 +255,31 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity)
>   	return true;
>   }
>   
> +/**
> + * drm_sched_entity_get_free_sched - Get the rq from rq_list with least load
> + *
> + * @entity: scheduler entity
> + *
> + * Return the pointer to the rq with least load.
> + */
> +static struct drm_sched_rq *
> +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
> +{
> +	struct drm_sched_rq *rq = NULL;
> +	unsigned int min_jobs = UINT_MAX, num_jobs;
> +	int i;
> +
> +	for (i = 0; i < entity->num_rq_list; ++i) {
> +		num_jobs = atomic_read(&entity->rq_list[i]->sched->num_jobs);
> +		if (num_jobs < min_jobs) {
> +			min_jobs = num_jobs;
> +			rq = entity->rq_list[i];
> +		}
> +	}
> +
> +	return rq;
> +}
> +
>   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>   				    struct dma_fence_cb *cb)
>   {
Nayan Deshmukh Aug. 1, 2018, 4:06 p.m. UTC | #2
Yes, that is correct.

Nayan

On Wed, Aug 1, 2018, 9:05 PM Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
wrote:

> Clarification question -  if the run queues belong to different
> schedulers they effectively point to different rings,
>
> it means we allow to move (reschedule) a drm_sched_entity from one ring
> to another - i assume that the idea int the first place, that
>
> you have a set of HW rings and you can utilize any of them for your jobs
> (like compute rings). Correct ?
>
> Andrey
>
>
> On 08/01/2018 04:20 AM, Nayan Deshmukh wrote:
> > The function selects the run queue from the rq_list with the
> > least load. The load is decided by the number of jobs in a
> > scheduler.
> >
> > v2: avoid using atomic read twice consecutively, instead store
> >      it locally
> >
> > Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
> > ---
> >   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25
> +++++++++++++++++++++++++
> >   1 file changed, 25 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > index 375f6f7f6a93..fb4e542660b0 100644
> > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > @@ -255,6 +255,31 @@ static bool drm_sched_entity_is_ready(struct
> drm_sched_entity *entity)
> >       return true;
> >   }
> >
> > +/**
> > + * drm_sched_entity_get_free_sched - Get the rq from rq_list with least
> load
> > + *
> > + * @entity: scheduler entity
> > + *
> > + * Return the pointer to the rq with least load.
> > + */
> > +static struct drm_sched_rq *
> > +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
> > +{
> > +     struct drm_sched_rq *rq = NULL;
> > +     unsigned int min_jobs = UINT_MAX, num_jobs;
> > +     int i;
> > +
> > +     for (i = 0; i < entity->num_rq_list; ++i) {
> > +             num_jobs =
> atomic_read(&entity->rq_list[i]->sched->num_jobs);
> > +             if (num_jobs < min_jobs) {
> > +                     min_jobs = num_jobs;
> > +                     rq = entity->rq_list[i];
> > +             }
> > +     }
> > +
> > +     return rq;
> > +}
> > +
> >   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
> >                                   struct dma_fence_cb *cb)
> >   {
>
>
Andrey Grodzovsky Aug. 1, 2018, 5:51 p.m. UTC | #3
Series is Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Andrey


On 08/01/2018 12:06 PM, Nayan Deshmukh wrote:
> Yes, that is correct.
>
> Nayan
>
> On Wed, Aug 1, 2018, 9:05 PM Andrey Grodzovsky 
> <Andrey.Grodzovsky@amd.com <mailto:Andrey.Grodzovsky@amd.com>> wrote:
>
>     Clarification question -  if the run queues belong to different
>     schedulers they effectively point to different rings,
>
>     it means we allow to move (reschedule) a drm_sched_entity from one
>     ring
>     to another - i assume that the idea int the first place, that
>
>     you have a set of HW rings and you can utilize any of them for
>     your jobs
>     (like compute rings). Correct ?
>
>     Andrey
>
>
>     On 08/01/2018 04:20 AM, Nayan Deshmukh wrote:
>     > The function selects the run queue from the rq_list with the
>     > least load. The load is decided by the number of jobs in a
>     > scheduler.
>     >
>     > v2: avoid using atomic read twice consecutively, instead store
>     >      it locally
>     >
>     > Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com
>     <mailto:nayan26deshmukh@gmail.com>>
>     > ---
>     >   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25
>     +++++++++++++++++++++++++
>     >   1 file changed, 25 insertions(+)
>     >
>     > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     > index 375f6f7f6a93..fb4e542660b0 100644
>     > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>     > @@ -255,6 +255,31 @@ static bool
>     drm_sched_entity_is_ready(struct drm_sched_entity *entity)
>     >       return true;
>     >   }
>     >
>     > +/**
>     > + * drm_sched_entity_get_free_sched - Get the rq from rq_list
>     with least load
>     > + *
>     > + * @entity: scheduler entity
>     > + *
>     > + * Return the pointer to the rq with least load.
>     > + */
>     > +static struct drm_sched_rq *
>     > +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
>     > +{
>     > +     struct drm_sched_rq *rq = NULL;
>     > +     unsigned int min_jobs = UINT_MAX, num_jobs;
>     > +     int i;
>     > +
>     > +     for (i = 0; i < entity->num_rq_list; ++i) {
>     > +             num_jobs =
>     atomic_read(&entity->rq_list[i]->sched->num_jobs);
>     > +             if (num_jobs < min_jobs) {
>     > +                     min_jobs = num_jobs;
>     > +                     rq = entity->rq_list[i];
>     > +             }
>     > +     }
>     > +
>     > +     return rq;
>     > +}
>     > +
>     >   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>     >                                   struct dma_fence_cb *cb)
>     >   {
>
Chunming Zhou Aug. 2, 2018, 2:51 a.m. UTC | #4
Another big question:
I agree the general idea of balancing scheduler load within the same ring family is good.
But when jobs from the same entity run on different schedulers, a later job could complete ahead of an earlier one, right?
That would break the fence design: in the same fence context, a later fence must be signaled after the earlier fence.

Anything I missed?

Regards,
David Zhou

From: dri-devel <dri-devel-bounces@lists.freedesktop.org> On Behalf Of Nayan Deshmukh
Sent: Thursday, August 02, 2018 12:07 AM
To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Cc: amd-gfx@lists.freedesktop.org; Maling list - DRI developers <dri-devel@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com>
Subject: Re: [PATCH 3/4] drm/scheduler: add new function to get least loaded sched v2

Yes, that is correct.

Nayan

On Wed, Aug 1, 2018, 9:05 PM Andrey Grodzovsky <Andrey.Grodzovsky@amd.com<mailto:Andrey.Grodzovsky@amd.com>> wrote:
Clarification question -  if the run queues belong to different
schedulers they effectively point to different rings,

it means we allow to move (reschedule) a drm_sched_entity from one ring
to another - i assume that the idea int the first place, that

you have a set of HW rings and you can utilize any of them for your jobs
(like compute rings). Correct ?

Andrey


On 08/01/2018 04:20 AM, Nayan Deshmukh wrote:
> The function selects the run queue from the rq_list with the
> least load. The load is decided by the number of jobs in a
> scheduler.
>
> v2: avoid using atomic read twice consecutively, instead store
>      it locally
>
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com<mailto:nayan26deshmukh@gmail.com>>
> ---
>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 375f6f7f6a93..fb4e542660b0 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -255,6 +255,31 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity)
>       return true;
>   }
>
> +/**
> + * drm_sched_entity_get_free_sched - Get the rq from rq_list with least load
> + *
> + * @entity: scheduler entity
> + *
> + * Return the pointer to the rq with least load.
> + */
> +static struct drm_sched_rq *
> +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
> +{
> +     struct drm_sched_rq *rq = NULL;
> +     unsigned int min_jobs = UINT_MAX, num_jobs;
> +     int i;
> +
> +     for (i = 0; i < entity->num_rq_list; ++i) {
> +             num_jobs = atomic_read(&entity->rq_list[i]->sched->num_jobs);
> +             if (num_jobs < min_jobs) {
> +                     min_jobs = num_jobs;
> +                     rq = entity->rq_list[i];
> +             }
> +     }
> +
> +     return rq;
> +}
> +
>   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>                                   struct dma_fence_cb *cb)
>   {
Nayan Deshmukh Aug. 2, 2018, 6:01 a.m. UTC | #5
Hi David,

On Thu, Aug 2, 2018 at 8:22 AM Zhou, David(ChunMing) <David1.Zhou@amd.com>
wrote:

> Another big question:
>
> I agree the general idea is good to balance scheduler load for same ring
> family.
>
> But, when same entity job run on different scheduler, that means the later
> job could be completed ahead of front, Right?
>
Really good question. To avoid this scenario we do not move an entity which
already has a job in the hardware queue. We only move entities whose
last_scheduled fence has been signaled, which means that the last submitted
job of this entity has finished executing.

Moving an entity which already has a job in the hardware queue would hinder
the dependency optimization that we are using and hence would not lead to
better performance anyway. I have talked about the issue in more detail
here [1]. Please let me know if you have any more doubts regarding this.

Cheers,
Nayan

[1]
http://ndesh26.github.io/gsoc/2018/06/14/GSoC-Update-A-Curious-Case-of-Dependency-Handling/
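The safeguard described above boils down to a single check before rebalancing. The sketch below uses hypothetical stand-in types (`mock_fence`, `mock_entity` are not the real drm_sched structures) just to show the shape of the condition guarding the move:

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins (hypothetical): a fence that may or may not have
 * signaled yet, and an entity remembering the fence of its last
 * scheduled job. */
struct mock_fence { bool signaled; };
struct mock_entity { struct mock_fence *last_scheduled; };

/* An entity may be moved to a less loaded run queue only when it has
 * no job still pending in the hardware queue, i.e. its last_scheduled
 * fence is absent (never submitted) or has already signaled.  This
 * preserves fence ordering within the entity's fence context. */
static bool can_rebalance(const struct mock_entity *e)
{
	return e->last_scheduled == NULL || e->last_scheduled->signaled;
}
```

Since jobs from one entity never overlap across schedulers under this rule, a later fence cannot signal before an earlier one in the same context.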

That will break fence design, later fence must be signaled after front
> fence in same fence context.
>
>
>
> Anything I missed?
>
>
>
> Regards,
>
> David Zhou
>
>
>
> *From:* dri-devel <dri-devel-bounces@lists.freedesktop.org> *On Behalf Of
> *Nayan Deshmukh
> *Sent:* Thursday, August 02, 2018 12:07 AM
> *To:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> *Cc:* amd-gfx@lists.freedesktop.org; Maling list - DRI developers <
> dri-devel@lists.freedesktop.org>; Koenig, Christian <
> Christian.Koenig@amd.com>
> *Subject:* Re: [PATCH 3/4] drm/scheduler: add new function to get least
> loaded sched v2
>
>
>
> Yes, that is correct.
>
>
>
> Nayan
>
>
>
> On Wed, Aug 1, 2018, 9:05 PM Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> wrote:
>
> Clarification question -  if the run queues belong to different
> schedulers they effectively point to different rings,
>
> it means we allow to move (reschedule) a drm_sched_entity from one ring
> to another - i assume that the idea int the first place, that
>
> you have a set of HW rings and you can utilize any of them for your jobs
> (like compute rings). Correct ?
>
> Andrey
>
>
> On 08/01/2018 04:20 AM, Nayan Deshmukh wrote:
> > The function selects the run queue from the rq_list with the
> > least load. The load is decided by the number of jobs in a
> > scheduler.
> >
> > v2: avoid using atomic read twice consecutively, instead store
> >      it locally
> >
> > Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
> > ---
> >   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25
> +++++++++++++++++++++++++
> >   1 file changed, 25 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > index 375f6f7f6a93..fb4e542660b0 100644
> > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > @@ -255,6 +255,31 @@ static bool drm_sched_entity_is_ready(struct
> drm_sched_entity *entity)
> >       return true;
> >   }
> >
> > +/**
> > + * drm_sched_entity_get_free_sched - Get the rq from rq_list with least
> load
> > + *
> > + * @entity: scheduler entity
> > + *
> > + * Return the pointer to the rq with least load.
> > + */
> > +static struct drm_sched_rq *
> > +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
> > +{
> > +     struct drm_sched_rq *rq = NULL;
> > +     unsigned int min_jobs = UINT_MAX, num_jobs;
> > +     int i;
> > +
> > +     for (i = 0; i < entity->num_rq_list; ++i) {
> > +             num_jobs =
> atomic_read(&entity->rq_list[i]->sched->num_jobs);
> > +             if (num_jobs < min_jobs) {
> > +                     min_jobs = num_jobs;
> > +                     rq = entity->rq_list[i];
> > +             }
> > +     }
> > +
> > +     return rq;
> > +}
> > +
> >   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
> >                                   struct dma_fence_cb *cb)
> >   {
>
>
&gt; scheduler.<br>
&gt;<br>
&gt; v2: avoid using atomic read twice consecutively, instead store<br>
&gt;      it locally<br>
&gt;<br>
&gt; Signed-off-by: Nayan Deshmukh &lt;<a href="mailto:nayan26deshmukh@gmail.com" target="_blank">nayan26deshmukh@gmail.com</a>&gt;<br>
&gt; ---<br>
&gt;   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25 +++++++++++++++++++++++++<br>
&gt;   1 file changed, 25 insertions(+)<br>
&gt;<br>
&gt; diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
&gt; index 375f6f7f6a93..fb4e542660b0 100644<br>
&gt; --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
&gt; +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
&gt; @@ -255,6 +255,31 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity)<br>
&gt;       return true;<br>
&gt;   }<br>
&gt;   <br>
&gt; +/**<br>
&gt; + * drm_sched_entity_get_free_sched - Get the rq from rq_list with least load<br>
&gt; + *<br>
&gt; + * @entity: scheduler entity<br>
&gt; + *<br>
&gt; + * Return the pointer to the rq with least load.<br>
&gt; + */<br>
&gt; +static struct drm_sched_rq *<br>
&gt; +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)<br>
&gt; +{<br>
&gt; +     struct drm_sched_rq *rq = NULL;<br>
&gt; +     unsigned int min_jobs = UINT_MAX, num_jobs;<br>
&gt; +     int i;<br>
&gt; +<br>
&gt; +     for (i = 0; i &lt; entity-&gt;num_rq_list; ++i) {<br>
&gt; +             num_jobs = atomic_read(&amp;entity-&gt;rq_list[i]-&gt;sched-&gt;num_jobs);<br>
&gt; +             if (num_jobs &lt; min_jobs) {<br>
&gt; +                     min_jobs = num_jobs;<br>
&gt; +                     rq = entity-&gt;rq_list[i];<br>
&gt; +             }<br>
&gt; +     }<br>
&gt; +<br>
&gt; +     return rq;<br>
&gt; +}<br>
&gt; +<br>
&gt;   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,<br>
&gt;                                   struct dma_fence_cb *cb)<br>
&gt;   {<u></u><u></u></p>
</blockquote>
</div>
</div>
</div>

</blockquote></div></div></div>
Chunming Zhou Aug. 2, 2018, 6:42 a.m. UTC | #6
On Aug. 2, 2018, 14:01, Nayan Deshmukh wrote:
> Hi David,
>
> On Thu, Aug 2, 2018 at 8:22 AM Zhou, David(ChunMing) 
> <David1.Zhou@amd.com> wrote:
>
>     Another big question:
>
>     I agree the general idea is good: balance scheduler load within
>     the same ring family.
>
>     But when jobs from the same entity run on different schedulers,
>     that means a later job could complete ahead of an earlier one, right?
>
> Really good question. To avoid this scenario we do not move an entity 
> which already has a job in the hardware queue. We only move entities 
> whose last_scheduled fence has been signalled, which means that the 
> last submitted job of this entity has finished executing.
Good handling, which I missed when reviewing them.

Cheers,
David Zhou
>
> Moving an entity which already has a job in the hardware queue will 
> hinder the dependency optimization that we are using and hence will 
> not lead to better performance anyway. I have talked about the issue 
> in more detail here [1]. Please let me know if you have any more 
> doubts regarding this.
>
> Cheers,
> Nayan
>
> [1] 
> http://ndesh26.github.io/gsoc/2018/06/14/GSoC-Update-A-Curious-Case-of-Dependency-Handling/
>
>     That would break the fence design: a later fence must be signaled
>     after an earlier fence in the same fence context.
>
>     Anything I missed?
>
>     Regards,
>
>     David Zhou
>
>     *From:* dri-devel <dri-devel-bounces@lists.freedesktop.org>
>     *On Behalf Of* Nayan Deshmukh
>     *Sent:* Thursday, August 02, 2018 12:07 AM
>     *To:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>     *Cc:* amd-gfx@lists.freedesktop.org; Maling list - DRI developers
>     <dri-devel@lists.freedesktop.org>; Koenig, Christian
>     <Christian.Koenig@amd.com>
>     *Subject:* Re: [PATCH 3/4] drm/scheduler: add new function to get
>     least loaded sched v2
>
>     Yes, that is correct.
>
>     Nayan
>
>     On Wed, Aug 1, 2018, 9:05 PM Andrey Grodzovsky
>     <Andrey.Grodzovsky@amd.com> wrote:
>
>         Clarification question -  if the run queues belong to different
>         schedulers they effectively point to different rings,
>
>         it means we allow to move (reschedule) a drm_sched_entity from
>         one ring
>         to another - I assume that was the idea in the first place:
>         that
>
>         you have a set of HW rings and you can utilize any of them for
>         your jobs
>         (like compute rings). Correct?
>
>         Andrey
>
>
>         On 08/01/2018 04:20 AM, Nayan Deshmukh wrote:
>         > The function selects the run queue from the rq_list with the
>         > least load. The load is decided by the number of jobs in a
>         > scheduler.
>         >
>         > v2: avoid using atomic read twice consecutively, instead store
>         >      it locally
>         >
>         > Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
>         > ---
>         >   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25
>         +++++++++++++++++++++++++
>         >   1 file changed, 25 insertions(+)
>         >
>         > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>         b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>         > index 375f6f7f6a93..fb4e542660b0 100644
>         > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>         > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>         > @@ -255,6 +255,31 @@ static bool
>         drm_sched_entity_is_ready(struct drm_sched_entity *entity)
>         >       return true;
>         >   }
>         >
>         > +/**
>         > + * drm_sched_entity_get_free_sched - Get the rq from
>         rq_list with least load
>         > + *
>         > + * @entity: scheduler entity
>         > + *
>         > + * Return the pointer to the rq with least load.
>         > + */
>         > +static struct drm_sched_rq *
>         > +drm_sched_entity_get_free_sched(struct drm_sched_entity
>         *entity)
>         > +{
>         > +     struct drm_sched_rq *rq = NULL;
>         > +     unsigned int min_jobs = UINT_MAX, num_jobs;
>         > +     int i;
>         > +
>         > +     for (i = 0; i < entity->num_rq_list; ++i) {
>         > +             num_jobs =
>         atomic_read(&entity->rq_list[i]->sched->num_jobs);
>         > +             if (num_jobs < min_jobs) {
>         > +                     min_jobs = num_jobs;
>         > +                     rq = entity->rq_list[i];
>         > +             }
>         > +     }
>         > +
>         > +     return rq;
>         > +}
>         > +
>         >   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>         >                                   struct dma_fence_cb *cb)
>         >   {
>

Patch

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 375f6f7f6a93..fb4e542660b0 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -255,6 +255,31 @@  static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity)
 	return true;
 }
 
+/**
+ * drm_sched_entity_get_free_sched - Get the rq from rq_list with least load
+ *
+ * @entity: scheduler entity
+ *
+ * Return the pointer to the rq with least load.
+ */
+static struct drm_sched_rq *
+drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
+{
+	struct drm_sched_rq *rq = NULL;
+	unsigned int min_jobs = UINT_MAX, num_jobs;
+	int i;
+
+	for (i = 0; i < entity->num_rq_list; ++i) {
+		num_jobs = atomic_read(&entity->rq_list[i]->sched->num_jobs);
+		if (num_jobs < min_jobs) {
+			min_jobs = num_jobs;
+			rq = entity->rq_list[i];
+		}
+	}
+
+	return rq;
+}
+
 static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
 				    struct dma_fence_cb *cb)
 {