diff mbox

[35/43] drm/i915: Move __raw_i915_read8() & co. into i915_drv.h

Message ID 1442595836-23981-36-git-send-email-ville.syrjala@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ville Syrjälä Sept. 18, 2015, 5:03 p.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

We have a few users of the raw register acces functions outside
intel_uncore.c, so let's just move the functions into intel_drv.h.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.h     | 28 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_irq.c     |  2 --
 drivers/gpu/drm/i915/intel_uncore.c | 28 ----------------------------
 4 files changed, 29 insertions(+), 31 deletions(-)

Comments

Chris Wilson Sept. 18, 2015, 5:42 p.m. UTC | #1
On Fri, Sep 18, 2015 at 08:03:48PM +0300, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> We have a few users of the raw register acces functions outside
> intel_uncore.c, so let's just move the functions into intel_drv.h.

I would rather see those external users converted to
I915_READ_FW/I915_WRITE_FW etc. You will then, no doubt, want to convert
those _FW macro definitions over to the uncore set.

Also due to how we write and post our accesses, the raw functions can be
the _relaxed variants.
-Chris
Ville Syrjälä Sept. 18, 2015, 6:23 p.m. UTC | #2
On Fri, Sep 18, 2015 at 06:42:17PM +0100, Chris Wilson wrote:
> On Fri, Sep 18, 2015 at 08:03:48PM +0300, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > We have a few users of the raw register acces functions outside
> > intel_uncore.c, so let's just move the functions into intel_drv.h.
> 
> I would rather see those external users converted to
> I915_READ_FW/I915_WRITE_FW etc. You will then, no doubt, want to convert
> those _FW macro definitions over to the uncore set.
> 
> Also due to how we write and post our accesses, the raw functions can be
> the _relaxed variants.

Hmm. I think the only difference with the relaxed vs. not would be
potential compiler reordering of memory accesses vs. mmio. So if we
start using the relaxed versions we may need to start sprinkling
barriers around.
Chris Wilson Sept. 18, 2015, 6:33 p.m. UTC | #3
On Fri, Sep 18, 2015 at 09:23:11PM +0300, Ville Syrjälä wrote:
> On Fri, Sep 18, 2015 at 06:42:17PM +0100, Chris Wilson wrote:
> > On Fri, Sep 18, 2015 at 08:03:48PM +0300, ville.syrjala@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > 
> > > We have a few users of the raw register acces functions outside
> > > intel_uncore.c, so let's just move the functions into intel_drv.h.
> > 
> > I would rather see those external users converted to
> > I915_READ_FW/I915_WRITE_FW etc. You will then, no doubt, want to convert
> > those _FW macro definitions over to the uncore set.
> > 
> > Also due to how we write and post our accesses, the raw functions can be
> > the _relaxed variants.
> 
> Hmm. I think the only difference with the relaxed vs. not would be
> potential compiler reordering of memory accesses vs. mmio. So if we
> start using the relaxed versions we may need to start sprinkling
> barriers around.

Yes. We have been working under that assumption (weak ordering of writes),
or at least I hope we all have been...
-Chris
Ville Syrjälä Sept. 18, 2015, 6:37 p.m. UTC | #4
On Fri, Sep 18, 2015 at 07:33:50PM +0100, Chris Wilson wrote:
> On Fri, Sep 18, 2015 at 09:23:11PM +0300, Ville Syrjälä wrote:
> > On Fri, Sep 18, 2015 at 06:42:17PM +0100, Chris Wilson wrote:
> > > On Fri, Sep 18, 2015 at 08:03:48PM +0300, ville.syrjala@linux.intel.com wrote:
> > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > 
> > > > We have a few users of the raw register acces functions outside
> > > > intel_uncore.c, so let's just move the functions into intel_drv.h.
> > > 
> > > I would rather see those external users converted to
> > > I915_READ_FW/I915_WRITE_FW etc. You will then, no doubt, want to convert
> > > those _FW macro definitions over to the uncore set.
> > > 
> > > Also due to how we write and post our accesses, the raw functions can be
> > > the _relaxed variants.
> > 
> > Hmm. I think the only difference with the relaxed vs. not would be
> > potential compiler reordering of memory accesses vs. mmio. So if we
> > start using the relaxed versions we may need to start sprinkling
> > barriers around.
> 
> Yes. We have been working under that assumption (weak ordering of writes),
> or at least I hope we all have been...

Well, looking at the irq code that uses the _FW, what would prevent the
compiler from eg. reorder the seqno read to happen before the IIR read?
Well in practice function calls are involved, so there's a barrier
there, but say we would want to read the seqno straight from the irq
handler...
Chris Wilson Sept. 18, 2015, 6:44 p.m. UTC | #5
On Fri, Sep 18, 2015 at 09:37:38PM +0300, Ville Syrjälä wrote:
> On Fri, Sep 18, 2015 at 07:33:50PM +0100, Chris Wilson wrote:
> > On Fri, Sep 18, 2015 at 09:23:11PM +0300, Ville Syrjälä wrote:
> > > On Fri, Sep 18, 2015 at 06:42:17PM +0100, Chris Wilson wrote:
> > > > On Fri, Sep 18, 2015 at 08:03:48PM +0300, ville.syrjala@linux.intel.com wrote:
> > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > > 
> > > > > We have a few users of the raw register acces functions outside
> > > > > intel_uncore.c, so let's just move the functions into intel_drv.h.
> > > > 
> > > > I would rather see those external users converted to
> > > > I915_READ_FW/I915_WRITE_FW etc. You will then, no doubt, want to convert
> > > > those _FW macro definitions over to the uncore set.
> > > > 
> > > > Also due to how we write and post our accesses, the raw functions can be
> > > > the _relaxed variants.
> > > 
> > > Hmm. I think the only difference with the relaxed vs. not would be
> > > potential compiler reordering of memory accesses vs. mmio. So if we
> > > start using the relaxed versions we may need to start sprinkling
> > > barriers around.
> > 
> > Yes. We have been working under that assumption (weak ordering of writes),
> > or at least I hope we all have been...
> 
> Well, looking at the irq code that uses the _FW, what would prevent the
> compiler from eg. reorder the seqno read to happen before the IIR read?
> Well in practice function calls are involved, so there's a barrier
> there, but say we would want to read the seqno straight from the irq
> handler...

What seqno read? We definitely don't want to be doing those expensive
reads from an irq handler...

Anyway, I thought we had strongly ordered reads on x64/x32?
-Chris
Ville Syrjälä Sept. 18, 2015, 7:26 p.m. UTC | #6
On Fri, Sep 18, 2015 at 07:44:34PM +0100, Chris Wilson wrote:
> On Fri, Sep 18, 2015 at 09:37:38PM +0300, Ville Syrjälä wrote:
> > On Fri, Sep 18, 2015 at 07:33:50PM +0100, Chris Wilson wrote:
> > > On Fri, Sep 18, 2015 at 09:23:11PM +0300, Ville Syrjälä wrote:
> > > > On Fri, Sep 18, 2015 at 06:42:17PM +0100, Chris Wilson wrote:
> > > > > On Fri, Sep 18, 2015 at 08:03:48PM +0300, ville.syrjala@linux.intel.com wrote:
> > > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > > > 
> > > > > > We have a few users of the raw register acces functions outside
> > > > > > intel_uncore.c, so let's just move the functions into intel_drv.h.
> > > > > 
> > > > > I would rather see those external users converted to
> > > > > I915_READ_FW/I915_WRITE_FW etc. You will then, no doubt, want to convert
> > > > > those _FW macro definitions over to the uncore set.
> > > > > 
> > > > > Also due to how we write and post our accesses, the raw functions can be
> > > > > the _relaxed variants.
> > > > 
> > > > Hmm. I think the only difference with the relaxed vs. not would be
> > > > potential compiler reordering of memory accesses vs. mmio. So if we
> > > > start using the relaxed versions we may need to start sprinkling
> > > > barriers around.
> > > 
> > > Yes. We have been working under that assumption (weak ordering of writes),
> > > or at least I hope we all have been...
> > 
> > Well, looking at the irq code that uses the _FW, what would prevent the
> > compiler from eg. reorder the seqno read to happen before the IIR read?
> > Well in practice function calls are involved, so there's a barrier
> > there, but say we would want to read the seqno straight from the irq
> > handler...
> 
> What seqno read? We definitely don't want to be doing those expensive
> reads from an irq handler...

Yeah I guess that was a crappy example. Trying to think of a better one,
I figure execlist status could be read from memory, but apparently
that's just mmio. I guess the context update + ELSP write is something
we do from the irq handler, but there are plenty of barriers between
those two it seems. Maybe there's no good example here.

> 
> Anyway, I thought we had strongly ordered reads on x64/x32?

The cpu won't reoder reads vs. reads, or writes vs. writes for that
matter if you ignore the nt stuff and whatnot. But AFAIU the whole
point of _relaxed() on x86 is that it allows the compiler to reoder
memory vs. mmio any which way it wants.
Jesse Barnes Sept. 21, 2015, 4:26 p.m. UTC | #7
On 09/18/2015 12:26 PM, Ville Syrjälä wrote:
> On Fri, Sep 18, 2015 at 07:44:34PM +0100, Chris Wilson wrote:
> Yeah I guess that was a crappy example. Trying to think of a better one,
> I figure execlist status could be read from memory, but apparently
> that's just mmio. I guess the context update + ELSP write is something
> we do from the irq handler, but there are plenty of barriers between
> those two it seems. Maybe there's no good example here.
> 
>>
>> Anyway, I thought we had strongly ordered reads on x64/x32?
> 
> The cpu won't reoder reads vs. reads, or writes vs. writes for that
> matter if you ignore the nt stuff and whatnot. But AFAIU the whole
> point of _relaxed() on x86 is that it allows the compiler to reoder
> memory vs. mmio any which way it wants.

If you mean the top level readX_relaxed() functions, those were more
about re-ordering on a large I/O fabric than compiler re-ordering.  I'm
not sure how the compiler handles things these days; I thought maybe the
volatile accesses would imply a re-order barrier as well, since they
could have side effects the compiler can't anticipate (i.e. re-ordering
even read after read could change behavior).  But it would be good to
double check...  and according to memory-barriers.txt we should be safe
there, from the readX_relaxed() section:

"Note that relaxed accesses to the same peripheral are guaranteed to be
ordered with respect to each other."

Jesse
Ville Syrjälä Sept. 21, 2015, 4:53 p.m. UTC | #8
On Mon, Sep 21, 2015 at 09:26:11AM -0700, Jesse Barnes wrote:
> On 09/18/2015 12:26 PM, Ville Syrjälä wrote:
> > On Fri, Sep 18, 2015 at 07:44:34PM +0100, Chris Wilson wrote:
> > Yeah I guess that was a crappy example. Trying to think of a better one,
> > I figure execlist status could be read from memory, but apparently
> > that's just mmio. I guess the context update + ELSP write is something
> > we do from the irq handler, but there are plenty of barriers between
> > those two it seems. Maybe there's no good example here.
> > 
> >>
> >> Anyway, I thought we had strongly ordered reads on x64/x32?
> > 
> > The cpu won't reoder reads vs. reads, or writes vs. writes for that
> > matter if you ignore the nt stuff and whatnot. But AFAIU the whole
> > point of _relaxed() on x86 is that it allows the compiler to reoder
> > memory vs. mmio any which way it wants.
> 
> If you mean the top level readX_relaxed() functions, those were more
> about re-ordering on a large I/O fabric than compiler re-ordering.  I'm
> not sure how the compiler handles things these days; I thought maybe the
> volatile accesses would imply a re-order barrier as well, since they
> could have side effects the compiler can't anticipate (i.e. re-ordering
> even read after read could change behavior).  But it would be good to
> double check...  and according to memory-barriers.txt we should be safe
> there, from the readX_relaxed() section:
> 
> "Note that relaxed accesses to the same peripheral are guaranteed to be
> ordered with respect to each other."

We have
#define barrier() asm volatile("" ::: "memory")

so I take it someone definitely thinks just volatile isn't enough to
keep the compiler in check.
Jesse Barnes Sept. 21, 2015, 4:57 p.m. UTC | #9
On 09/21/2015 09:53 AM, Ville Syrjälä wrote:
> On Mon, Sep 21, 2015 at 09:26:11AM -0700, Jesse Barnes wrote:
>> On 09/18/2015 12:26 PM, Ville Syrjälä wrote:
>>> On Fri, Sep 18, 2015 at 07:44:34PM +0100, Chris Wilson wrote:
>>> Yeah I guess that was a crappy example. Trying to think of a better one,
>>> I figure execlist status could be read from memory, but apparently
>>> that's just mmio. I guess the context update + ELSP write is something
>>> we do from the irq handler, but there are plenty of barriers between
>>> those two it seems. Maybe there's no good example here.
>>>
>>>>
>>>> Anyway, I thought we had strongly ordered reads on x64/x32?
>>>
>>> The cpu won't reoder reads vs. reads, or writes vs. writes for that
>>> matter if you ignore the nt stuff and whatnot. But AFAIU the whole
>>> point of _relaxed() on x86 is that it allows the compiler to reoder
>>> memory vs. mmio any which way it wants.
>>
>> If you mean the top level readX_relaxed() functions, those were more
>> about re-ordering on a large I/O fabric than compiler re-ordering.  I'm
>> not sure how the compiler handles things these days; I thought maybe the
>> volatile accesses would imply a re-order barrier as well, since they
>> could have side effects the compiler can't anticipate (i.e. re-ordering
>> even read after read could change behavior).  But it would be good to
>> double check...  and according to memory-barriers.txt we should be safe
>> there, from the readX_relaxed() section:
>>
>> "Note that relaxed accesses to the same peripheral are guaranteed to be
>> ordered with respect to each other."
> 
> We have
> #define barrier() asm volatile("" ::: "memory")
> 
> so I take it someone definitely thinks just volatile isn't enough to
> keep the compiler in check.

Yeah, maybe with respect to other, non-I/O targeted reads & writes
things get messier...

Jesse
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 47ef007..2322dac 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1516,7 +1516,7 @@  static int gen6_drpc_info(struct seq_file *m)
 		seq_printf(m, "RC information accurate: %s\n", yesno(count < 51));
 	}
 
-	gt_core_status = readl(dev_priv->regs + GEN6_GT_CORE_STATUS);
+	gt_core_status = __raw_i915_read32(dev_priv, GEN6_GT_CORE_STATUS);
 	trace_i915_reg_rw(false, GEN6_GT_CORE_STATUS, gt_core_status, 4, true);
 
 	rpmodectl1 = I915_READ(GEN6_RP_CONTROL);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3e35e08..91ed3c2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3502,4 +3502,32 @@  static inline void i915_trace_irq_get(struct intel_engine_cs *ring,
 		i915_gem_request_assign(&ring->trace_irq_req, req);
 }
 
+#define __raw_read(x, s) \
+static inline uint##x##_t __raw_i915_read##x(struct drm_i915_private *dev_priv, \
+					     uint32_t reg) \
+{ \
+	return read##s(dev_priv->regs + reg); \
+}
+
+#define __raw_write(x, s) \
+static inline void __raw_i915_write##x(struct drm_i915_private *dev_priv, \
+				       uint32_t reg, uint##x##_t val) \
+{ \
+	write##s(val, dev_priv->regs + reg); \
+}
+__raw_read(8, b)
+__raw_read(16, w)
+__raw_read(32, l)
+__raw_read(64, q)
+
+__raw_write(8, b)
+__raw_write(16, w)
+__raw_write(32, l)
+__raw_write(64, q)
+
+#undef __raw_read
+#undef __raw_write
+
+#define __raw_posting_read(dev_priv__, reg__) (void)__raw_i915_read32((dev_priv__), (reg__))
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index d181dab..a6c23b2 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -679,8 +679,6 @@  static u32 g4x_get_vblank_counter(struct drm_device *dev, int pipe)
 }
 
 /* raw reads, only for fast reads of display block, no need for forcewake etc. */
-#define __raw_i915_read32(dev_priv__, reg__) readl((dev_priv__)->regs + (reg__))
-
 static int __intel_get_crtc_scanline(struct intel_crtc *crtc)
 {
 	struct drm_device *dev = crtc->base.dev;
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 5b27ee1..acf9b4b 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -29,34 +29,6 @@ 
 
 #define FORCEWAKE_ACK_TIMEOUT_MS 2
 
-#define __raw_read(x, s) \
-static inline uint##x##_t __raw_i915_read##x(struct drm_i915_private *dev_priv, \
-					     uint32_t reg) \
-{ \
-	return read##s(dev_priv->regs + reg); \
-}
-
-#define __raw_write(x, s) \
-static inline void __raw_i915_write##x(struct drm_i915_private *dev_priv, \
-				       uint32_t reg, uint##x##_t val) \
-{ \
-	write##s(val, dev_priv->regs + reg); \
-}
-__raw_read(8, b)
-__raw_read(16, w)
-__raw_read(32, l)
-__raw_read(64, q)
-
-__raw_write(8, b)
-__raw_write(16, w)
-__raw_write(32, l)
-__raw_write(64, q)
-
-#undef __raw_read
-#undef __raw_write
-
-#define __raw_posting_read(dev_priv__, reg__) (void)__raw_i915_read32((dev_priv__), (reg__))
-
 static const char * const forcewake_domain_names[] = {
 	"render",
 	"blitter",