diff mbox series

[4/4] pnv/xive2: Handle TIMA access through all ports

Message ID 20230530161129.313258-5-fbarrat@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series Various xive fixes | expand

Commit Message

Frederic Barrat May 30, 2023, 4:11 p.m. UTC
The Thread Interrupt Management Area (TIMA) can be accessed through 4
ports/snoop buses, targeted by the address. The base address of a TIMA
is using port 0 and the other ports are 0x80 apart. Using one port or
another can be useful to balance the load on the snoop buses.

The TIMA registers are in the 0x0 -> 0x3F range and there are 2
indication bits for special operations (bits 10 and 11; everything
fits on a 4k page). So the port address bits fall in between and are
"don't care" for the hardware when processing the TIMA operation. So
this patch filters out those port address bits so that a TIMA
operation can be triggered using any port.

It is also true for indirect access (through the IC BAR) and it's
actually nothing new, it was already the case on P9. Which helps here,
as the TIMA handling code is common between P9 (xive) and P10 (xive2).

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
---
 hw/intc/pnv_xive2.c |  4 ++++
 hw/intc/xive.c      | 18 ++++++++++++++++++
 2 files changed, 22 insertions(+)

Comments

Cédric Le Goater May 30, 2023, 4:40 p.m. UTC | #1
On 5/30/23 18:11, Frederic Barrat wrote:
> The Thread Interrupt Management Area (TIMA) can be accessed through 4
> ports/snoop buses, targeted by the address. The base address of a TIMA
> is using port 0 and the other ports are 0x80 apart. Using one port or
> another can be useful to balance the load on the snoop buses.

and can we have some nice examples of how these ports are used ? only for
snooping or also for balancing operations ? which ones ?

> The TIMA registers are in the 0x0 -> 0x3F range and there are 2
> indication bits for special operations (bits 10 and 11; everything
> fits on a 4k page). So the port address bits fall in between and are
> "don't care" for the hardware when processing the TIMA operation. So
> this patch filters out those port address bits so that a TIMA
> operation can be triggered using any port.
> 
> It is also true for indirect access (through the IC BAR) and it's
> actually nothing new, it was already the case on P9. Which helps here,
> as the TIMA handling code is common between P9 (xive) and P10 (xive2).
> 
> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> ---
>   hw/intc/pnv_xive2.c |  4 ++++
>   hw/intc/xive.c      | 18 ++++++++++++++++++
>   2 files changed, 22 insertions(+)
> 
> diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
> index 132f82a035..c80316657a 100644
> --- a/hw/intc/pnv_xive2.c
> +++ b/hw/intc/pnv_xive2.c
> @@ -1662,6 +1662,8 @@ static void pnv_xive2_tm_write(void *opaque, hwaddr offset,
>       bool gen1_tima_os =
>           xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
>   
> +    offset &= 0xC3F; /* See comment in xive_tctx_tm_write() */
> +
>       /* TODO: should we switch the TM ops table instead ? */
>       if (!gen1_tima_os && offset == HV_PUSH_OS_CTX_OFFSET) {
>           xive2_tm_push_os_ctx(xptr, tctx, offset, value, size);
> @@ -1681,6 +1683,8 @@ static uint64_t pnv_xive2_tm_read(void *opaque, hwaddr offset, unsigned size)
>       bool gen1_tima_os =
>           xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
>   
> +    offset &= 0xC3F; /* See comment in xive_tctx_tm_read() */
> +
>       /* TODO: should we switch the TM ops table instead ? */
>       if (!gen1_tima_os && offset == HV_PULL_OS_CTX_OFFSET) {
>           return xive2_tm_pull_os_ctx(xptr, tctx, offset, size);
> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> index a986b96843..c1abfae31d 100644
> --- a/hw/intc/xive.c
> +++ b/hw/intc/xive.c
> @@ -527,6 +527,15 @@ void xive_tctx_tm_write(XivePresenter *xptr, XiveTCTX *tctx, hwaddr offset,
>   
>       trace_xive_tctx_tm_write(offset, size, value);
>   
> +    /*
> +     * The TIMA can be accessed through 4 ports/snoop buses, with
> +     * addresses 0x80 apart.
> +     * However, the offset bits between the "special op" bits and the
> +     * MSB of the range used for the TIMA registers are "don't care"
> +     * for the hardware, so we filter them out.
> +     */
> +    offset &= 0xC3F;
> +
>       /*
>        * TODO: check V bit in Q[0-3]W2
>        */
> @@ -566,6 +575,15 @@ uint64_t xive_tctx_tm_read(XivePresenter *xptr, XiveTCTX *tctx, hwaddr offset,
>       const XiveTmOp *xto;
>       uint64_t ret;
>   
> +    /*
> +     * The TIMA can be accessed through 4 ports/snoop buses, with
> +     * addresses 0x80 apart.
> +     * However, the offset bits between the "special op" bits and the
> +     * MSB of the range used for the TIMA registers are "don't care"
> +     * for the hardware, so we filter them out.
> +     */
> +    offset &= 0xC3F;
> +
>       /*
>        * TODO: check V bit in Q[0-3]W2
>        */
Cédric Le Goater May 30, 2023, 4:49 p.m. UTC | #2
On 5/30/23 18:40, Cédric Le Goater wrote:
> On 5/30/23 18:11, Frederic Barrat wrote:
>> The Thread Interrupt Management Area (TIMA) can be accessed through 4
>> ports/snoop buses, targeted by the address. The base address of a TIMA
>> is using port 0 and the other ports are 0x80 apart. Using one port or
>> another can be useful to balance the load on the snoop buses.
> 
> and can we have some nice examples of how these ports are used ? only for
> snooping or also for balancing operations ? which ones ?
> 
>> The TIMA registers are in the 0x0 -> 0x3F range and there are 2
>> indication bits for special operations (bits 10 and 11; everything
>> fits on a 4k page). So the port address bits fall in between and are
>> "don't care" for the hardware when processing the TIMA operation. So
>> this patch filters out those port address bits so that a TIMA
>> operation can be triggered using any port.
>>
>> It is also true for indirect access (through the IC BAR) and it's
>> actually nothing new, it was already the case on P9. Which helps here,
>> as the TIMA handling code is common between P9 (xive) and P10 (xive2).
>>
>> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>

one extra comment, since we already have one mask for the tima offsets :

     /*
      * First, check for special operations in the 2K region
      */
     if (offset & 0x800) {

I think it would be cleaner to add some defines in the reg definition file.

Can come later.

Thanks,

C.
Frederic Barrat May 30, 2023, 5:29 p.m. UTC | #3
On 30/05/2023 18:40, Cédric Le Goater wrote:
> On 5/30/23 18:11, Frederic Barrat wrote:
>> The Thread Interrupt Management Area (TIMA) can be accessed through 4
>> ports/snoop buses, targeted by the address. The base address of a TIMA
>> is using port 0 and the other ports are 0x80 apart. Using one port or
>> another can be useful to balance the load on the snoop buses.
> 
> and can we have some nice examples of how these ports are used ? only for
> snooping or also for balancing operations ? which ones ?


This is really "static" balancing. The snoop bus 0 and 3 are known to be 
the busiest, for various kind of operations. We found out recently that 
another hypervisor is always using snoop bus 1 for TIMA accesses, to 
spread the load. So it is really empirical.

xive_tm_raw_write/read() were actually working (by chance?), since they 
only retain the least significant bits, so the snoop address bits were 
dropped. The problem was really for the special op detection, which 
matches on the full address.

   Fred


>> The TIMA registers are in the 0x0 -> 0x3F range and there are 2
>> indication bits for special operations (bits 10 and 11; everything
>> fits on a 4k page). So the port address bits fall in between and are
>> "don't care" for the hardware when processing the TIMA operation. So
>> this patch filters out those port address bits so that a TIMA
>> operation can be triggered using any port.
>>
>> It is also true for indirect access (through the IC BAR) and it's
>> actually nothing new, it was already the case on P9. Which helps here,
>> as the TIMA handling code is common between P9 (xive) and P10 (xive2).
>>
>> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> Thanks,
> 
> C.
> 
>> ---
>>   hw/intc/pnv_xive2.c |  4 ++++
>>   hw/intc/xive.c      | 18 ++++++++++++++++++
>>   2 files changed, 22 insertions(+)
>>
>> diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
>> index 132f82a035..c80316657a 100644
>> --- a/hw/intc/pnv_xive2.c
>> +++ b/hw/intc/pnv_xive2.c
>> @@ -1662,6 +1662,8 @@ static void pnv_xive2_tm_write(void *opaque, 
>> hwaddr offset,
>>       bool gen1_tima_os =
>>           xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
>> +    offset &= 0xC3F; /* See comment in xive_tctx_tm_write() */
>> +
>>       /* TODO: should we switch the TM ops table instead ? */
>>       if (!gen1_tima_os && offset == HV_PUSH_OS_CTX_OFFSET) {
>>           xive2_tm_push_os_ctx(xptr, tctx, offset, value, size);
>> @@ -1681,6 +1683,8 @@ static uint64_t pnv_xive2_tm_read(void *opaque, 
>> hwaddr offset, unsigned size)
>>       bool gen1_tima_os =
>>           xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
>> +    offset &= 0xC3F; /* See comment in xive_tctx_tm_read() */
>> +
>>       /* TODO: should we switch the TM ops table instead ? */
>>       if (!gen1_tima_os && offset == HV_PULL_OS_CTX_OFFSET) {
>>           return xive2_tm_pull_os_ctx(xptr, tctx, offset, size);
>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
>> index a986b96843..c1abfae31d 100644
>> --- a/hw/intc/xive.c
>> +++ b/hw/intc/xive.c
>> @@ -527,6 +527,15 @@ void xive_tctx_tm_write(XivePresenter *xptr, 
>> XiveTCTX *tctx, hwaddr offset,
>>       trace_xive_tctx_tm_write(offset, size, value);
>> +    /*
>> +     * The TIMA can be accessed through 4 ports/snoop buses, with
>> +     * addresses 0x80 apart.
>> +     * However, the offset bits between the "special op" bits and the
>> +     * MSB of the range used for the TIMA registers are "don't care"
>> +     * for the hardware, so we filter them out.
>> +     */
>> +    offset &= 0xC3F;
>> +
>>       /*
>>        * TODO: check V bit in Q[0-3]W2
>>        */
>> @@ -566,6 +575,15 @@ uint64_t xive_tctx_tm_read(XivePresenter *xptr, 
>> XiveTCTX *tctx, hwaddr offset,
>>       const XiveTmOp *xto;
>>       uint64_t ret;
>> +    /*
>> +     * The TIMA can be accessed through 4 ports/snoop buses, with
>> +     * addresses 0x80 apart.
>> +     * However, the offset bits between the "special op" bits and the
>> +     * MSB of the range used for the TIMA registers are "don't care"
>> +     * for the hardware, so we filter them out.
>> +     */
>> +    offset &= 0xC3F;
>> +
>>       /*
>>        * TODO: check V bit in Q[0-3]W2
>>        */
>
Frederic Barrat May 30, 2023, 5:30 p.m. UTC | #4
On 30/05/2023 18:49, Cédric Le Goater wrote:
> On 5/30/23 18:40, Cédric Le Goater wrote:
>> On 5/30/23 18:11, Frederic Barrat wrote:
>>> The Thread Interrupt Management Area (TIMA) can be accessed through 4
>>> ports/snoop buses, targeted by the address. The base address of a TIMA
>>> is using port 0 and the other ports are 0x80 apart. Using one port or
>>> another can be useful to balance the load on the snoop buses.
>>
>> and can we have some nice examples of how these ports are used ? only for
>> snooping or also for balancing operations ? which ones ?
>>
>>> The TIMA registers are in the 0x0 -> 0x3F range and there are 2
>>> indication bits for special operations (bits 10 and 11; everything
>>> fits on a 4k page). So the port address bits fall in between and are
>>> "don't care" for the hardware when processing the TIMA operation. So
>>> this patch filters out those port address bits so that a TIMA
>>> operation can be triggered using any port.
>>>
>>> It is also true for indirect access (through the IC BAR) and it's
>>> actually nothing new, it was already the case on P9. Which helps here,
>>> as the TIMA handling code is common between P9 (xive) and P10 (xive2).
>>>
>>> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
>>
>> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> one extra comment, since we already have one mask for the tima offsets :
> 
>      /*
>       * First, check for special operations in the 2K region
>       */
>      if (offset & 0x800) {
> 
> I think it would be cleaner to add some defines in the reg definition file.


Yeah, you're right, I should respin it with some proper defines. And 
I'll add one for the special op bit(s).

   Fred


> 
> Can come later.
> 
> Thanks,
> 
> C.
>
diff mbox series

Patch

diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
index 132f82a035..c80316657a 100644
--- a/hw/intc/pnv_xive2.c
+++ b/hw/intc/pnv_xive2.c
@@ -1662,6 +1662,8 @@  static void pnv_xive2_tm_write(void *opaque, hwaddr offset,
     bool gen1_tima_os =
         xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
 
+    offset &= 0xC3F; /* See comment in xive_tctx_tm_write() */
+
     /* TODO: should we switch the TM ops table instead ? */
     if (!gen1_tima_os && offset == HV_PUSH_OS_CTX_OFFSET) {
         xive2_tm_push_os_ctx(xptr, tctx, offset, value, size);
@@ -1681,6 +1683,8 @@  static uint64_t pnv_xive2_tm_read(void *opaque, hwaddr offset, unsigned size)
     bool gen1_tima_os =
         xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
 
+    offset &= 0xC3F; /* See comment in xive_tctx_tm_read() */
+
     /* TODO: should we switch the TM ops table instead ? */
     if (!gen1_tima_os && offset == HV_PULL_OS_CTX_OFFSET) {
         return xive2_tm_pull_os_ctx(xptr, tctx, offset, size);
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index a986b96843..c1abfae31d 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -527,6 +527,15 @@  void xive_tctx_tm_write(XivePresenter *xptr, XiveTCTX *tctx, hwaddr offset,
 
     trace_xive_tctx_tm_write(offset, size, value);
 
+    /*
+     * The TIMA can be accessed through 4 ports/snoop buses, with
+     * addresses 0x80 apart.
+     * However, the offset bits between the "special op" bits and the
+     * MSB of the range used for the TIMA registers are "don't care"
+     * for the hardware, so we filter them out.
+     */
+    offset &= 0xC3F;
+
     /*
      * TODO: check V bit in Q[0-3]W2
      */
@@ -566,6 +575,15 @@  uint64_t xive_tctx_tm_read(XivePresenter *xptr, XiveTCTX *tctx, hwaddr offset,
     const XiveTmOp *xto;
     uint64_t ret;
 
+    /*
+     * The TIMA can be accessed through 4 ports/snoop buses, with
+     * addresses 0x80 apart.
+     * However, the offset bits between the "special op" bits and the
+     * MSB of the range used for the TIMA registers are "don't care"
+     * for the hardware, so we filter them out.
+     */
+    offset &= 0xC3F;
+
     /*
      * TODO: check V bit in Q[0-3]W2
      */