[4/9] mailbox: tegra-hsp: Add support for shared mailboxes

Message ID	20181026111638.10759-5-thierry.reding@gmail.com (mailing list archive)
State	Superseded, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> From: Thierry Reding <thierry.reding@gmail.com> To: Thierry Reding <thierry.reding@gmail.com>, Jassi Brar <jassisinghbrar@gmail.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org> Subject: [PATCH 4/9] mailbox: tegra-hsp: Add support for shared mailboxes Date: Fri, 26 Oct 2018 13:16:33 +0200 Message-Id: <20181026111638.10759-5-thierry.reding@gmail.com> In-Reply-To: <20181026111638.10759-1-thierry.reding@gmail.com> References: <20181026111638.10759-1-thierry.reding@gmail.com> MIME-Version: 1.0 Precedence: list Cc: devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, Mika Liljeberg <mliljeberg@nvidia.com>, Mikko Perttunen <mperttunen@nvidia.com>, Timo Alho <talho@nvidia.com>, linux-serial@vger.kernel.org, Jiri Slaby <jslaby@suse.com>, linux-tegra@vger.kernel.org, Pekka Pessi <ppessi@nvidia.com>, Jon Hunter <jonathanh@nvidia.com>, linux-arm-kernel@lists.infradead.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org
Series	serial: Add Tegra Combined UART driver \| expand [0/9] serial: Add Tegra Combined UART driver [1/9] mailbox: Support blocking transfers in atomic context [2/9] mailbox: Allow multiple controllers per device [3/9] dt-bindings: tegra186-hsp: Add shared interrupts [4/9] mailbox: tegra-hsp: Add support for shared mailboxes [5/9] mailbox: tegra-hsp: Add suspend/resume support [6/9] dt-bindings: serial: Add bindings for nvidia, tegra194-tcu [7/9] serial: Add Tegra Combined UART driver [8/9] arm64: tegra: Add nodes for TCU on Tegra194 [9/9] arm64: tegra: Mark TCU as primary serial port on Tegra194 P2888

Thierry Reding Oct. 26, 2018, 11:16 a.m. UTC

From: Thierry Reding <treding@nvidia.com>

The Tegra HSP block supports 'shared mailboxes' that are simple 32-bit
registers consisting of a FULL bit in MSB position and 31 bits of data.
The hardware can be configured to trigger interrupts when a mailbox
is empty or full. Add support for these shared mailboxes to the HSP
driver.

The initial use for the mailboxes is the Tegra Combined UART. For this
purpose, we use interrupts to receive data, and spinning to wait for
the transmit mailbox to be emptied to minimize unnecessary overhead.

Based on work by Mikko Perttunen <mperttunen@nvidia.com>.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/mailbox/tegra-hsp.c | 476 +++++++++++++++++++++++++++++++-----
 1 file changed, 415 insertions(+), 61 deletions(-)

Pekka Pessi Oct. 29, 2018, 10:04 a.m. UTC | #1

Hi Thierry,

There is typically one entity (aux cpu or a VM running on CCPLEX) owning 
the "empty" or producer side of mailbox (iow, waking up on empty) and 
another entity owning the "full" or consumer side of mailbox (waking up 
on full). An entity should not muck with the interrupts used by the 
opposite side.

One entity typically owns one shared interrupt only.  For the 
BPMP/SCE/RCE/SPE HSP blocks the shared interrupt 0 is owned by the 
auxiliary processor itself, the shared interrupts 1..4 are connected to 
LIC and are available to other entities. The convention is to go through 
the interrupts 0..4 and then using the first available shared interrupt 
for both full and empty.

The interrupt functions should use a mask for mailboxes owned by kernel 
(in essence what the IE register should be for the HSP shared interrupt 
owned by the kernel) and serve only those mailboxes owned by kernel. 
Note that there is no reset for HSP in Xavier, and the IE register 
contents may be stale.

And lastly, if we want to support only Xavier and later, perhaps we 
should be more clear in the bindings? There are no mailbox-specific 
interrupt enable registers available on Parker and your design relies on 
them.

--Pekka

On 10/26/2018 02:16 PM, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> The Tegra HSP block supports 'shared mailboxes' that are simple 32-bit
> registers consisting of a FULL bit in MSB position and 31 bits of data.
> The hardware can be configured to trigger interrupts when a mailbox
> is empty or full. Add support for these shared mailboxes to the HSP
> driver.
>
> The initial use for the mailboxes is the Tegra Combined UART. For this
> purpose, we use interrupts to receive data, and spinning to wait for
> the transmit mailbox to be emptied to minimize unnecessary overhead.
>
> Based on work by Mikko Perttunen <mperttunen@nvidia.com>.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>   drivers/mailbox/tegra-hsp.c | 476 +++++++++++++++++++++++++++++++-----
>   1 file changed, 415 insertions(+), 61 deletions(-)
>
> diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
> index 0cde356c11ab..d070c8e38375 100644
> --- a/drivers/mailbox/tegra-hsp.c
> +++ b/drivers/mailbox/tegra-hsp.c
> @@ -1,5 +1,5 @@
>   /*
> - * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + * Copyright (c) 2016-2018, NVIDIA CORPORATION.  All rights reserved.
>    *
>    * This program is free software; you can redistribute it and/or modify it
>    * under the terms and conditions of the GNU General Public License,
> @@ -11,6 +11,7 @@
>    * more details.
>    */
>   
> +#include <linux/delay.h>
>   #include <linux/interrupt.h>
>   #include <linux/io.h>
>   #include <linux/mailbox_controller.h>
> @@ -21,6 +22,17 @@
>   
>   #include <dt-bindings/mailbox/tegra186-hsp.h>
>   
> +#include "mailbox.h"
> +
> +#define HSP_INT_IE(x)		(0x100 + ((x) * 4))
> +#define HSP_INT_IV		0x300
> +#define HSP_INT_IR		0x304
> +
> +#define HSP_INT_EMPTY_SHIFT	0
> +#define HSP_INT_EMPTY_MASK	0xff
> +#define HSP_INT_FULL_SHIFT	8
> +#define HSP_INT_FULL_MASK	0xff
> +
>   #define HSP_INT_DIMENSIONING	0x380
>   #define HSP_nSM_SHIFT		0
>   #define HSP_nSS_SHIFT		4
> @@ -34,6 +46,11 @@
>   #define HSP_DB_RAW	0x8
>   #define HSP_DB_PENDING	0xc
>   
> +#define HSP_SM_SHRD_MBOX	0x0
> +#define HSP_SM_SHRD_MBOX_FULL	BIT(31)
> +#define HSP_SM_SHRD_MBOX_FULL_INT_IE	0x04
> +#define HSP_SM_SHRD_MBOX_EMPTY_INT_IE	0x08
> +
>   #define HSP_DB_CCPLEX		1
>   #define HSP_DB_BPMP		3
>   #define HSP_DB_MAX		7
> @@ -55,6 +72,12 @@ struct tegra_hsp_doorbell {
>   	unsigned int index;
>   };
>   
> +struct tegra_hsp_mailbox {
> +	struct tegra_hsp_channel channel;
> +	unsigned int index;
> +	bool sending;
> +};
> +
>   struct tegra_hsp_db_map {
>   	const char *name;
>   	unsigned int master;
> @@ -66,10 +89,13 @@ struct tegra_hsp_soc {
>   };
>   
>   struct tegra_hsp {
> +	struct device *dev;
>   	const struct tegra_hsp_soc *soc;
> -	struct mbox_controller mbox;
> +	struct mbox_controller mbox_db;
> +	struct mbox_controller mbox_sm;
>   	void __iomem *regs;
> -	unsigned int irq;
> +	unsigned int doorbell_irq;
> +	unsigned int *shared_irqs;
>   	unsigned int num_sm;
>   	unsigned int num_as;
>   	unsigned int num_ss;
> @@ -78,14 +104,9 @@ struct tegra_hsp {
>   	spinlock_t lock;
>   
>   	struct list_head doorbells;
> +	struct tegra_hsp_mailbox *mailboxes;
>   };
>   
> -static inline struct tegra_hsp *
> -to_tegra_hsp(struct mbox_controller *mbox)
> -{
> -	return container_of(mbox, struct tegra_hsp, mbox);
> -}
> -
>   static inline u32 tegra_hsp_readl(struct tegra_hsp *hsp, unsigned int offset)
>   {
>   	return readl(hsp->regs + offset);
> @@ -158,7 +179,7 @@ static irqreturn_t tegra_hsp_doorbell_irq(int irq, void *data)
>   
>   	spin_lock(&hsp->lock);
>   
> -	for_each_set_bit(master, &value, hsp->mbox.num_chans) {
> +	for_each_set_bit(master, &value, hsp->mbox_db.num_chans) {
>   		struct tegra_hsp_doorbell *db;
>   
>   		db = __tegra_hsp_doorbell_get(hsp, master);
> @@ -182,6 +203,84 @@ static irqreturn_t tegra_hsp_doorbell_irq(int irq, void *data)
>   	return IRQ_HANDLED;
>   }
>   
> +static irqreturn_t tegra_hsp_shared_full_irq(int irq, void *data)
> +{
> +	struct tegra_hsp *hsp = data;
> +	unsigned long bit, mask;
> +	u32 status, value;
> +	void *msg;
> +
> +	status = tegra_hsp_readl(hsp, HSP_INT_IR);
> +
> +	/* only interested in FULL interrupts */
> +	mask = (status >> HSP_INT_FULL_SHIFT) & HSP_INT_FULL_MASK;
> +
> +	if (!mask)
> +		return IRQ_NONE;
> +
> +	for_each_set_bit(bit, &mask, hsp->num_sm) {
> +		struct tegra_hsp_mailbox *mb = &hsp->mailboxes[bit];
> +
> +		if (!mb->sending) {
> +			value = tegra_hsp_channel_readl(&mb->channel,
> +							HSP_SM_SHRD_MBOX);
> +			value &= ~HSP_SM_SHRD_MBOX_FULL;
> +			msg = (void *)(unsigned long)value;
> +			mbox_chan_received_data(mb->channel.chan, msg);
> +
> +			/*
> +			 * Need to clear all bits here since some producers,
> +			 * such as TCU, depend on fields in the register
> +			 * getting cleared by the consumer.
> +			 *
> +			 * The mailbox API doesn't give the consumers a way
> +			 * of doing that explicitly, so we have to make sure
> +			 * we cover all possible cases.
> +			 */
> +			tegra_hsp_channel_writel(&mb->channel, 0x0,
> +						 HSP_SM_SHRD_MBOX);
> +		}
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t tegra_hsp_shared_empty_irq(int irq, void *data)
> +{
> +	struct tegra_hsp *hsp = data;
> +	unsigned long bit, mask;
> +	u32 status, value;
> +
> +	status = tegra_hsp_readl(hsp, HSP_INT_IR);
> +
> +	/* only interested in EMPTY interrupts */
> +	mask = (status >> HSP_INT_EMPTY_SHIFT) & HSP_INT_EMPTY_MASK;
> +
> +	if (!mask)
> +		return IRQ_NONE;
> +
> +	for_each_set_bit(bit, &mask, hsp->num_sm) {
> +		struct tegra_hsp_mailbox *mb = &hsp->mailboxes[bit];
> +
> +		if (mb->sending) {
> +			/*
> +			 * Disable EMPTY interrupts until data is sent
> +			 * with the next message. These interrupts are
> +			 * level-triggered, so if we kept them enabled
> +			 * they would constantly trigger until we next
> +			 * write data into the message.
> +			 */
> +			value = tegra_hsp_readl(hsp, HSP_INT_IE(1));
> +			value &= ~BIT(HSP_INT_EMPTY_SHIFT + mb->index);
> +			tegra_hsp_writel(hsp, value, HSP_INT_IE(1));
> +
> +			mbox_chan_txdone(mb->channel.chan, 0);
> +		}
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
>   static struct tegra_hsp_channel *
>   tegra_hsp_doorbell_create(struct tegra_hsp *hsp, const char *name,
>   			  unsigned int master, unsigned int index)
> @@ -194,7 +293,7 @@ tegra_hsp_doorbell_create(struct tegra_hsp *hsp, const char *name,
>   	if (!db)
>   		return ERR_PTR(-ENOMEM);
>   
> -	offset = (1 + (hsp->num_sm / 2) + hsp->num_ss + hsp->num_as) << 16;
> +	offset = (1 + (hsp->num_sm / 2) + hsp->num_ss + hsp->num_as) * SZ_64K;
>   	offset += index * 0x100;
>   
>   	db->channel.regs = hsp->regs + offset;
> @@ -235,8 +334,8 @@ static int tegra_hsp_doorbell_startup(struct mbox_chan *chan)
>   	unsigned long flags;
>   	u32 value;
>   
> -	if (db->master >= hsp->mbox.num_chans) {
> -		dev_err(hsp->mbox.dev,
> +	if (db->master >= chan->mbox->num_chans) {
> +		dev_err(chan->mbox->dev,
>   			"invalid master ID %u for HSP channel\n",
>   			db->master);
>   		return -EINVAL;
> @@ -281,46 +380,147 @@ static void tegra_hsp_doorbell_shutdown(struct mbox_chan *chan)
>   	spin_unlock_irqrestore(&hsp->lock, flags);
>   }
>   
> -static const struct mbox_chan_ops tegra_hsp_doorbell_ops = {
> +static const struct mbox_chan_ops tegra_hsp_db_ops = {
>   	.send_data = tegra_hsp_doorbell_send_data,
>   	.startup = tegra_hsp_doorbell_startup,
>   	.shutdown = tegra_hsp_doorbell_shutdown,
>   };
>   
> -static struct mbox_chan *of_tegra_hsp_xlate(struct mbox_controller *mbox,
> +static int tegra_hsp_mailbox_send_data(struct mbox_chan *chan, void *data)
> +{
> +	struct tegra_hsp_mailbox *mb = chan->con_priv;
> +	struct tegra_hsp *hsp = mb->channel.hsp;
> +	u32 value;
> +
> +	mb->sending = true;
> +
> +	/* copy data and mark mailbox full */
> +	value = (u32)(unsigned long)data;
> +	value |= HSP_SM_SHRD_MBOX_FULL;
> +
> +	tegra_hsp_channel_writel(&mb->channel, value, HSP_SM_SHRD_MBOX);
> +
> +	if (!irqs_disabled()) {
> +		/* enable EMPTY interrupt for the shared mailbox */
> +		value = tegra_hsp_readl(hsp, HSP_INT_IE(1));
> +		value |= BIT(HSP_INT_EMPTY_SHIFT + mb->index);
> +		tegra_hsp_writel(hsp, value, HSP_INT_IE(1));
> +	}
> +
> +	return 0;
> +}
> +
> +static int tegra_hsp_mailbox_flush(struct mbox_chan *chan,
> +				   unsigned long timeout)
> +{
> +	struct tegra_hsp_mailbox *mb = chan->con_priv;
> +	struct tegra_hsp_channel *ch = &mb->channel;
> +	u32 value;
> +
> +	timeout = jiffies + msecs_to_jiffies(timeout);
> +
> +	while (time_before(jiffies, timeout)) {
> +		value = tegra_hsp_channel_readl(ch, HSP_SM_SHRD_MBOX);
> +		if ((value & HSP_SM_SHRD_MBOX_FULL) == 0) {
> +			mbox_chan_txdone(chan, 0);
> +			return 0;
> +		}
> +
> +		udelay(1);
> +	}
> +
> +	return -ETIME;
> +}
> +
> +static int tegra_hsp_mailbox_startup(struct mbox_chan *chan)
> +{
> +	struct tegra_hsp_mailbox *mb = chan->con_priv;
> +	struct tegra_hsp_channel *ch = &mb->channel;
> +	struct tegra_hsp *hsp = mb->channel.hsp;
> +	u32 value;
> +
> +	chan->txdone_method = TXDONE_BY_IRQ;
> +
> +	/*
> +	 * Shared mailboxes start out as consumers by default. FULL interrupts
> +	 * are coalesced at shared interrupt 0, while EMPTY interrupts will be
> +	 * coalesced at shared interrupt 1.
> +	 *
> +	 * Keep EMPTY interrupts disabled at startup and only enable them when
> +	 * the mailbox is actually full. This is required because the FULL and
> +	 * EMPTY interrupts are level-triggered, so keeping EMPTY interrupts
> +	 * enabled all the time would cause an interrupt storm while mailboxes
> +	 * are idle.
> +	 */
> +
> +	value = tegra_hsp_readl(hsp, HSP_INT_IE(0));
> +	value |= BIT(HSP_INT_FULL_SHIFT + mb->index);
> +	tegra_hsp_writel(hsp, value, HSP_INT_IE(0));
> +
> +	value = tegra_hsp_readl(hsp, HSP_INT_IE(1));
> +	value &= ~BIT(HSP_INT_EMPTY_SHIFT + mb->index);
> +	tegra_hsp_writel(hsp, value, HSP_INT_IE(1));
> +
> +	tegra_hsp_channel_writel(ch, 0x1, HSP_SM_SHRD_MBOX_FULL_INT_IE);
> +	tegra_hsp_channel_writel(ch, 0x1, HSP_SM_SHRD_MBOX_EMPTY_INT_IE);
> +
> +	return 0;
> +}
> +
> +static void tegra_hsp_mailbox_shutdown(struct mbox_chan *chan)
> +{
> +	struct tegra_hsp_mailbox *mb = chan->con_priv;
> +	struct tegra_hsp_channel *ch = &mb->channel;
> +	struct tegra_hsp *hsp = mb->channel.hsp;
> +	u32 value;
> +
> +	tegra_hsp_channel_writel(ch, 0x0, HSP_SM_SHRD_MBOX_EMPTY_INT_IE);
> +	tegra_hsp_channel_writel(ch, 0x0, HSP_SM_SHRD_MBOX_FULL_INT_IE);
> +
> +	value = tegra_hsp_readl(hsp, HSP_INT_IE(1));
> +	value &= ~BIT(HSP_INT_EMPTY_SHIFT + mb->index);
> +	tegra_hsp_writel(hsp, value, HSP_INT_IE(1));
> +
> +	value = tegra_hsp_readl(hsp, HSP_INT_IE(0));
> +	value &= ~BIT(HSP_INT_FULL_SHIFT + mb->index);
> +	tegra_hsp_writel(hsp, value, HSP_INT_IE(0));
> +}
> +
> +static const struct mbox_chan_ops tegra_hsp_sm_ops = {
> +	.send_data = tegra_hsp_mailbox_send_data,
> +	.flush = tegra_hsp_mailbox_flush,
> +	.startup = tegra_hsp_mailbox_startup,
> +	.shutdown = tegra_hsp_mailbox_shutdown,
> +};
> +
> +static struct mbox_chan *tegra_hsp_db_xlate(struct mbox_controller *mbox,
>   					    const struct of_phandle_args *args)
>   {
> +	struct tegra_hsp *hsp = container_of(mbox, struct tegra_hsp, mbox_db);
> +	unsigned int type = args->args[0], master = args->args[1];
>   	struct tegra_hsp_channel *channel = ERR_PTR(-ENODEV);
> -	struct tegra_hsp *hsp = to_tegra_hsp(mbox);
> -	unsigned int type = args->args[0];
> -	unsigned int master = args->args[1];
>   	struct tegra_hsp_doorbell *db;
>   	struct mbox_chan *chan;
>   	unsigned long flags;
>   	unsigned int i;
>   
> -	switch (type) {
> -	case TEGRA_HSP_MBOX_TYPE_DB:
> -		db = tegra_hsp_doorbell_get(hsp, master);
> -		if (db)
> -			channel = &db->channel;
> +	if (type != TEGRA_HSP_MBOX_TYPE_DB || !hsp->doorbell_irq)
> +		return ERR_PTR(-ENODEV);
>   
> -		break;
> -
> -	default:
> -		break;
> -	}
> +	db = tegra_hsp_doorbell_get(hsp, master);
> +	if (db)
> +		channel = &db->channel;
>   
>   	if (IS_ERR(channel))
>   		return ERR_CAST(channel);
>   
>   	spin_lock_irqsave(&hsp->lock, flags);
>   
> -	for (i = 0; i < hsp->mbox.num_chans; i++) {
> -		chan = &hsp->mbox.chans[i];
> +	for (i = 0; i < mbox->num_chans; i++) {
> +		chan = &mbox->chans[i];
>   		if (!chan->con_priv) {
> -			chan->con_priv = channel;
>   			channel->chan = chan;
> +			chan->con_priv = db;
>   			break;
>   		}
>   
> @@ -332,6 +532,19 @@ static struct mbox_chan *of_tegra_hsp_xlate(struct mbox_controller *mbox,
>   	return chan ?: ERR_PTR(-EBUSY);
>   }
>   
> +static struct mbox_chan *tegra_hsp_sm_xlate(struct mbox_controller *mbox,
> +					    const struct of_phandle_args *args)
> +{
> +	struct tegra_hsp *hsp = container_of(mbox, struct tegra_hsp, mbox_sm);
> +	unsigned int type = args->args[0], index = args->args[1];
> +
> +	if (type != TEGRA_HSP_MBOX_TYPE_SM || !hsp->shared_irqs ||
> +	    index >= hsp->num_sm)
> +		return ERR_PTR(-ENODEV);
> +
> +	return hsp->mailboxes[index].channel.chan;
> +}
> +
>   static void tegra_hsp_remove_doorbells(struct tegra_hsp *hsp)
>   {
>   	struct tegra_hsp_doorbell *db, *tmp;
> @@ -364,10 +577,72 @@ static int tegra_hsp_add_doorbells(struct tegra_hsp *hsp)
>   	return 0;
>   }
>   
> +static int tegra_hsp_add_mailboxes(struct tegra_hsp *hsp, struct device *dev)
> +{
> +	int i;
> +
> +	hsp->mailboxes = devm_kcalloc(dev, hsp->num_sm, sizeof(*hsp->mailboxes),
> +				      GFP_KERNEL);
> +	if (!hsp->mailboxes)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < hsp->num_sm; i++) {
> +		struct tegra_hsp_mailbox *mb = &hsp->mailboxes[i];
> +
> +		mb->index = i;
> +		mb->sending = false;
> +
> +		mb->channel.hsp = hsp;
> +		mb->channel.regs = hsp->regs + SZ_64K + i * SZ_32K;
> +		mb->channel.chan = &hsp->mbox_sm.chans[i];
> +		mb->channel.chan->con_priv = mb;
> +	}
> +
> +	return 0;
> +}
> +
> +static int tegra_hsp_request_shared_irqs(struct tegra_hsp *hsp)
> +{
> +	int err;
> +
> +	if (hsp->shared_irqs[0] > 0) {
> +		err = devm_request_irq(hsp->dev, hsp->shared_irqs[0],
> +				       tegra_hsp_shared_full_irq, 0,
> +				       dev_name(hsp->dev), hsp);
> +		if (err < 0) {
> +			dev_err(hsp->dev,
> +				"failed to request full interrupt: %d\n",
> +				err);
> +			return err;
> +		}
> +
> +		dev_dbg(hsp->dev, "full interrupt requested: %u\n",
> +			hsp->shared_irqs[0]);
> +	}
> +
> +	if (hsp->shared_irqs[1] > 0) {
> +		err = devm_request_irq(hsp->dev, hsp->shared_irqs[1],
> +				       tegra_hsp_shared_empty_irq, 0,
> +				       dev_name(hsp->dev), hsp);
> +		if (err < 0) {
> +			dev_err(hsp->dev,
> +				"failed to request empty interrupt: %d\n",
> +				err);
> +			return err;
> +		}
> +
> +		dev_dbg(hsp->dev, "empty interrupt requested: %u\n",
> +			hsp->shared_irqs[1]);
> +	}
> +
> +	return 0;
> +}
> +
>   static int tegra_hsp_probe(struct platform_device *pdev)
>   {
>   	struct tegra_hsp *hsp;
>   	struct resource *res;
> +	unsigned int i;
>   	u32 value;
>   	int err;
>   
> @@ -375,6 +650,7 @@ static int tegra_hsp_probe(struct platform_device *pdev)
>   	if (!hsp)
>   		return -ENOMEM;
>   
> +	hsp->dev = &pdev->dev;
>   	hsp->soc = of_device_get_match_data(&pdev->dev);
>   	INIT_LIST_HEAD(&hsp->doorbells);
>   	spin_lock_init(&hsp->lock);
> @@ -392,58 +668,136 @@ static int tegra_hsp_probe(struct platform_device *pdev)
>   	hsp->num_si = (value >> HSP_nSI_SHIFT) & HSP_nINT_MASK;
>   
>   	err = platform_get_irq_byname(pdev, "doorbell");
> -	if (err < 0) {
> -		dev_err(&pdev->dev, "failed to get doorbell IRQ: %d\n", err);
> -		return err;
> +	if (err >= 0)
> +		hsp->doorbell_irq = err;
> +
> +	if (hsp->num_si > 0) {
> +		unsigned int count = 0;
> +
> +		hsp->shared_irqs = devm_kcalloc(&pdev->dev, hsp->num_si,
> +						sizeof(*hsp->shared_irqs),
> +						GFP_KERNEL);
> +		if (!hsp->shared_irqs)
> +			return -ENOMEM;
> +
> +		for (i = 0; i < hsp->num_si; i++) {
> +			char *name;
> +
> +			name = kasprintf(GFP_KERNEL, "shared%u", i);
> +			if (!name)
> +				return -ENOMEM;
> +
> +			err = platform_get_irq_byname(pdev, name);
> +			if (err >= 0) {
> +				hsp->shared_irqs[i] = err;
> +				count++;
> +			}
> +
> +			kfree(name);
> +		}
> +
> +		if (count == 0) {
> +			devm_kfree(&pdev->dev, hsp->shared_irqs);
> +			hsp->shared_irqs = NULL;
> +		}
>   	}
>   
> -	hsp->irq = err;
> +	/* setup the doorbell controller */
> +	hsp->mbox_db.of_xlate = tegra_hsp_db_xlate;
> +	hsp->mbox_db.num_chans = 32;
> +	hsp->mbox_db.dev = &pdev->dev;
> +	hsp->mbox_db.ops = &tegra_hsp_db_ops;
>   
> -	hsp->mbox.of_xlate = of_tegra_hsp_xlate;
> -	hsp->mbox.num_chans = 32;
> -	hsp->mbox.dev = &pdev->dev;
> -	hsp->mbox.txdone_irq = false;
> -	hsp->mbox.txdone_poll = false;
> -	hsp->mbox.ops = &tegra_hsp_doorbell_ops;
> +	hsp->mbox_db.chans = devm_kcalloc(&pdev->dev, hsp->mbox_db.num_chans,
> +					  sizeof(*hsp->mbox_db.chans),
> +					  GFP_KERNEL);
> +	if (!hsp->mbox_db.chans)
> +		return -ENOMEM;
>   
> -	hsp->mbox.chans = devm_kcalloc(&pdev->dev, hsp->mbox.num_chans,
> -					sizeof(*hsp->mbox.chans),
> -					GFP_KERNEL);
> -	if (!hsp->mbox.chans)
> +	if (hsp->doorbell_irq) {
> +		err = tegra_hsp_add_doorbells(hsp);
> +		if (err < 0) {
> +			dev_err(&pdev->dev, "failed to add doorbells: %d\n",
> +			        err);
> +			return err;
> +		}
> +	}
> +
> +	err = mbox_controller_register(&hsp->mbox_db);
> +	if (err < 0) {
> +		dev_err(&pdev->dev, "failed to register doorbell mailbox: %d\n", err);
> +		goto remove_doorbells;
> +	}
> +
> +	/* setup the shared mailbox controller */
> +	hsp->mbox_sm.of_xlate = tegra_hsp_sm_xlate;
> +	hsp->mbox_sm.num_chans = hsp->num_sm;
> +	hsp->mbox_sm.dev = &pdev->dev;
> +	hsp->mbox_sm.ops = &tegra_hsp_sm_ops;
> +
> +	hsp->mbox_sm.chans = devm_kcalloc(&pdev->dev, hsp->mbox_sm.num_chans,
> +					  sizeof(*hsp->mbox_sm.chans),
> +					  GFP_KERNEL);
> +	if (!hsp->mbox_sm.chans)
>   		return -ENOMEM;
>   
> -	err = tegra_hsp_add_doorbells(hsp);
> +	if (hsp->shared_irqs) {
> +		err = tegra_hsp_add_mailboxes(hsp, &pdev->dev);
> +		if (err < 0) {
> +			dev_err(&pdev->dev, "failed to add mailboxes: %d\n",
> +			        err);
> +			goto unregister_mbox_db;
> +		}
> +	}
> +
> +	err = mbox_controller_register(&hsp->mbox_sm);
>   	if (err < 0) {
> -		dev_err(&pdev->dev, "failed to add doorbells: %d\n", err);
> -		return err;
> +		dev_err(&pdev->dev, "failed to register shared mailbox: %d\n", err);
> +		goto unregister_mbox_db;
>   	}
>   
>   	platform_set_drvdata(pdev, hsp);
>   
> -	err = mbox_controller_register(&hsp->mbox);
> -	if (err) {
> -		dev_err(&pdev->dev, "failed to register mailbox: %d\n", err);
> -		tegra_hsp_remove_doorbells(hsp);
> -		return err;
> +	if (hsp->doorbell_irq) {
> +		err = devm_request_irq(&pdev->dev, hsp->doorbell_irq,
> +				       tegra_hsp_doorbell_irq, IRQF_NO_SUSPEND,
> +				       dev_name(&pdev->dev), hsp);
> +		if (err < 0) {
> +			dev_err(&pdev->dev,
> +			        "failed to request doorbell IRQ#%u: %d\n",
> +				hsp->doorbell_irq, err);
> +			goto unregister_mbox_sm;
> +		}
>   	}
>   
> -	err = devm_request_irq(&pdev->dev, hsp->irq, tegra_hsp_doorbell_irq,
> -			       IRQF_NO_SUSPEND, dev_name(&pdev->dev), hsp);
> -	if (err < 0) {
> -		dev_err(&pdev->dev, "failed to request IRQ#%u: %d\n",
> -			hsp->irq, err);
> -		return err;
> +	if (hsp->shared_irqs) {
> +		err = tegra_hsp_request_shared_irqs(hsp);
> +		if (err < 0)
> +			goto unregister_mbox_sm;
>   	}
>   
>   	return 0;
> +
> +unregister_mbox_sm:
> +	mbox_controller_unregister(&hsp->mbox_sm);
> +unregister_mbox_db:
> +	mbox_controller_unregister(&hsp->mbox_db);
> +remove_doorbells:
> +	if (hsp->doorbell_irq)
> +		tegra_hsp_remove_doorbells(hsp);
> +
> +	return err;
>   }
>   
>   static int tegra_hsp_remove(struct platform_device *pdev)
>   {
>   	struct tegra_hsp *hsp = platform_get_drvdata(pdev);
>   
> -	mbox_controller_unregister(&hsp->mbox);
> -	tegra_hsp_remove_doorbells(hsp);
> +	mbox_controller_unregister(&hsp->mbox_sm);
> +	mbox_controller_unregister(&hsp->mbox_db);
> +
> +	if (hsp->doorbell_irq)
> +		tegra_hsp_remove_doorbells(hsp);
>   
>   	return 0;
>   }

Thierry Reding Oct. 29, 2018, 10:39 a.m. UTC | #2

On Mon, Oct 29, 2018 at 12:04:22PM +0200, Pekka Pessi wrote:
> Hi Thierry,
> 
> There is typically one entity (aux cpu or a VM running on CCPLEX) owning the
> "empty" or producer side of mailbox (iow, waking up on empty) and another
> entity owning the "full" or consumer side of mailbox (waking up on full). An
> entity should not muck with the interrupts used by the opposite side.

Okay, that explains some of my observations. I was initially trying to
program interrupt enables for both FULL and EMPTY interrupts for all
mailboxes, but then I'd usually get timeouts because the consumer wasn't
responding (i.e. the SPE wasn't getting FULL interrupts for the CCPLEX's
TX mailbox).

If I understand correctly, you're saying that the CPU should only be
using the EMPTY interrupts for it's TX mailbox (while leaving the FULL
interrupts completely untouched) and only the FULL interrupt for it's
RX mailbox (while leaving the EMPTY interrupts untouched). This is a bit
of a problem because the mailbox driver doesn't really know anything
about the direction when it starts up, so how would it make the decision
about how to program the registers?

> One entity typically owns one shared interrupt only.  For the
> BPMP/SCE/RCE/SPE HSP blocks the shared interrupt 0 is owned by the auxiliary
> processor itself, the shared interrupts 1..4 are connected to LIC and are
> available to other entities. The convention is to go through the interrupts
> 0..4 and then using the first available shared interrupt for both full and
> empty.

That partially matches another of my observations. It seems like we
can't use the shared interrupt 0 at all on at least the AON HSP. That's
fine because that HSP instance contains the TX mailbox for TCU and by
the current convention in the HSP driver, shared interrupt 0 would be
aggregating the FULL interrupts, which according to the above we don't
need for TX mailboxes.

What's somewhat surprising is that you're saying that both FULL and
EMPTY interrupts should be handled by the same shared interrupt. That's
the opposite of what the recommended programming sequence is that the
TRM specifies. Why is it better to handle both FULL and EMPTY interrupts
with the same shared interrupt?

> The interrupt functions should use a mask for mailboxes owned by kernel (in
> essence what the IE register should be for the HSP shared interrupt owned by
> the kernel) and serve only those mailboxes owned by kernel. Note that there
> is no reset for HSP in Xavier, and the IE register contents may be stale.

Would it be safe to clear all of the IE registers to 0 on driver probe?
I seem to remember trying to do that and getting similar behaviour to
what I describe above, namely that interrupts on the SPE weren't working
anymore. I concluded that the IE register must be shared between the
various processors, even though that's somewhat suprising given that
there is no way to synchronize accesses to those registers, so their
programming would be somewhat up to chance.

Do you know any more about these registers? If they are indeed separate
for each processor, it should be fairly easy to keep track of the
mailboxes used by the kernel and process only those. Again I don't know
how exactly to distinguish between TX and RX mailboxes because they all
start out the same and only their use defines which direction they go.
Currently this works because we program them as consumers by default.
That means we enable the FULL interrupts but keep EMPTY interrupts
disabled until a message in transmitted on the mailbox, at which point
we enable the EMPTY interrupt. I suppose at that point we should also
disable the FULL interrupt, given the above discussion.

> And lastly, if we want to support only Xavier and later, perhaps we should
> be more clear in the bindings? There are no mailbox-specific interrupt
> enable registers available on Parker and your design relies on them.

That was certainly not the intention. I thought I had seen the per-
mailbox interrupt enable registers also in Tegra186 documentation, but
after double-checking they're indeed not there. I don't think the driver
currently "relies" on them because it uses them in addition to the
HSP_IE registers. I suppose that accessing them might cause aborts on
Tegra186 if they don't exist, though.

I'm not entirely clear on what the advantages are of using the per-
mailbox registers, or how they are supposed to be used. The existing
documentation doesn't really explain how these are supposed to be used
either, so I was mostly just going by trial and error.

Do you know anything more on how to use these registers? I can easily
make them Tegra194 specific in the code, but if they're aren't any clear
advantages, it might just be easier to stick with HSP_IE programming
only.

Thierry

Pekka Pessi Oct. 29, 2018, 12:25 p.m. UTC | #3

Hi Thierry,

 From the 0/9:
> Are you aware of any others that we need to take into account?
We would like to use upstream driver for RCE (and probably AON and SCE) 
mailbox handling, too. Eventually.
> This is a bit
> of a problem because the mailbox driver doesn't really know anything
> about the direction when it starts up, so how would it make the decision
> about how to program the registers?
I'm afraid that information must be stored in the device tree.
> What's somewhat surprising is that you're saying that both FULL and
> EMPTY interrupts should be handled by the same shared interrupt. That's
> the opposite of what the recommended programming sequence is that the
> TRM specifies.
Which TRM you mean? Something here?

https://p4viewer.nvidia.com/get///hw/ar/doc/Tech_Pubs/Xavier/TRM/

I browsed through he Xavier_TRM_DP09253001v1.pdf HSP section and I could 
not find any recommendations? But see below.
> Why is it better to handle both FULL and EMPTY interrupts
> with the same shared interrupt?
Only top0 has 8 interrupts, rest of the HSP blocks have only 4 (or 5 in 
case of top1) interrupts per HSP available through LIC. Hogging two of 
them means that only two VMs can access a HSP.
> Would it be safe to clear all of the IE registers to 0 on driver probe?
Nope, the driver should clear only the IE register for the shared 
interrupt that the driver uses. Other IEs are used by other entities.
> If they are indeed separate
> for each processor, it should be fairly easy to keep track of the
> mailboxes used by the kernel and process only those.
Yes, that is what we should do. Again the directionality should be 
specified in DT.
> I'm not entirely clear on what the advantages are of using the per-
> mailbox registers, or how they are supposed to be used. The existing
> documentation doesn't really explain how these are supposed to be used
> either, so I was mostly just going by trial and error.
On virtualized configs, multiple VMs can use same HSP block but each VM 
must use its own interrupt. Because all the IE registers are on the same 
64 KB page, the writes to the page are trapped by hypervisor, which 
means that enabling or disabling an interrupt via the shared IE register 
is pretty heavy operation. The per-mailbox IE registers are on a page 
accessed freely by the IE, no need to trap.

If a VM uses only one empty mailbox interrupt, using a dedicated 
"shared" interrupt and disabling that in GIC could be a lighter 
operation, we used to do that in Parker.

--Pekka

On 10/29/2018 12:39 PM, Thierry Reding wrote:
> On Mon, Oct 29, 2018 at 12:04:22PM +0200, Pekka Pessi wrote:
>> Hi Thierry,
>>
>> There is typically one entity (aux cpu or a VM running on CCPLEX) owning the
>> "empty" or producer side of mailbox (iow, waking up on empty) and another
>> entity owning the "full" or consumer side of mailbox (waking up on full). An
>> entity should not muck with the interrupts used by the opposite side.
> Okay, that explains some of my observations. I was initially trying to
> program interrupt enables for both FULL and EMPTY interrupts for all
> mailboxes, but then I'd usually get timeouts because the consumer wasn't
> responding (i.e. the SPE wasn't getting FULL interrupts for the CCPLEX's
> TX mailbox).
>
> If I understand correctly, you're saying that the CPU should only be
> using the EMPTY interrupts for it's TX mailbox (while leaving the FULL
> interrupts completely untouched) and only the FULL interrupt for it's
> RX mailbox (while leaving the EMPTY interrupts untouched). This is a bit
> of a problem because the mailbox driver doesn't really know anything
> about the direction when it starts up, so how would it make the decision
> about how to program the registers?
>
>> One entity typically owns one shared interrupt only.  For the
>> BPMP/SCE/RCE/SPE HSP blocks the shared interrupt 0 is owned by the auxiliary
>> processor itself, the shared interrupts 1..4 are connected to LIC and are
>> available to other entities. The convention is to go through the interrupts
>> 0..4 and then using the first available shared interrupt for both full and
>> empty.
> That partially matches another of my observations. It seems like we
> can't use the shared interrupt 0 at all on at least the AON HSP. That's
> fine because that HSP instance contains the TX mailbox for TCU and by
> the current convention in the HSP driver, shared interrupt 0 would be
> aggregating the FULL interrupts, which according to the above we don't
> need for TX mailboxes.
>
> What's somewhat surprising is that you're saying that both FULL and
> EMPTY interrupts should be handled by the same shared interrupt. That's
> the opposite of what the recommended programming sequence is that the
> TRM specifies. Why is it better to handle both FULL and EMPTY interrupts
> with the same shared interrupt?
>
>> The interrupt functions should use a mask for mailboxes owned by kernel (in
>> essence what the IE register should be for the HSP shared interrupt owned by
>> the kernel) and serve only those mailboxes owned by kernel. Note that there
>> is no reset for HSP in Xavier, and the IE register contents may be stale.
> Would it be safe to clear all of the IE registers to 0 on driver probe?
> I seem to remember trying to do that and getting similar behaviour to
> what I describe above, namely that interrupts on the SPE weren't working
> anymore. I concluded that the IE register must be shared between the
> various processors, even though that's somewhat suprising given that
> there is no way to synchronize accesses to those registers, so their
> programming would be somewhat up to chance.
>
> Do you know any more about these registers? If they are indeed separate
> for each processor, it should be fairly easy to keep track of the
> mailboxes used by the kernel and process only those. Again I don't know
> how exactly to distinguish between TX and RX mailboxes because they all
> start out the same and only their use defines which direction they go.
> Currently this works because we program them as consumers by default.
> That means we enable the FULL interrupts but keep EMPTY interrupts
> disabled until a message in transmitted on the mailbox, at which point
> we enable the EMPTY interrupt. I suppose at that point we should also
> disable the FULL interrupt, given the above discussion.
>
>> And lastly, if we want to support only Xavier and later, perhaps we should
>> be more clear in the bindings? There are no mailbox-specific interrupt
>> enable registers available on Parker and your design relies on them.
> That was certainly not the intention. I thought I had seen the per-
> mailbox interrupt enable registers also in Tegra186 documentation, but
> after double-checking they're indeed not there. I don't think the driver
> currently "relies" on them because it uses them in addition to the
> HSP_IE registers. I suppose that accessing them might cause aborts on
> Tegra186 if they don't exist, though.
>
> I'm not entirely clear on what the advantages are of using the per-
> mailbox registers, or how they are supposed to be used. The existing
> documentation doesn't really explain how these are supposed to be used
> either, so I was mostly just going by trial and error.
>
> Do you know anything more on how to use these registers? I can easily
> make them Tegra194 specific in the code, but if they're aren't any clear
> advantages, it might just be easier to stick with HSP_IE programming
> only.
>
> Thierry

Thierry Reding Oct. 29, 2018, 1:16 p.m. UTC | #4

On Mon, Oct 29, 2018 at 02:25:42PM +0200, Pekka Pessi wrote:
> Hi Thierry,
> 
> From the 0/9:
> > Are you aware of any others that we need to take into account?
> We would like to use upstream driver for RCE (and probably AON and SCE)
> mailbox handling, too. Eventually.
> > This is a bit
> > of a problem because the mailbox driver doesn't really know anything
> > about the direction when it starts up, so how would it make the decision
> > about how to program the registers?
> I'm afraid that information must be stored in the device tree.
> > What's somewhat surprising is that you're saying that both FULL and
> > EMPTY interrupts should be handled by the same shared interrupt. That's
> > the opposite of what the recommended programming sequence is that the
> > TRM specifies.
> Which TRM you mean? Something here?
> 
> https://p4viewer.nvidia.com/get///hw/ar/doc/Tech_Pubs/Xavier/TRM/
> 
> I browsed through he Xavier_TRM_DP09253001v1.pdf HSP section and I could not
> find any recommendations? But see below.

I'm referring to Xavier_TRM_Public.pdf in that location. If you look at
page 4422, "Shared Interrupts Configuration", the example has aggregated
EMPTY and FULL interrupts on shared interrupts 0 and 1, respectively. I
guess it being an example doesn't make it strictly a recommendation, but
I wonder if we should avoid giving examples that use mappings which we
discourage.

> > Why is it better to handle both FULL and EMPTY interrupts
> > with the same shared interrupt?
> Only top0 has 8 interrupts, rest of the HSP blocks have only 4 (or 5 in case
> of top1) interrupts per HSP available through LIC. Hogging two of them means
> that only two VMs can access a HSP.

Virtualization isn't something that we're very concerned about upstream,
but I'll take that under consideration.

> > Would it be safe to clear all of the IE registers to 0 on driver probe?
> Nope, the driver should clear only the IE register for the shared interrupt
> that the driver uses. Other IEs are used by other entities.

Right, that makes sense. But within that IE register it can consider all
bits fair game, right? One thing I wonder, though, is whether there
should be some external mechanism to set the shared interrupt to use. If
we go purely by convention we'll eventually get this wrong. So currently
we list all available interrupts in DT and then we could pick just the
first. But what if somebody else already picked the first and "owns" it?

I'm not sure we have any practical mechanism to rewrite the DTB, but
perhaps something to keep in mind if ever we need to support other
entities down the road.

> > If they are indeed separate
> > for each processor, it should be fairly easy to keep track of the
> > mailboxes used by the kernel and process only those.
> Yes, that is what we should do. Again the directionality should be specified
> in DT.

Currently we encode the shared mailboxes in DT like this:

	mboxes = <&hsp_top0 TEGRA_HSP_MBOX_TYPE_SM 0>,
		 <&hsp_aon TEGRA_HSP_MBOX_TYPE_SM 1>;
	mbox-names = "rx", "tx";

I suppose we could change that to something like:

	mboxes = <&hsp_top0 TEGRA_HSP_MBOX_TYPE_SM (0 << 31 | 0)>,
		 <&hsp_aon TEGRA_HSP_MBOX_TYPE_SM (1 << 31 | 0)>;

Where the MSB of the mailbox index would indicate the direction. We
could maybe add some eye-candy to make it easier to read:

	mboxes = <&hsp_top0 TEGRA_HSP_MBOX_TYPE_SM TEGRA_HSP_MBOX_SM_RX(0)>,
		 <&hsp_aon TEGRA_HSP_MBOX_TYPE_SM TEGRA_HSP_MBOX_SM_TX(1)>;

That said, I wonder if perhaps it is safe to treat all mailboxes as
consumers by default and omit and direction in DT. If a mailbox is used
as consumer, then the CCPLEX is only interested in FULL interrupts, so
we enable those. The entity that shares the mailbox will write data to
it and is only interested in the EMPTY interrupt, which we don't touch
from the CCPLEX.

For mailboxes that the CCPLEX writes to, we can reconfigure them as
producers by disabling the FULL interrupt and enabling the EMPTY
interrupt instead. We can do that the first time somebody calls the
mbox_send_message() on the mailbox.

Do you see any problems with that approach?

> > I'm not entirely clear on what the advantages are of using the per-
> > mailbox registers, or how they are supposed to be used. The existing
> > documentation doesn't really explain how these are supposed to be used
> > either, so I was mostly just going by trial and error.
> On virtualized configs, multiple VMs can use same HSP block but each VM must
> use its own interrupt. Because all the IE registers are on the same 64 KB
> page, the writes to the page are trapped by hypervisor, which means that
> enabling or disabling an interrupt via the shared IE register is pretty
> heavy operation. The per-mailbox IE registers are on a page accessed freely
> by the IE, no need to trap.

Do the per-mailbox IE registers act as a second level enable, then? The
HSP driver would still have to set the IE register to aggregate FULL and
EMPTY interrupts as needed, but could then use the per-mailbox IE
registers to actually enable and disable (that is, unmask and mask) the
interrupts?

> If a VM uses only one empty mailbox interrupt, using a dedicated "shared"
> interrupt and disabling that in GIC could be a lighter operation, we used to
> do that in Parker.

Okay, that seems like it would somewhat complicate things and we don't
really support VM uses yet, so I think it best to leave that out for
now.

Thierry

> On 10/29/2018 12:39 PM, Thierry Reding wrote:
> > On Mon, Oct 29, 2018 at 12:04:22PM +0200, Pekka Pessi wrote:
> > > Hi Thierry,
> > > 
> > > There is typically one entity (aux cpu or a VM running on CCPLEX) owning the
> > > "empty" or producer side of mailbox (iow, waking up on empty) and another
> > > entity owning the "full" or consumer side of mailbox (waking up on full). An
> > > entity should not muck with the interrupts used by the opposite side.
> > Okay, that explains some of my observations. I was initially trying to
> > program interrupt enables for both FULL and EMPTY interrupts for all
> > mailboxes, but then I'd usually get timeouts because the consumer wasn't
> > responding (i.e. the SPE wasn't getting FULL interrupts for the CCPLEX's
> > TX mailbox).
> > 
> > If I understand correctly, you're saying that the CPU should only be
> > using the EMPTY interrupts for it's TX mailbox (while leaving the FULL
> > interrupts completely untouched) and only the FULL interrupt for it's
> > RX mailbox (while leaving the EMPTY interrupts untouched). This is a bit
> > of a problem because the mailbox driver doesn't really know anything
> > about the direction when it starts up, so how would it make the decision
> > about how to program the registers?
> > 
> > > One entity typically owns one shared interrupt only.  For the
> > > BPMP/SCE/RCE/SPE HSP blocks the shared interrupt 0 is owned by the auxiliary
> > > processor itself, the shared interrupts 1..4 are connected to LIC and are
> > > available to other entities. The convention is to go through the interrupts
> > > 0..4 and then using the first available shared interrupt for both full and
> > > empty.
> > That partially matches another of my observations. It seems like we
> > can't use the shared interrupt 0 at all on at least the AON HSP. That's
> > fine because that HSP instance contains the TX mailbox for TCU and by
> > the current convention in the HSP driver, shared interrupt 0 would be
> > aggregating the FULL interrupts, which according to the above we don't
> > need for TX mailboxes.
> > 
> > What's somewhat surprising is that you're saying that both FULL and
> > EMPTY interrupts should be handled by the same shared interrupt. That's
> > the opposite of what the recommended programming sequence is that the
> > TRM specifies. Why is it better to handle both FULL and EMPTY interrupts
> > with the same shared interrupt?
> > 
> > > The interrupt functions should use a mask for mailboxes owned by kernel (in
> > > essence what the IE register should be for the HSP shared interrupt owned by
> > > the kernel) and serve only those mailboxes owned by kernel. Note that there
> > > is no reset for HSP in Xavier, and the IE register contents may be stale.
> > Would it be safe to clear all of the IE registers to 0 on driver probe?
> > I seem to remember trying to do that and getting similar behaviour to
> > what I describe above, namely that interrupts on the SPE weren't working
> > anymore. I concluded that the IE register must be shared between the
> > various processors, even though that's somewhat suprising given that
> > there is no way to synchronize accesses to those registers, so their
> > programming would be somewhat up to chance.
> > 
> > Do you know any more about these registers? If they are indeed separate
> > for each processor, it should be fairly easy to keep track of the
> > mailboxes used by the kernel and process only those. Again I don't know
> > how exactly to distinguish between TX and RX mailboxes because they all
> > start out the same and only their use defines which direction they go.
> > Currently this works because we program them as consumers by default.
> > That means we enable the FULL interrupts but keep EMPTY interrupts
> > disabled until a message in transmitted on the mailbox, at which point
> > we enable the EMPTY interrupt. I suppose at that point we should also
> > disable the FULL interrupt, given the above discussion.
> > 
> > > And lastly, if we want to support only Xavier and later, perhaps we should
> > > be more clear in the bindings? There are no mailbox-specific interrupt
> > > enable registers available on Parker and your design relies on them.
> > That was certainly not the intention. I thought I had seen the per-
> > mailbox interrupt enable registers also in Tegra186 documentation, but
> > after double-checking they're indeed not there. I don't think the driver
> > currently "relies" on them because it uses them in addition to the
> > HSP_IE registers. I suppose that accessing them might cause aborts on
> > Tegra186 if they don't exist, though.
> > 
> > I'm not entirely clear on what the advantages are of using the per-
> > mailbox registers, or how they are supposed to be used. The existing
> > documentation doesn't really explain how these are supposed to be used
> > either, so I was mostly just going by trial and error.
> > 
> > Do you know anything more on how to use these registers? I can easily
> > make them Tegra194 specific in the code, but if they're aren't any clear
> > advantages, it might just be easier to stick with HSP_IE programming
> > only.
> > 
> > Thierry
>

Pekka Pessi Oct. 30, 2018, 4:15 p.m. UTC | #5

> I guess it being an example doesn't make it strictly a recommendation, but
> I wonder if we should avoid giving examples that use mappings which we
> discourage.

Well, yes, that is example for top0 – but I agree, giving examples that 
do not make much sense is not very productive.

> So currently
> we list all available interrupts in DT and then we could pick just the
> first. But what if somebody else already picked the first and "owns" it

We have rather static hardware allocation in device trees, if an another 
VM instance or another aux cpu owns interrupt, we just remove it from 
the DTS or DTSI.  So the native Linux would see all the possible 
interrupts (that are not used by SPE or RCE, for instance), but a 
virtualized one would see only one.

> For mailboxes that the CCPLEX writes to, we can reconfigure them as
> producers by disabling the FULL interrupt and enabling the EMPTY
> interrupt instead. We can do that the first time somebody calls the
> mbox_send_message() on the mailbox.

The problem here is "disabling the FULL interrupt". The main problem are 
the mailbox-specific IE registers, they are not really shared, but the 
consumer owns the full_ie and producer empty_ie. We can not enable or 
disable interrupts without interfering with the interrupt settings of 
the remote end, when the interrupt is enabled, we either modify 
full_ie/empty_ie register or run with a risk that it has incorrect value.

Can we just leave the mbox channel without interrupts? How the consumer 
indicates that it is interested in receiving messages? mbox_peek_data()?

--Pekka



On 10/29/2018 03:16 PM, Thierry Reding wrote:
> On Mon, Oct 29, 2018 at 02:25:42PM +0200, Pekka Pessi wrote:
>> Hi Thierry,
>>
>>  From the 0/9:
>>> Are you aware of any others that we need to take into account?
>> We would like to use upstream driver for RCE (and probably AON and SCE)
>> mailbox handling, too. Eventually.
>>> This is a bit
>>> of a problem because the mailbox driver doesn't really know anything
>>> about the direction when it starts up, so how would it make the decision
>>> about how to program the registers?
>> I'm afraid that information must be stored in the device tree.
>>> What's somewhat surprising is that you're saying that both FULL and
>>> EMPTY interrupts should be handled by the same shared interrupt. That's
>>> the opposite of what the recommended programming sequence is that the
>>> TRM specifies.
>> Which TRM you mean? Something here?
>>
>> https://p4viewer.nvidia.com/get///hw/ar/doc/Tech_Pubs/Xavier/TRM/
>>
>> I browsed through he Xavier_TRM_DP09253001v1.pdf HSP section and I could not
>> find any recommendations? But see below.
> I'm referring to Xavier_TRM_Public.pdf in that location. If you look at
> page 4422, "Shared Interrupts Configuration", the example has aggregated
> EMPTY and FULL interrupts on shared interrupts 0 and 1, respectively. I
> guess it being an example doesn't make it strictly a recommendation, but
> I wonder if we should avoid giving examples that use mappings which we
> discourage.
>
>>> Why is it better to handle both FULL and EMPTY interrupts
>>> with the same shared interrupt?
>> Only top0 has 8 interrupts, rest of the HSP blocks have only 4 (or 5 in case
>> of top1) interrupts per HSP available through LIC. Hogging two of them means
>> that only two VMs can access a HSP.
> Virtualization isn't something that we're very concerned about upstream,
> but I'll take that under consideration.
>
>>> Would it be safe to clear all of the IE registers to 0 on driver probe?
>> Nope, the driver should clear only the IE register for the shared interrupt
>> that the driver uses. Other IEs are used by other entities.
> Right, that makes sense. But within that IE register it can consider all
> bits fair game, right? One thing I wonder, though, is whether there
> should be some external mechanism to set the shared interrupt to use. If
> we go purely by convention we'll eventually get this wrong. So currently
> we list all available interrupts in DT and then we could pick just the
> first. But what if somebody else already picked the first and "owns" it?
>
> I'm not sure we have any practical mechanism to rewrite the DTB, but
> perhaps something to keep in mind if ever we need to support other
> entities down the road.
>
>>> If they are indeed separate
>>> for each processor, it should be fairly easy to keep track of the
>>> mailboxes used by the kernel and process only those.
>> Yes, that is what we should do. Again the directionality should be specified
>> in DT.
> Currently we encode the shared mailboxes in DT like this:
>
> 	mboxes = <&hsp_top0 TEGRA_HSP_MBOX_TYPE_SM 0>,
> 		 <&hsp_aon TEGRA_HSP_MBOX_TYPE_SM 1>;
> 	mbox-names = "rx", "tx";
>
> I suppose we could change that to something like:
>
> 	mboxes = <&hsp_top0 TEGRA_HSP_MBOX_TYPE_SM (0 << 31 | 0)>,
> 		 <&hsp_aon TEGRA_HSP_MBOX_TYPE_SM (1 << 31 | 0)>;
>
> Where the MSB of the mailbox index would indicate the direction. We
> could maybe add some eye-candy to make it easier to read:
>
> 	mboxes = <&hsp_top0 TEGRA_HSP_MBOX_TYPE_SM TEGRA_HSP_MBOX_SM_RX(0)>,
> 		 <&hsp_aon TEGRA_HSP_MBOX_TYPE_SM TEGRA_HSP_MBOX_SM_TX(1)>;
>
> That said, I wonder if perhaps it is safe to treat all mailboxes as
> consumers by default and omit and direction in DT. If a mailbox is used
> as consumer, then the CCPLEX is only interested in FULL interrupts, so
> we enable those. The entity that shares the mailbox will write data to
> it and is only interested in the EMPTY interrupt, which we don't touch
> from the CCPLEX.
>
> For mailboxes that the CCPLEX writes to, we can reconfigure them as
> producers by disabling the FULL interrupt and enabling the EMPTY
> interrupt instead. We can do that the first time somebody calls the
> mbox_send_message() on the mailbox.
>
> Do you see any problems with that approach?
>
>>> I'm not entirely clear on what the advantages are of using the per-
>>> mailbox registers, or how they are supposed to be used. The existing
>>> documentation doesn't really explain how these are supposed to be used
>>> either, so I was mostly just going by trial and error.
>> On virtualized configs, multiple VMs can use same HSP block but each VM must
>> use its own interrupt. Because all the IE registers are on the same 64 KB
>> page, the writes to the page are trapped by hypervisor, which means that
>> enabling or disabling an interrupt via the shared IE register is pretty
>> heavy operation. The per-mailbox IE registers are on a page accessed freely
>> by the IE, no need to trap.
> Do the per-mailbox IE registers act as a second level enable, then? The
> HSP driver would still have to set the IE register to aggregate FULL and
> EMPTY interrupts as needed, but could then use the per-mailbox IE
> registers to actually enable and disable (that is, unmask and mask) the
> interrupts?
>
>> If a VM uses only one empty mailbox interrupt, using a dedicated "shared"
>> interrupt and disabling that in GIC could be a lighter operation, we used to
>> do that in Parker.
> Okay, that seems like it would somewhat complicate things and we don't
> really support VM uses yet, so I think it best to leave that out for
> now.
>
> Thierry
>
>> On 10/29/2018 12:39 PM, Thierry Reding wrote:
>>> On Mon, Oct 29, 2018 at 12:04:22PM +0200, Pekka Pessi wrote:
>>>> Hi Thierry,
>>>>
>>>> There is typically one entity (aux cpu or a VM running on CCPLEX) owning the
>>>> "empty" or producer side of mailbox (iow, waking up on empty) and another
>>>> entity owning the "full" or consumer side of mailbox (waking up on full). An
>>>> entity should not muck with the interrupts used by the opposite side.
>>> Okay, that explains some of my observations. I was initially trying to
>>> program interrupt enables for both FULL and EMPTY interrupts for all
>>> mailboxes, but then I'd usually get timeouts because the consumer wasn't
>>> responding (i.e. the SPE wasn't getting FULL interrupts for the CCPLEX's
>>> TX mailbox).
>>>
>>> If I understand correctly, you're saying that the CPU should only be
>>> using the EMPTY interrupts for it's TX mailbox (while leaving the FULL
>>> interrupts completely untouched) and only the FULL interrupt for it's
>>> RX mailbox (while leaving the EMPTY interrupts untouched). This is a bit
>>> of a problem because the mailbox driver doesn't really know anything
>>> about the direction when it starts up, so how would it make the decision
>>> about how to program the registers?
>>>
>>>> One entity typically owns one shared interrupt only.  For the
>>>> BPMP/SCE/RCE/SPE HSP blocks the shared interrupt 0 is owned by the auxiliary
>>>> processor itself, the shared interrupts 1..4 are connected to LIC and are
>>>> available to other entities. The convention is to go through the interrupts
>>>> 0..4 and then using the first available shared interrupt for both full and
>>>> empty.
>>> That partially matches another of my observations. It seems like we
>>> can't use the shared interrupt 0 at all on at least the AON HSP. That's
>>> fine because that HSP instance contains the TX mailbox for TCU and by
>>> the current convention in the HSP driver, shared interrupt 0 would be
>>> aggregating the FULL interrupts, which according to the above we don't
>>> need for TX mailboxes.
>>>
>>> What's somewhat surprising is that you're saying that both FULL and
>>> EMPTY interrupts should be handled by the same shared interrupt. That's
>>> the opposite of what the recommended programming sequence is that the
>>> TRM specifies. Why is it better to handle both FULL and EMPTY interrupts
>>> with the same shared interrupt?
>>>
>>>> The interrupt functions should use a mask for mailboxes owned by kernel (in
>>>> essence what the IE register should be for the HSP shared interrupt owned by
>>>> the kernel) and serve only those mailboxes owned by kernel. Note that there
>>>> is no reset for HSP in Xavier, and the IE register contents may be stale.
>>> Would it be safe to clear all of the IE registers to 0 on driver probe?
>>> I seem to remember trying to do that and getting similar behaviour to
>>> what I describe above, namely that interrupts on the SPE weren't working
>>> anymore. I concluded that the IE register must be shared between the
>>> various processors, even though that's somewhat suprising given that
>>> there is no way to synchronize accesses to those registers, so their
>>> programming would be somewhat up to chance.
>>>
>>> Do you know any more about these registers? If they are indeed separate
>>> for each processor, it should be fairly easy to keep track of the
>>> mailboxes used by the kernel and process only those. Again I don't know
>>> how exactly to distinguish between TX and RX mailboxes because they all
>>> start out the same and only their use defines which direction they go.
>>> Currently this works because we program them as consumers by default.
>>> That means we enable the FULL interrupts but keep EMPTY interrupts
>>> disabled until a message in transmitted on the mailbox, at which point
>>> we enable the EMPTY interrupt. I suppose at that point we should also
>>> disable the FULL interrupt, given the above discussion.
>>>
>>>> And lastly, if we want to support only Xavier and later, perhaps we should
>>>> be more clear in the bindings? There are no mailbox-specific interrupt
>>>> enable registers available on Parker and your design relies on them.
>>> That was certainly not the intention. I thought I had seen the per-
>>> mailbox interrupt enable registers also in Tegra186 documentation, but
>>> after double-checking they're indeed not there. I don't think the driver
>>> currently "relies" on them because it uses them in addition to the
>>> HSP_IE registers. I suppose that accessing them might cause aborts on
>>> Tegra186 if they don't exist, though.
>>>
>>> I'm not entirely clear on what the advantages are of using the per-
>>> mailbox registers, or how they are supposed to be used. The existing
>>> documentation doesn't really explain how these are supposed to be used
>>> either, so I was mostly just going by trial and error.
>>>
>>> Do you know anything more on how to use these registers? I can easily
>>> make them Tegra194 specific in the code, but if they're aren't any clear
>>> advantages, it might just be easier to stick with HSP_IE programming
>>> only.
>>>
>>> Thierry

[4/9] mailbox: tegra-hsp: Add support for shared mailboxes

Commit Message

Comments

Patch