diff mbox series

[v13,2/2] PCI: endpoint: pci-epf-vntb: using platform MSI as doorbell

Message ID 20221124055036.1630573-3-Frank.Li@nxp.com (mailing list archive)
State Changes Requested
Delegated to: Lorenzo Pieralisi
Headers show
Series PCI: endpoint: pci-epf-vntb: using platform MSI as doorbell | expand

Commit Message

Frank Li Nov. 24, 2022, 5:50 a.m. UTC
┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
│            │   │                                   │   │                │
│            │   │ PCI Endpoint                      │   │ PCI Host       │
│            │   │                                   │   │                │
│            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
│            │   │                                   │   │                │
│ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
│ Controller │   │   update doorbell register address│   │                │
│            │   │   for BAR                         │   │                │
│            │   │                                   │   │ 3. Write BAR<n>│
│            │◄──┼───────────────────────────────────┼───┤                │
│            │   │                                   │   │                │
│            ├──►│ 4.Irq Handle                      │   │                │
│            │   │                                   │   │                │
│            │   │                                   │   │                │
└────────────┘   └───────────────────────────────────┘   └────────────────┘

Using platform MSI interrupt controller as endpoint(EP)'s doorbell.

The memory assigned for BAR region by the PCI host is mapped to the
message address of platform msi interrupt controller in PCI Endpoint.
Such that, whenever the PCI host writes to the BAR region, it will
trigger an IRQ in the EP.

Basic working follow as
1. EP function driver call platform_msi_domain_alloc_irqs() alloc a
MSI irq from MSI controller with call back function write_msi_msg();
2. write_msg_msg will config BAR and map to address defined in msi_msg;
3. Host side trigger an IRQ at Endpoint by write to BAR region.

Add MSI doorbell support for pci-epf-vntb. Query if system has an MSI
controller. Set up doorbell address according to struct msi_msg.

So PCI host can write this doorbell address to trigger EP side's IRQ.

If no MSI controller exists, fall back to software polling.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 146 +++++++++++++++---
 1 file changed, 125 insertions(+), 21 deletions(-)

Comments

Manivannan Sadhasivam Nov. 24, 2022, 9 a.m. UTC | #1
On Thu, Nov 24, 2022 at 12:50:36AM -0500, Frank Li wrote:
> ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> │            │   │                                   │   │                │
> │            │   │ PCI Endpoint                      │   │ PCI Host       │
> │            │   │                                   │   │                │
> │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> │            │   │                                   │   │                │
> │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> │ Controller │   │   update doorbell register address│   │                │
> │            │   │   for BAR                         │   │                │
> │            │   │                                   │   │ 3. Write BAR<n>│
> │            │◄──┼───────────────────────────────────┼───┤                │
> │            │   │                                   │   │                │
> │            ├──►│ 4.Irq Handle                      │   │                │
> │            │   │                                   │   │                │
> │            │   │                                   │   │                │
> └────────────┘   └───────────────────────────────────┘   └────────────────┘
> 

There are at least a couple of BAR regions used in this patch but they were
not mentioned in the above diagram.

The subject should be:

"PCI: endpoint: pci-epf-vntb: Use EP MSI controller to handle DB from host"

> Using platform MSI interrupt controller as endpoint(EP)'s doorbell.
> 

Above line is not needed.

> The memory assigned for BAR region by the PCI host is mapped to the

Which BAR? (BAR 1 aka. DB BAR)? There are multiple BAR regions exposed by this function driver.

> message address of platform msi interrupt controller in PCI Endpoint.

s/msi/MSI. Also, use either Endpoint or EP, pick one but not both.

> Such that, whenever the PCI host writes to the BAR region, it will
> trigger an IRQ in the EP.
> 
> Basic working follow as

"work flow is"?

> 1. EP function driver call platform_msi_domain_alloc_irqs() alloc a

pci-epf-vntb function driver calls platform_msi_domain_alloc_irqs() to allocate
MSI's from the platform MSI controller.

> MSI irq from MSI controller with call back function write_msi_msg();
> 2. write_msg_msg will config BAR and map to address defined in msi_msg;

The epf_ntb_write_msi_msg() passed as a callback will write the offset of the
MSI controller's MSI address dedicated for each MSI to the doorbell register
db_offset and also writes the MSI data to db_data register in the CTRL BAR
region.

> 3. Host side trigger an IRQ at Endpoint by write to BAR region.
> 

Finally, the host can trigger doorbell by reading the offset of the doorbell
from db_offset register and writing the data read from db_data register in CTRL
BAR region to the computed address in the DB BAR region.
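In code, the host-side sequence is roughly (a sketch only; ctrl_bar,
db_bar and the db_offset_reg()/db_data_reg() lookups are illustrative,
not actual ntb_hw_epf symbols):

	/* read the per-doorbell offset and MSI data from the CTRL BAR */
	u32 offset = readl(ctrl_bar + db_offset_reg(i));
	u32 data = readl(ctrl_bar + db_data_reg(i));

	/* write the MSI data into the DB BAR to ring doorbell i on the EP */
	writel(data, db_bar + offset);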

> Add MSI doorbell support for pci-epf-vntb. Query if system has an MSI
> controller. Set up doorbell address according to struct msi_msg.
> 
> So PCI host can write this doorbell address to trigger EP side's IRQ.
> 
> If no MSI controller exists, fall back to software polling.
> 

"Add doorbell support to pci-epf-vntb function driver making use of the platform
MSI controller. If the MSI controller is not available, fallback to the polling
method."

Also, please move this paragraph to the beginning of the description.

> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 146 +++++++++++++++---
>  1 file changed, 125 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> index 0d744975f815..f770a068e58c 100644
> --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> @@ -44,6 +44,7 @@
>  #include <linux/pci-epc.h>
>  #include <linux/pci-epf.h>
>  #include <linux/ntb.h>
> +#include <linux/msi.h>
>  
>  static struct workqueue_struct *kpcintb_workqueue;
>  
> @@ -137,11 +138,14 @@ struct epf_ntb {
>  	struct epf_ntb_ctrl *reg;
>  
>  	u32 *epf_db;
> +	phys_addr_t epf_db_phys;
>  
>  	phys_addr_t vpci_mw_phy[MAX_MW];
>  	void __iomem *vpci_mw_addr[MAX_MW];
>  
>  	struct delayed_work cmd_handler;
> +
> +	int msi_virqbase;
>  };

You should add kernel doc comments for this struct in a separate patch. It will
help in understanding the driver better.
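For example (the field descriptions below are only illustrative):

/**
 * struct epf_ntb - per-function state of the vNTB endpoint function
 * @epf_db: virtual address of the doorbell memory used in polling mode
 * @epf_db_phys: physical address backing the DB BAR when MSI is used
 * @msi_virqbase: Linux virq number of the first doorbell MSI
 * ...
 */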

>  
>  #define to_epf_ntb(epf_group) container_of((epf_group), struct epf_ntb, group)
> @@ -256,11 +260,13 @@ static void epf_ntb_cmd_handler(struct work_struct *work)
>  
>  	ntb = container_of(work, struct epf_ntb, cmd_handler.work);
>  
> -	for (i = 1; i < ntb->db_count; i++) {
> -		if (ntb->epf_db[i]) {
> -			ntb->db |= 1 << (i - 1);
> -			ntb_db_event(&ntb->ntb, i);
> -			ntb->epf_db[i] = 0;

A comment here stating that polling is implemented would be better.
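Something like:

	/*
	 * Only scan the doorbell memory when polling mode is in use,
	 * i.e. when no MSI based doorbell (ntb->epf_db_phys) is available.
	 */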

> +	if (!ntb->epf_db_phys) {
> +		for (i = 1; i < ntb->db_count; i++) {
> +			if (ntb->epf_db[i]) {
> +				ntb->db |= 1 << (i - 1);
> +				ntb_db_event(&ntb->ntb, i);
> +				ntb->epf_db[i] = 0;
> +			}
>  		}
>  	}
>  
> @@ -518,6 +524,28 @@ static int epf_ntb_configure_interrupt(struct epf_ntb *ntb)
>  	return 0;
>  }
>  
> +static int epf_ntb_db_size(struct epf_ntb *ntb)
> +{
> +	const struct pci_epc_features *epc_features;
> +	size_t size = sizeof(u32) * ntb->db_count;
> +	u32 align;
> +
> +	epc_features = pci_epc_get_features(ntb->epf->epc,
> +					    ntb->epf->func_no,
> +					    ntb->epf->vfunc_no);
> +	align = epc_features->align;
> +
> +	if (size < 128)

Shouldn't this be (size > 128)?

> +		size = 128;
> +
> +	if (align)
> +		size = ALIGN(size, align);
> +	else
> +		size = roundup_pow_of_two(size);
> +
> +	return size;
> +}
> +
>  /**
>   * epf_ntb_db_bar_init() - Configure Doorbell window BARs
>   * @ntb: NTB device that facilitates communication between HOST and VHOST
> @@ -539,27 +567,26 @@ static int epf_ntb_db_bar_init(struct epf_ntb *ntb)
>  					    ntb->epf->func_no,
>  					    ntb->epf->vfunc_no);
>  	align = epc_features->align;
> -
> -	if (size < 128)
> -		size = 128;
> -
> -	if (align)
> -		size = ALIGN(size, align);
> -	else
> -		size = roundup_pow_of_two(size);
> +	size = epf_ntb_db_size(ntb);
>  
>  	barno = ntb->epf_ntb_bar[BAR_DB];
> +	epf_bar = &ntb->epf->bar[barno];
>  
> -	mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
> -	if (!mw_addr) {
> -		dev_err(dev, "Failed to allocate OB address\n");
> -		return -ENOMEM;
> +	if (ntb->epf_db_phys) {
> +		mw_addr = NULL;
> +		epf_bar->phys_addr = ntb->epf_db_phys;
> +		epf_bar->barno = barno;
> +		epf_bar->size = size;
> +	} else {
> +		mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
> +		if (!mw_addr) {
> +			dev_err(dev, "Failed to allocate doorbell address\n");
> +			return -ENOMEM;
> +		}
>  	}
>  
>  	ntb->epf_db = mw_addr;
>  
> -	epf_bar = &ntb->epf->bar[barno];
> -
>  	ret = pci_epc_set_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no, epf_bar);
>  	if (ret) {
>  		dev_err(dev, "Doorbell BAR set failed\n");
> @@ -728,6 +755,82 @@ static int epf_ntb_init_epc_bar(struct epf_ntb *ntb)
>  	return 0;
>  }
>  
> +static irqreturn_t epf_ntb_interrupt_handler(int irq, void *data)

Shouldn't this function also be guarded?

> +{
> +	struct epf_ntb *ntb = data;
> +	int index;
> +
> +	index = irq - ntb->msi_virqbase;
> +	ntb->db |= 1 << (index - 1);
> +	ntb_db_event(&ntb->ntb, index);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN

It'd be better to use the relevant commit description as a comment here.
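For instance, condensed from the commit description:

/*
 * The DB BAR is backed by the message address of the platform MSI
 * controller, so a write from the host to the DB BAR raises an MSI
 * at the endpoint.
 */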

> +static void epf_ntb_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +	struct epf_ntb *ntb = dev_get_drvdata(desc->dev);
> +	struct epf_ntb_ctrl *reg = ntb->reg;
> +	int size = epf_ntb_db_size(ntb);
> +	u64 addr;
> +
> +	addr = msg->address_hi;
> +	addr <<= 32;
> +	addr |= msg->address_lo;
> +
> +	reg->db_data[desc->msi_index] = msg->data;
> +

A comment stating that the base address to be used as the DB BAR is set here
would be useful too.
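e.g.:

	/*
	 * The rounded-down MSI address of the first doorbell becomes the
	 * physical base of the DB BAR; the per-doorbell offsets below are
	 * relative to that base.
	 */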

> +	if (!desc->msi_index)
> +		ntb->epf_db_phys = round_down(addr, size);
> +
> +	reg->db_offset[desc->msi_index] = addr - ntb->epf_db_phys;
> +}
> +
> +static void epf_ntb_epc_msi_init(struct epf_ntb *ntb)
> +{
> +	struct device *dev = &ntb->epf->dev;
> +	struct irq_domain *domain;
> +	int virq;
> +	int ret;
> +	int i;
> +
> +	domain = dev_get_msi_domain(ntb->epf->epc->dev.parent);
> +	if (!domain)
> +		return;
> +
> +	dev_set_msi_domain(dev, domain);
> +
> +	if (platform_msi_domain_alloc_irqs(&ntb->epf->dev,
> +		ntb->db_count,
> +		epf_ntb_write_msi_msg)) {

Please wrap above two in a single line till 100 column limit.
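i.e.:

	if (platform_msi_domain_alloc_irqs(&ntb->epf->dev, ntb->db_count, epf_ntb_write_msi_msg)) {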

> +		dev_err(dev, "Can't allocate MSI, falling back to polling mode\n");

This should be dev_dbg().

> +		return;
> +	}
> +	dev_dbg(dev, "Using MSI as doorbell\n");
> +
> +	for (i = 0; i < ntb->db_count; i++) {
> +		virq = msi_get_virq(dev, i);
> +		ret = devm_request_irq(dev, virq,
> +			       epf_ntb_interrupt_handler, 0,
> +			       "pci_epf_vntb", ntb);
> +
> +		if (ret) {
> +			dev_err(dev, "Failed to request doorbell IRQ! Falling back to polling mode");

Again, dev_dbg()

> +			ntb->epf_db_phys = 0;
> +			platform_msi_domain_free_irqs(&ntb->epf->dev);
> +			break;
> +		}
> +
> +		if (!i)
> +			ntb->msi_virqbase = virq; /* msi start virq number */
> +	}
> +}
> +#else

Since this is not exposed as an API, just end the ifdef here and...

> +static void epf_ntb_epc_msi_init(struct epf_ntb *ntb)
> +{
> +}
> +#endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
>  /**
>   * epf_ntb_epc_init() - Initialize NTB interface
>   * @ntb: NTB device that facilitates communication between HOST and VHOST
> @@ -1336,14 +1439,15 @@ static int epf_ntb_bind(struct pci_epf *epf)
>  		goto err_bar_alloc;
>  	}
>  
> +	epf_set_drvdata(epf, ntb);
> +	epf_ntb_epc_msi_init(ntb);

Guard this function instead:

#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
	epf_ntb_epc_msi_init(ntb);
#endif

Thanks,
Mani

> +
>  	ret = epf_ntb_epc_init(ntb);
>  	if (ret) {
>  		dev_err(dev, "Failed to initialize EPC\n");
>  		goto err_bar_alloc;
>  	}
>  
> -	epf_set_drvdata(epf, ntb);
> -
>  	pci_space[0] = (ntb->vntb_pid << 16) | ntb->vntb_vid;
>  	pci_vntb_table[0].vendor = ntb->vntb_vid;
>  	pci_vntb_table[0].device = ntb->vntb_pid;
> -- 
> 2.34.1
>
Frank Li Nov. 24, 2022, 6:03 p.m. UTC | #2
> On Thu, Nov 24, 2022 at 12:50:36AM -0500, Frank Li wrote:
> > ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> > │            │   │                                   │   │                │
> > │            │   │ PCI Endpoint                      │   │ PCI Host       │
> > │            │   │                                   │   │                │
> > │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> > │            │   │                                   │   │                │
> > │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> > │ Controller │   │   update doorbell register address│   │                │
> > │            │   │   for BAR                         │   │                │
> > │            │   │                                   │   │ 3. Write BAR<n>│
> > │            │◄──┼───────────────────────────────────┼───┤                │
> > │            │   │                                   │   │                │
> > │            ├──►│ 4.Irq Handle                      │   │                │
> > │            │   │                                   │   │                │
> > │            │   │                                   │   │                │
> > └────────────┘   └───────────────────────────────────┘   └────────────────┘
> >
> 
> There are at least a couple of BAR regions used in this patch but they were
> not mentioned in the above diagram.

This patch just affects one BAR region.  Do you like "BAR[DB]"?

Do you want me to draw the other BARs used by this function?

> 
> The subject should be:
> 
> "PCI: endpoint: pci-epf-vntb: Use EP MSI controller to handle DB from host"
> 
> > Using platform MSI interrupt controller as endpoint(EP)'s doorbell.
> >
> 
> Above line is not needed.
> 
> > The memory assigned for BAR region by the PCI host is mapped to the
> 
> Which BAR? (BAR 1 aka. DB BAR)? There are multiple BAR regions exposed by
> this function driver.
> 
> > message address of platform msi interrupt controller in PCI Endpoint.
> 
> s/msi/MSI. Also, use either Endpoint or EP, pick one but not both.
> 
> > Such that, whenever the PCI host writes to the BAR region, it will
> > trigger an IRQ in the EP.
> >
> > Basic working follow as
> 
> "work flow is"?
> 
> > 1. EP function driver call platform_msi_domain_alloc_irqs() alloc a
> 
> pci-epf-vntb function driver calls platform_msi_domain_alloc_irqs() to
> allocate
> MSI's from the platform MSI controller.
> 
> > MSI irq from MSI controller with call back function write_msi_msg();
> > 2. write_msg_msg will config BAR and map to address defined in msi_msg;
> 
> The epf_ntb_write_msi_msg() passed as a callback will write the offset of the
> MSI controller's MSI address dedicated for each MSI to the doorbell register
> db_offset and also writes the MSI data to db_data register in the CTRL BAR
> region.
> 
> > 3. Host side trigger an IRQ at Endpoint by write to BAR region.
> >
> 
> Finally, the host can trigger doorbell by reading the offset of the doorbell
> from db_offset register and writing the data read from db_data register in
> CTRL
> BAR region to the computed address in the DB BAR region.
> 
> > Add MSI doorbell support for pci-epf-vntb. Query if system has an MSI
> > controller. Set up doorbell address according to struct msi_msg.
> >
> > So PCI host can write this doorbell address to trigger EP side's IRQ.
> >
> > If no MSI controller exists, fall back to software polling.
> >
> 
> "Add doorbell support to pci-epf-vntb function driver making use of the
> platform
> MSI controller. If the MSI controller is not available, fallback to the polling
> method."
> 
> Also, please move this paragraph to the beginning of the description.
> 
> > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > ---
> >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 146 +++++++++++++++---
> >  1 file changed, 125 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > index 0d744975f815..f770a068e58c 100644
> > --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > @@ -44,6 +44,7 @@
> >  #include <linux/pci-epc.h>
> >  #include <linux/pci-epf.h>
> >  #include <linux/ntb.h>
> > +#include <linux/msi.h>
> >
> >  static struct workqueue_struct *kpcintb_workqueue;
> >
> > @@ -137,11 +138,14 @@ struct epf_ntb {
> >       struct epf_ntb_ctrl *reg;
> >
> >       u32 *epf_db;
> > +     phys_addr_t epf_db_phys;
> >
> >       phys_addr_t vpci_mw_phy[MAX_MW];
> >       void __iomem *vpci_mw_addr[MAX_MW];
> >
> >       struct delayed_work cmd_handler;
> > +
> > +     int msi_virqbase;
> >  };
> 
> You should add kernel doc comments for this struct in a separate patch. It
> will
> help in understanding the driver better.
> 
> >
> >  #define to_epf_ntb(epf_group) container_of((epf_group), struct epf_ntb,
> group)
> > @@ -256,11 +260,13 @@ static void epf_ntb_cmd_handler(struct
> work_struct *work)
> >
> >       ntb = container_of(work, struct epf_ntb, cmd_handler.work);
> >
> > -     for (i = 1; i < ntb->db_count; i++) {
> > -             if (ntb->epf_db[i]) {
> > -                     ntb->db |= 1 << (i - 1);
> > -                     ntb_db_event(&ntb->ntb, i);
> > -                     ntb->epf_db[i] = 0;
> 
> A comment here stating that polling is implemented would be better.
> 
> > +     if (!ntb->epf_db_phys) {
> > +             for (i = 1; i < ntb->db_count; i++) {
> > +                     if (ntb->epf_db[i]) {
> > +                             ntb->db |= 1 << (i - 1);
> > +                             ntb_db_event(&ntb->ntb, i);
> > +                             ntb->epf_db[i] = 0;
> > +                     }
> >               }
> >       }
> >
> > @@ -518,6 +524,28 @@ static int epf_ntb_configure_interrupt(struct
> epf_ntb *ntb)
> >       return 0;
> >  }
> >
> > +static int epf_ntb_db_size(struct epf_ntb *ntb)
> > +{
> > +     const struct pci_epc_features *epc_features;
> > +     size_t size = sizeof(u32) * ntb->db_count;
> > +     u32 align;
> > +
> > +     epc_features = pci_epc_get_features(ntb->epf->epc,
> > +                                         ntb->epf->func_no,
> > +                                         ntb->epf->vfunc_no);
> > +     align = epc_features->align;
> > +
> > +     if (size < 128)
> 
> Shouldn't this be (size > 128)?
 
This one comes from pci-epf-ntb.c. I am not sure whether some EP
hardware has such a limitation.
 
> 
> > +             size = 128;
> > +
> > +     if (align)
> > +             size = ALIGN(size, align);
> > +     else
> > +             size = roundup_pow_of_two(size);
> > +
> > +     return size;
> > +}
> > +
> >  /**
> >   * epf_ntb_db_bar_init() - Configure Doorbell window BARs
> >   * @ntb: NTB device that facilitates communication between HOST and
> VHOST
> > @@ -539,27 +567,26 @@ static int epf_ntb_db_bar_init(struct epf_ntb
> *ntb)
> >                                           ntb->epf->func_no,
> >                                           ntb->epf->vfunc_no);
> >       align = epc_features->align;
> > -
> > -     if (size < 128)
> > -             size = 128;
> > -
> > -     if (align)
> > -             size = ALIGN(size, align);
> > -     else
> > -             size = roundup_pow_of_two(size);
> > +     size = epf_ntb_db_size(ntb);
> >
> >       barno = ntb->epf_ntb_bar[BAR_DB];
> > +     epf_bar = &ntb->epf->bar[barno];
> >
> > -     mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
> > -     if (!mw_addr) {
> > -             dev_err(dev, "Failed to allocate OB address\n");
> > -             return -ENOMEM;
> > +     if (ntb->epf_db_phys) {
> > +             mw_addr = NULL;
> > +             epf_bar->phys_addr = ntb->epf_db_phys;
> > +             epf_bar->barno = barno;
> > +             epf_bar->size = size;
> > +     } else {
> > +             mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
> > +             if (!mw_addr) {
> > +                     dev_err(dev, "Failed to allocate doorbell address\n");
> > +                     return -ENOMEM;
> > +             }
> >       }
> >
> >       ntb->epf_db = mw_addr;
> >
> > -     epf_bar = &ntb->epf->bar[barno];
> > -
> >       ret = pci_epc_set_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf-
> >vfunc_no, epf_bar);
> >       if (ret) {
> >               dev_err(dev, "Doorbell BAR set failed\n");
> > @@ -728,6 +755,82 @@ static int epf_ntb_init_epc_bar(struct epf_ntb
> *ntb)
> >       return 0;
> >  }
> >
> > +static irqreturn_t epf_ntb_interrupt_handler(int irq, void *data)
> 
> Shouldn't this function also be guarded?
> 
> > +{
> > +     struct epf_ntb *ntb = data;
> > +     int index;
> > +
> > +     index = irq - ntb->msi_virqbase;
> > +     ntb->db |= 1 << (index - 1);
> > +     ntb_db_event(&ntb->ntb, index);
> > +
> > +     return IRQ_HANDLED;
> > +}
> > +
> > +#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
> 
> It'd be better to use the relevant commit description as a comment here.
> 
> > +static void epf_ntb_write_msi_msg(struct msi_desc *desc, struct msi_msg
> *msg)
> > +{
> > +     struct epf_ntb *ntb = dev_get_drvdata(desc->dev);
> > +     struct epf_ntb_ctrl *reg = ntb->reg;
> > +     int size = epf_ntb_db_size(ntb);
> > +     u64 addr;
> > +
> > +     addr = msg->address_hi;
> > +     addr <<= 32;
> > +     addr |= msg->address_lo;
> > +
> > +     reg->db_data[desc->msi_index] = msg->data;
> > +
> 
> A comment stating that the base address to be used as the DB BAR is set here
> would be useful too.
> 
> > +     if (!desc->msi_index)
> > +             ntb->epf_db_phys = round_down(addr, size);
> > +
> > +     reg->db_offset[desc->msi_index] = addr - ntb->epf_db_phys;
> > +}
> > +
> > +static void epf_ntb_epc_msi_init(struct epf_ntb *ntb)
> > +{
> > +     struct device *dev = &ntb->epf->dev;
> > +     struct irq_domain *domain;
> > +     int virq;
> > +     int ret;
> > +     int i;
> > +
> > +     domain = dev_get_msi_domain(ntb->epf->epc->dev.parent);
> > +     if (!domain)
> > +             return;
> > +
> > +     dev_set_msi_domain(dev, domain);
> > +
> > +     if (platform_msi_domain_alloc_irqs(&ntb->epf->dev,
> > +             ntb->db_count,
> > +             epf_ntb_write_msi_msg)) {
> 
> Please wrap above two in a single line till 100 column limit.
> 
> > +             dev_err(dev, "Can't allocate MSI, falling back to polling mode\n");
> 
> This should be dev_dbg().
> 
> > +             return;
> > +     }
> > +     dev_dbg(dev, "Using MSI as doorbell\n");
> > +
> > +     for (i = 0; i < ntb->db_count; i++) {
> > +             virq = msi_get_virq(dev, i);
> > +             ret = devm_request_irq(dev, virq,
> > +                            epf_ntb_interrupt_handler, 0,
> > +                            "pci_epf_vntb", ntb);
> > +
> > +             if (ret) {
> > +                     dev_err(dev, "Failed to request doorbell IRQ! Falling back to
> polling mode");
> 
> Again, dev_dbg()
> 
> > +                     ntb->epf_db_phys = 0;
> > +                     platform_msi_domain_free_irqs(&ntb->epf->dev);
> > +                     break;
> > +             }
> > +
> > +             if (!i)
> > +                     ntb->msi_virqbase = virq; /* msi start virq number */
> > +     }
> > +}
> > +#else
> 
> Since this is not exposed as an API, just end the ifdef here and...
> 
> > +static void epf_ntb_epc_msi_init(struct epf_ntb *ntb)
> > +{
> > +}
> > +#endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
> >  /**
> >   * epf_ntb_epc_init() - Initialize NTB interface
> >   * @ntb: NTB device that facilitates communication between HOST and
> VHOST
> > @@ -1336,14 +1439,15 @@ static int epf_ntb_bind(struct pci_epf *epf)
> >               goto err_bar_alloc;
> >       }
> >
> > +     epf_set_drvdata(epf, ntb);
> > +     epf_ntb_epc_msi_init(ntb);
> 
> Guard this function instead:
> 
> #ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
>         epf_ntb_epc_msi_init(ntb);
> #endif
> 
> Thanks,
> Mani
> 
> > +
> >       ret = epf_ntb_epc_init(ntb);
> >       if (ret) {
> >               dev_err(dev, "Failed to initialize EPC\n");
> >               goto err_bar_alloc;
> >       }
> >
> > -     epf_set_drvdata(epf, ntb);
> > -
> >       pci_space[0] = (ntb->vntb_pid << 16) | ntb->vntb_vid;
> >       pci_vntb_table[0].vendor = ntb->vntb_vid;
> >       pci_vntb_table[0].device = ntb->vntb_pid;
> > --
> > 2.34.1
> >
> 
> --
> மணிவண்ணன் சதாசிவம்
Manivannan Sadhasivam Nov. 24, 2022, 6:44 p.m. UTC | #3
On Thu, Nov 24, 2022 at 06:03:40PM +0000, Frank Li wrote:
> 
> 
> > On Thu, Nov 24, 2022 at 12:50:36AM -0500, Frank Li wrote:
> > > ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> > > │            │   │                                   │   │                │
> > > │            │   │ PCI Endpoint                      │   │ PCI Host       │
> > > │            │   │                                   │   │                │
> > > │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> > > │            │   │                                   │   │                │
> > > │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> > > │ Controller │   │   update doorbell register address│   │                │
> > > │            │   │   for BAR                         │   │                │
> > > │            │   │                                   │   │ 3. Write BAR<n>│
> > > │            │◄──┼───────────────────────────────────┼───┤                │
> > > │            │   │                                   │   │                │
> > > │            ├──►│ 4.Irq Handle                      │   │                │
> > > │            │   │                                   │   │                │
> > > │            │   │                                   │   │                │
> > > └────────────┘   └───────────────────────────────────┘   └────────────────┘
> > >
> > 
> > There are at least a couple of BAR regions used in this patch but they
> > were not mentioned in the above diagram.
> 
> This patch just affects one BAR region.  Do you like "BAR[DB]"?
> 
> Do you want me to draw the other BARs used by this function?
> 

It'd be good to just mention DB BAR.

> > 
> > The subject should be:
> > 
> > "PCI: endpoint: pci-epf-vntb: Use EP MSI controller to handle DB from host"
> > 
> > > Using platform MSI interrupt controller as endpoint(EP)'s doorbell.
> > >
> > 
> > Above line is not needed.
> > 
> > > The memory assigned for BAR region by the PCI host is mapped to the
> > 
> > Which BAR? (BAR 1 aka. DB BAR)? There are multiple BAR regions exposed by
> > this function driver.
> > 
> > > message address of platform msi interrupt controller in PCI Endpoint.
> > 
> > s/msi/MSI. Also, use either Endpoint or EP, pick one but not both.
> > 
> > > Such that, whenever the PCI host writes to the BAR region, it will
> > > trigger an IRQ in the EP.
> > >
> > > Basic working follow as
> > 
> > "work flow is"?
> > 
> > > 1. EP function driver call platform_msi_domain_alloc_irqs() alloc a
> > 
> > pci-epf-vntb function driver calls platform_msi_domain_alloc_irqs() to
> > allocate
> > MSI's from the platform MSI controller.
> > 
> > > MSI irq from MSI controller with call back function write_msi_msg();
> > > 2. write_msg_msg will config BAR and map to address defined in msi_msg;
> > 
> > The epf_ntb_write_msi_msg() passed as a callback will write the offset of the
> > MSI controller's MSI address dedicated for each MSI to the doorbell register
> > db_offset and also writes the MSI data to db_data register in the CTRL BAR
> > region.
> > 
> > > 3. Host side trigger an IRQ at Endpoint by write to BAR region.
> > >
> > 
> > Finally, the host can trigger doorbell by reading the offset of the doorbell
> > from db_offset register and writing the data read from db_data register in
> > CTRL
> > BAR region to the computed address in the DB BAR region.
> > 
> > > Add MSI doorbell support for pci-epf-vntb. Query if system has an MSI
> > > controller. Set up doorbell address according to struct msi_msg.
> > >
> > > So PCI host can write this doorbell address to trigger EP side's IRQ.
> > >
> > > If no MSI controller exists, fall back to software polling.
> > >
> > 
> > "Add doorbell support to pci-epf-vntb function driver making use of the
> > platform
> > MSI controller. If the MSI controller is not available, fallback to the polling
> > method."
> > 
> > Also, please move this paragraph to the beginning of the description.
> > 
> > > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > > ---
> > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 146 +++++++++++++++---
> > >  1 file changed, 125 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > index 0d744975f815..f770a068e58c 100644
> > > --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > @@ -44,6 +44,7 @@
> > >  #include <linux/pci-epc.h>
> > >  #include <linux/pci-epf.h>
> > >  #include <linux/ntb.h>
> > > +#include <linux/msi.h>
> > >
> > >  static struct workqueue_struct *kpcintb_workqueue;
> > >
> > > @@ -137,11 +138,14 @@ struct epf_ntb {
> > >       struct epf_ntb_ctrl *reg;
> > >
> > >       u32 *epf_db;
> > > +     phys_addr_t epf_db_phys;
> > >
> > >       phys_addr_t vpci_mw_phy[MAX_MW];
> > >       void __iomem *vpci_mw_addr[MAX_MW];
> > >
> > >       struct delayed_work cmd_handler;
> > > +
> > > +     int msi_virqbase;
> > >  };
> > 
> > You should add kernel doc comments for this struct in a separate patch. It
> > will
> > help in understanding the driver better.
> > 
> > >
> > >  #define to_epf_ntb(epf_group) container_of((epf_group), struct epf_ntb,
> > group)
> > > @@ -256,11 +260,13 @@ static void epf_ntb_cmd_handler(struct
> > work_struct *work)
> > >
> > >       ntb = container_of(work, struct epf_ntb, cmd_handler.work);
> > >
> > > -     for (i = 1; i < ntb->db_count; i++) {
> > > -             if (ntb->epf_db[i]) {
> > > -                     ntb->db |= 1 << (i - 1);
> > > -                     ntb_db_event(&ntb->ntb, i);
> > > -                     ntb->epf_db[i] = 0;
> > 
> > A comment here stating that polling is implemented would be better.
> > 
> > > +     if (!ntb->epf_db_phys) {
> > > +             for (i = 1; i < ntb->db_count; i++) {
> > > +                     if (ntb->epf_db[i]) {
> > > +                             ntb->db |= 1 << (i - 1);
> > > +                             ntb_db_event(&ntb->ntb, i);
> > > +                             ntb->epf_db[i] = 0;
> > > +                     }
> > >               }
> > >       }
> > >
> > > @@ -518,6 +524,28 @@ static int epf_ntb_configure_interrupt(struct
> > epf_ntb *ntb)
> > >       return 0;
> > >  }
> > >
> > > +static int epf_ntb_db_size(struct epf_ntb *ntb)
> > > +{
> > > +     const struct pci_epc_features *epc_features;
> > > +     size_t size = sizeof(u32) * ntb->db_count;
> > > +     u32 align;
> > > +
> > > +     epc_features = pci_epc_get_features(ntb->epf->epc,
> > > +                                         ntb->epf->func_no,
> > > +                                         ntb->epf->vfunc_no);
> > > +     align = epc_features->align;
> > > +
> > > +     if (size < 128)
> > 
> > Shouldn't this be (size > 128)?
>  
> This one comes from pci-epf-ntb.c. I am not sure whether some EP
> hardware has such a limitation.
>  

I'm not sure if that is correct though. drivers/ntb/hw/epf/ntb_hw_epf.c sets
the upper limit to 32 (NTB_EPF_MAX_DB_COUNT + 1) DBs, so the size
(32 * sizeof(u32) = 128 bytes) cannot go beyond 128.

Thanks,
Mani
Frank Li Nov. 24, 2022, 10:02 p.m. UTC | #4
> On Thu, Nov 24, 2022 at 06:03:40PM +0000, Frank Li wrote:
> >
> >
> > > On Thu, Nov 24, 2022 at 12:50:36AM -0500, Frank Li wrote:
> > > > ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> > > > │            │   │                                   │   │                │
> > > > │            │   │ PCI Endpoint                      │   │ PCI Host       │
> > > > │            │   │                                   │   │                │
> > > > │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> > > > │            │   │                                   │   │                │
> > > > │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> > > > │ Controller │   │   update doorbell register address│   │                │
> > > > │            │   │   for BAR                         │   │                │
> > > > │            │   │                                   │   │ 3. Write BAR<n>│
> > > > │            │◄──┼───────────────────────────────────┼───┤                │
> > > > │            │   │                                   │   │                │
> > > > │            ├──►│ 4.Irq Handle                      │   │                │
> > > > │            │   │                                   │   │                │
> > > > │            │   │                                   │   │                │
> > > > └────────────┘   └───────────────────────────────────┘   └────────────────┘
> > > >
> > >
> > > There are at least a couple of BAR regions used in this patch but they
> > > were not mentioned in the above diagram.
> >
> > This patch just affects one BAR region.  Do you like "BAR[DB]"?
> >
> > Do you want me to draw the other BARs used by this function?
> >
> 
> It'd be good to just mention DB BAR.
> 
> > >
> > > The subject should be:
> > >
> > > "PCI: endpoint: pci-epf-vntb: Use EP MSI controller to handle DB from
> host"
> > >
> > > > Using platform MSI interrupt controller as endpoint(EP)'s doorbell.
> > > >
> > >
> > > Above line is not needed.
> > >
> > > > The memory assigned for BAR region by the PCI host is mapped to the
> > >
> > > Which BAR? (BAR 1 aka. DB BAR)? There are multiple BAR regions exposed
> > > by this function driver.
> > >
> > > > message address of platform msi interrupt controller in PCI Endpoint.
> > >
> > > s/msi/MSI. Also, use either Endpoint or EP, pick one but not both.
> > >
> > > > Such that, whenever the PCI host writes to the BAR region, it will
> > > > trigger an IRQ in the EP.
> > > >
> > > > Basic working follow as
> > >
> > > "work flow is"?
> > >
> > > > 1. EP function driver call platform_msi_domain_alloc_irqs() alloc a
> > >
> > > pci-epf-vntb function driver calls platform_msi_domain_alloc_irqs() to
> > > allocate
> > > MSI's from the platform MSI controller.
> > >
> > > > MSI irq from MSI controller with call back function write_msi_msg();
> > > > 2. write_msg_msg will config BAR and map to address defined in msi_msg;
> > >
> > > The epf_ntb_write_msi_msg() passed as a callback will write the offset
> > > of the MSI controller's MSI address dedicated for each MSI to the
> > > doorbell register db_offset and also writes the MSI data to db_data
> > > register in the CTRL BAR region.
> > >
> > > > 3. Host side trigger an IRQ at Endpoint by write to BAR region.
> > > >
> > >
> > > Finally, the host can trigger doorbell by reading the offset of the
> > > doorbell from db_offset register and writing the data read from db_data
> > > register in CTRL BAR region to the computed address in the DB BAR region.
> > >
> > > > Add MSI doorbell support for pci-epf-vntb. Query if system has an MSI
> > > > controller. Set up doorbell address according to struct msi_msg.
> > > >
> > > > So PCI host can write this doorbell address to trigger EP side's IRQ.
> > > >
> > > > If no MSI controller exists, fall back to software polling.
> > > >
> > >
> > > "Add doorbell support to pci-epf-vntb function driver making use of the
> > > platform
> > > MSI controller. If the MSI controller is not available, fallback to the polling
> > > method."
> > >
> > > Also, please move this paragraph to the beginning of the description.
> > >
> > > > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > > > ---
> > > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 146 +++++++++++++++---
> > > >  1 file changed, 125 insertions(+), 21 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > > index 0d744975f815..f770a068e58c 100644
> > > > --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > > +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > > @@ -44,6 +44,7 @@
> > > >  #include <linux/pci-epc.h>
> > > >  #include <linux/pci-epf.h>
> > > >  #include <linux/ntb.h>
> > > > +#include <linux/msi.h>
> > > >
> > > >  static struct workqueue_struct *kpcintb_workqueue;
> > > >
> > > > @@ -137,11 +138,14 @@ struct epf_ntb {
> > > >       struct epf_ntb_ctrl *reg;
> > > >
> > > >       u32 *epf_db;
> > > > +     phys_addr_t epf_db_phys;
> > > >
> > > >       phys_addr_t vpci_mw_phy[MAX_MW];
> > > >       void __iomem *vpci_mw_addr[MAX_MW];
> > > >
> > > >       struct delayed_work cmd_handler;
> > > > +
> > > > +     int msi_virqbase;
> > > >  };
> > >
> > > You should add kernel doc comments for this struct in a separate patch. It
> > > will
> > > help in understanding the driver better.
> > >
> > > >
> > > >  #define to_epf_ntb(epf_group) container_of((epf_group), struct
> epf_ntb,
> > > group)
> > > > @@ -256,11 +260,13 @@ static void epf_ntb_cmd_handler(struct
> > > work_struct *work)
> > > >
> > > >       ntb = container_of(work, struct epf_ntb, cmd_handler.work);
> > > >
> > > > -     for (i = 1; i < ntb->db_count; i++) {
> > > > -             if (ntb->epf_db[i]) {
> > > > -                     ntb->db |= 1 << (i - 1);
> > > > -                     ntb_db_event(&ntb->ntb, i);
> > > > -                     ntb->epf_db[i] = 0;
> > >
> > > A comment here stating that polling is implemented would be better.
> > >
> > > > +     if (!ntb->epf_db_phys) {
> > > > +             for (i = 1; i < ntb->db_count; i++) {
> > > > +                     if (ntb->epf_db[i]) {
> > > > +                             ntb->db |= 1 << (i - 1);
> > > > +                             ntb_db_event(&ntb->ntb, i);
> > > > +                             ntb->epf_db[i] = 0;
> > > > +                     }
> > > >               }
> > > >       }
> > > >
> > > > @@ -518,6 +524,28 @@ static int epf_ntb_configure_interrupt(struct
> > > epf_ntb *ntb)
> > > >       return 0;
> > > >  }
> > > >
> > > > +static int epf_ntb_db_size(struct epf_ntb *ntb)
> > > > +{
> > > > +     const struct pci_epc_features *epc_features;
> > > > +     size_t size = sizeof(u32) * ntb->db_count;
> > > > +     u32 align;
> > > > +
> > > > +     epc_features = pci_epc_get_features(ntb->epf->epc,
> > > > +                                         ntb->epf->func_no,
> > > > +                                         ntb->epf->vfunc_no);
> > > > +     align = epc_features->align;
> > > > +
> > > > +     if (size < 128)
> > >
> > > Shouldn't this be (size > 128)?
> >
> > This one comes from pci-epf-ntb.c. I am not sure whether some EP
> > hardware has such a limitation.
> >
> 
> I'm not sure if that is correct though. drivers/ntb/hw/epf/ntb_hw_epf.c sets
> the upper limit to 32 (NTB_EPF_MAX_DB_COUNT + 1) DBs, so the size
> (32 * sizeof(u32) = 128 bytes) cannot go beyond 128.

I don't think so.  Please check
drivers/pci/endpoint/functions/pci-epf-ntb.c.

It looks like some EP hardware has a minimum window mapping size
requirement.

I think it is not important for this patch.

> 
> Thanks,
> Mani
> --
> மணிவண்ணன் சதாசிவம்
Thomas Gleixner Nov. 28, 2022, 7:14 p.m. UTC | #5
On Thu, Nov 24 2022 at 00:50, Frank Li wrote:
> ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> │            │   │                                   │   │                │
> │            │   │ PCI Endpoint                      │   │ PCI Host       │
> │            │   │                                   │   │                │
> │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> │            │   │                                   │   │                │
> │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> │ Controller │   │   update doorbell register address│   │                │
> │            │   │   for BAR                         │   │                │
> │            │   │                                   │   │ 3. Write BAR<n>│
> │            │◄──┼───────────────────────────────────┼───┤                │
> │            │   │                                   │   │                │
> │            ├──►│ 4.Irq Handle                      │   │                │
> │            │   │                                   │   │                │
> │            │   │                                   │   │                │
> └────────────┘   └───────────────────────────────────┘   └────────────────┘
>
> Using platform MSI interrupt controller as endpoint(EP)'s doorbell.

Can you please explain what the MSI controller is in this picture? MSI
controller is not a term which is common in the interrupt handling
landscape despite the fact that it's pretty wide spread in device tree
bindings presumably through intensive copy & pasta cargo cult.

> Basic working follow as
> 1. EP function driver call platform_msi_domain_alloc_irqs() alloc a
> MSI irq from MSI controller with call back function write_msi_msg();
> 2. write_msg_msg will config BAR and map to address defined in msi_msg;
> 3. Host side trigger an IRQ at Endpoint by write to BAR region.

You're explaining what the code does, but fail to explain the underlying
mechanisms.

Platform MSI is definitely the wrong mechanism here. Why?

This is about a PCIe endpoint, which is usually handled by a PCI/MSI
interrupt domain. Obviously this usage does not fit into the way how the
"global" PCI/MSI domains work.

There is upcoming work and at least the generic parts should show up in
6.2 which addresses exactly the problem you are trying to solve:

   https://lore.kernel.org/all/20221124225331.464480443@linutronix.de
   https://lore.kernel.org/all/20221124230505.073418677@linutronix.de

plus the proof that the platform MSI mess can be replaced by this:

   https://lore.kernel.org/all/20221121135653.208611233@linutronix.de

NTB in its current form should never have happened, but that's water
under the bridge.

What you really want is:

  1) Convert your platform to the new MSI parent model

  2) Utilize PCI/IMS which is giving you exactly what you need with
     proper PCI semantics

Thanks,

        tglx
Frank Li Nov. 28, 2022, 9:25 p.m. UTC | #6
> On Thu, Nov 24 2022 at 00:50, Frank Li wrote:
> > ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> > │            │   │                                   │   │                │
> > │            │   │ PCI Endpoint                      │   │ PCI Host       │
> > │            │   │                                   │   │                │
> > │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> > │            │   │                                   │   │                │
> > │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> > │ Controller │   │   update doorbell register address│   │                │
> > │            │   │   for BAR                         │   │                │
> > │            │   │                                   │   │ 3. Write BAR<n>│
> > │            │◄──┼───────────────────────────────────┼───┤                │
> > │            │   │                                   │   │                │
> > │            ├──►│ 4.Irq Handle                      │   │                │
> > │            │   │                                   │   │                │
> > │            │   │                                   │   │                │
> > └────────────┘   └───────────────────────────────────┘   └────────────────┘
> >
> > Using platform MSI interrupt controller as endpoint(EP)'s doorbell.
> 
> Can you please explain what the MSI controller is in this picture? MSI
> controller is not a term which is common in the interrupt handling
> landscape despite the fact that it's pretty wide spread in device tree
> bindings presumably through intensive copy & pasta cargo cult.

I used irq-imx-mu-msi to test. I suppose it should work for any general
MSI controller. The DTS part is not upstream yet because the PCI EP
enable patches are still under review:
https://lore.kernel.org/all/1663913220-9523-1-git-send-email-hongxing.zhu@nxp.com/

Our test platform does not have GIC ITS support yet.

> 
> > Basic working follow as
> > 1. EP function driver call platform_msi_domain_alloc_irqs() alloc a
> > MSI irq from MSI controller with call back function write_msi_msg();
> > 2. write_msg_msg will config BAR and map to address defined in msi_msg;
> > 3. Host side trigger an IRQ at Endpoint by write to BAR region.
> 
> You're explaining what the code does, but fail to explain the underlying
> mechanisms.
> 
> Platform MSI is definitely the wrong mechanism here. Why?

This patch uses platform MSI. I never said "Platform MSI is definitely the
wrong mechanism here".

The basic logic is: get the MSI controller's message address via the IRQ
API and map this physical address to the DB BAR, so that when the PCI host
writes to the DB BAR, the EP generates an IRQ.

You can refer to the v14 version, which should have a better description.

> 
> This is about a PCIe endpoint, which is usually handled by a PCI/MSI
> interrupt domain. Obviously this usage does not fit into the way how the
> "global" PCI/MSI domains work.

A PCI endpoint does not have the standard PCI configuration space to
enable/disable the MSI IRQ and set the MSI address (CAP 05). That is the
reason why "platform MSI", or what you now call "global", is needed.

> 
> There is upcoming work and at least the generic parts should show up in
> 6.2 which addresses exactly the problem you are trying to solve:
> 
> 
> https://lore.kernel.org/all/20221124225331.464480443@linutronix.de
> 
> https://lore.kernel.org/all/20221124230505.073418677@linutronix.de
> 
> plus the proof that the platform MSI mess can be replaced by this:
> 
> 
> https://lore.kernel.org/all/20221121135653.208611233@linutronix.de
> 
> NTB in it's current form should never have happened, but that's water
> down the bridge.
> 
> What you really want is:
> 
>   1) Convert your platform to the new MSI parent model
> 
>   2) Utilize PCI/IMS which is giving you exactly what you need with
>      proper PCI semantics

Sorry, I still don't understand yet. This patch is just a user of the MSI
controller. Your patches focus on the MSI controller itself.

The platform_msi_domain_alloc_irqs() interface still exists in your
devmsi-v2-part3.


> 
> Thanks,
> 
>         tglx
Thomas Gleixner Nov. 28, 2022, 10:39 p.m. UTC | #7
Frank!

On Mon, Nov 28 2022 at 21:25, Frank Li wrote:

Can you please fix your mail client to not copy the whole CC list into
the reply? It's just pointless noise. A simple:

On Mon, Nov 28 2022 at 1:15 PM Thomas Gleixner wrote:

instead of:

>> -----Original Message-----
>> From: Thomas Gleixner <tglx@linutronix.de>
>> Sent: Monday, November 28, 2022 1:15 PM
>> To: Frank Li <frank.li@nxp.com>; lpieralisi@kernel.org
>> Cc: Frank Li <frank.li@nxp.com>; Aisheng Dong <aisheng.dong@nxp.com>;
>> bhelgaas@google.com; devicetree@vger.kernel.org; festevam@gmail.com;
>> imx@lists.linux.dev; jdmason@kudzu.us; kernel@pengutronix.de;
>> kishon@ti.com; krzysztof.kozlowski+dt@linaro.org; kw@linux.com; linux-
>> arm-kernel@lists.infradead.org; dl-linux-imx <linux-imx@nxp.com>; linux-
>> kernel@vger.kernel.org; linux-pci@vger.kernel.org;
>> lorenzo.pieralisi@arm.com; lznuaa@gmail.com;
>> manivannan.sadhasivam@linaro.org; maz@kernel.org; ntb@lists.linux.dev;
>> Peng Fan <peng.fan@nxp.com>; robh+dt@kernel.org;
>> s.hauer@pengutronix.de; shawnguo@kernel.org
>> Subject: [EXT] Re: [PATCH v13 2/2] PCI: endpoint: pci-epf-vntb: using platform
>> MSI as doorbell

is completely sufficient.

>> Caution: EXT Email

We are neither interested in the oddities of NXP's mail categorization system.

>> On Thu, Nov 24 2022 at 00:50, Frank Li wrote:
>> >
>> > Using platform MSI interrupt controller as endpoint(EP)'s doorbell.
>> 
>> Can you please explain what the MSI controller is in this picture? MSI
>> controller is not a term which is common in the interrupt handling
>> landscape despite the fact that it's pretty wide spread in device tree
>> bindings presumably through intensive copy & pasta cargo cult.
>
> I used irq-imx-mu-msi to test. I suppose it should work for any general
> MSI controller.

Sure it works by some definition of "works" because obviously that
implementation does not care about where a particular message originates
from.

But that's still wrong at the conceptual level because it very much
matters where a message originates from. PCIe devices and general
platform devices have very distinct mechanisms to transport that
information.

Just because it "works" does not prove that it is correct.

How are you going to do proper shielding with that approach?

> Our test platform does not have GIC ITS support yet.

And therefore the originating device is irrelevant, right? Get to the
point where you have ITS and it all falls apart.

>> You're explaining what the code does, but fail to explain the underlying
>> mechanisms.
>> 
>> Platform MSI is definitely the wrong mechanism here. Why?
>
> This patch uses platform MSI. I never said "Platform MSI is definitely
> the wrong mechanism here".

I did not claim that _you_ said that. _I_ said that this is wrong. See
above.

> The basic logic is: get the MSI controller's message address via the IRQ
> API and map this physical address to the DB BAR, so that when the PCI
> host writes to the DB BAR, the EP generates an IRQ.

Again, you are explaining what your implementation is doing, but you are
not describing the conceptual level.

>> This is about a PCIe endpoint, which is usually handled by a PCI/MSI
>> interrupt domain. Obviously this usage does not fit into the way the
>> "global" PCI/MSI domains work.
>
> A PCI endpoint does not have the standard PCI config space (CAP 05) to
> enable/disable MSI and program the MSI address.

I'm very well aware of the fact that a PCI endpoint does not have the
standard MSI configuration space mechanism.

>  That's the reason why "platform MSI", or what you call "global", is needed.

Again: platform MSI does not convey the PCIe device originator. It might
be irrelevant for your actual platform, but that does not make it more
correct. Once you have the need to track the origin of an MSI message
then the distinction between platform MSI and PCI/MSI matters.

Sure, you can argue that you don't care, but that makes it neither
correct nor future proof, and there is no point in reworking this whole
thing 6 months down the road when you actually have to support GIC-ITS.

>> There is upcoming work and at least the generic parts should show up in
>> 6.2 which addresses exactly the problem you are trying to solve:
>> 
>> https://lore.kernel.org/all/20221124225331.464480443@linutronix.de
>> 
>> https://lore.kernel.org/all/20221124230505.073418677@linutronix.de
>> 
>> plus the proof that the platform MSI mess can be replaced by this:
>> 
>> https://lore.kernel.org/all/20221121135653.208611233@linutronix.de

Those outlook artifacts are helpful to our conversation in which way?
 
>> NTB in its current form should never have happened, but that's water
>> under the bridge.
>> 
>> What you really want is:
>> 
>>   1) Convert your platform to the new MSI parent model
>> 
>>   2) Utilize PCI/IMS which is giving you exactly what you need with
>>      proper PCI semantics
>
> Sorry, I still don't understand yet. This patch is just a user of the
> MSI controller.

As I explained to you before: The concept of MSI controller does not
exist in the kernel. It might exist in your NXP nomenclature, but that's
irrelevant here.

> Your patches focus on the MSI controller itself.

No. They focus on changing the hierarchy model from "global" MSI domains
to per device MSI domains so that the weird constructs of platform MSI
can be replaced by something which actually matches the hardware and
provides a proper abstraction for PCIe/NTB at the right level.
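
Under the per-device model the driver creates its own domain on top of a
converted parent, roughly like this (a sketch against the interfaces
merged for 6.2; the irq_chip callback is left as a stub):

#include <linux/msi.h>

static void db_write_msi_msg(struct irq_data *d, struct msi_msg *msg)
{
	/* Stub: store msg->address_{hi,lo} and msg->data wherever the
	 * device needs them -- for vntb, the DB BAR mapping. */
}

static const struct msi_domain_template db_template = {
	.chip = {
		.name			= "epf-db",
		.irq_write_msi_msg	= db_write_msi_msg,
	},
};

static int db_domain_init(struct device *dev, unsigned int db_count)
{
	/* Create an MSI domain owned by this device, with the platform's
	 * MSI parent (e.g. the ITS) underneath. */
	if (!msi_create_device_irq_domain(dev, MSI_DEFAULT_DOMAIN,
					  &db_template, db_count,
					  NULL, NULL))
		return -ENODEV;

	/* Allocate the doorbell interrupts from the device's own domain. */
	return msi_domain_alloc_irqs_range(dev, MSI_DEFAULT_DOMAIN,
					   0, db_count - 1);
}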

> The interface platform_msi_domain_alloc_irqs() still exists in your devmsi-v2-part3.

Sure, but at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git devmsi-v2-arm

platform_msi_domain_alloc_irqs() does not exist anymore.

You replied to exactly that series which builds on top of devmsi-v2-part3, no?

So what are you trying to tell me?

Thanks,

        tglx
Frank Li Jan. 9, 2023, 5:51 p.m. UTC | #8
> On Thu, Nov 24 2022 at 00:50, Frank Li wrote:
> > ┌────────────┐   ┌───────────────────────────────────┐   ┌────────────────┐
> > │            │   │                                   │   │                │
> > │            │   │ PCI Endpoint                      │   │ PCI Host       │
> > │            │   │                                   │   │                │
> > │            │◄──┤ 1.platform_msi_domain_alloc_irqs()│   │                │
> > │            │   │                                   │   │                │
> > │ MSI        ├──►│ 2.write_msi_msg()                 ├──►├─BAR<n>         │
> > │ Controller │   │   update doorbell register address│   │                │
> > │            │   │   for BAR                         │   │                │
> > │            │   │                                   │   │ 3. Write BAR<n>│
> > │            │◄──┼───────────────────────────────────┼───┤                │
> > │            │   │                                   │   │                │
> > │            ├──►│ 4.Irq Handle                      │   │                │
> > │            │   │                                   │   │                │
> > │            │   │                                   │   │                │
> > └────────────┘   └───────────────────────────────────┘   └────────────────┘
> >
> > Using a platform MSI interrupt controller as the endpoint (EP)'s doorbell.
> 
> Can you please explain what the MSI controller is in this picture? MSI
> controller is not a term which is common in the interrupt handling
> landscape despite the fact that it's pretty widespread in device tree
> bindings presumably through intensive copy & pasta cargo cult.
> 
> > The basic working flow is:
> > 1. The EP function driver calls platform_msi_domain_alloc_irqs() to allocate
> > MSI IRQs from the MSI controller, with the callback function write_msi_msg();
> > 2. write_msi_msg() configures the BAR to map to the address given in msi_msg;
> > 3. The host side triggers an IRQ at the endpoint by writing to the BAR region.
> 
> You're explaining what the code does, but fail to explain the underlying
> mechanisms.
> 
> Platform MSI is definitely the wrong mechanism here. Why?
> 
> This is about a PCIe endpoint, which is usually handled by a PCI/MSI
> interrupt domain. Obviously this usage does not fit into the way the
> "global" PCI/MSI domains work.
> 
> There is upcoming work and at least the generic parts should show up in
> 6.2 which addresses exactly the problem you are trying to solve:
>
@tglx

I finally got a platform, LS1028, which uses the GIC ITS and also supports PCIe EP functions.
I ported my code to linux-next (next-20230105) and now I understand your comments.

The ITS needs a device ID to allocate MSI IRQs for each device.

The EPF device is created dynamically; its epc field's epc.parent points to the PCIe EP
controller, which has been probed from the device tree.

I use a workaround to get the MSI IRQ numbers:
	dev->of_node = ntb->epf->epc->dev.parent->of_node;
	platform_msi_domain_alloc_irqs(dev, ...);

irq-gic-v3-its-pci-msi uses the of_node to get an ID.

You said PCI/IMS can resolve this problem, but I still haven't figured it out yet.
Can you give me sample code that uses PCI/IMS, or provide pseudo code?
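
From reading the series, my guess is something roughly like the following
(untested; the epf_db_* names are made up, and where an EPF would get a
struct pci_dev from is exactly the part I don't see):

#include <linux/pci.h>
#include <linux/msi.h>

static void epf_db_write_msi_msg(struct irq_data *d, struct msi_msg *msg)
{
	/* Program the device-managed message store: for vntb this would
	 * be the db_data/db_offset entries behind the DB BAR. */
}

static const struct msi_domain_template epf_db_ims_template = {
	.chip = {
		.name			= "epf-db-ims",
		.irq_write_msi_msg	= epf_db_write_msi_msg,
	},
	.info = {
		.bus_token		= DOMAIN_BUS_PCI_DEVICE_IMS,
	},
};

static int epf_db_ims_init(struct pci_dev *pdev, unsigned int db_count)
{
	union msi_instance_cookie icookie = { };
	struct msi_map map;
	unsigned int i;

	if (!pci_create_ims_domain(pdev, &epf_db_ims_template,
				   db_count, NULL))
		return -ENODEV;

	for (i = 0; i < db_count; i++) {
		icookie.value = i;	/* doorbell index */
		map = pci_ims_alloc_irq(pdev, &icookie, NULL);
		if (map.index < 0)
			return map.index;
		/* request_irq(map.virq, ...) per doorbell */
	}
	return 0;
}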

Frank 

> 
> https://lore.kernel.org/all/20221124225331.464480443@linutronix.de
> 
> https://lore.kernel.org/all/20221124230505.073418677@linutronix.de
> 
> plus the proof that the platform MSI mess can be replaced by this:
> 
> https://lore.kernel.org/all/20221121135653.208611233@linutronix.de
> 
> NTB in its current form should never have happened, but that's water
> under the bridge.
> 
> What you really want is:
> 
>   1) Convert your platform to the new MSI parent model
> 
>   2) Utilize PCI/IMS which is giving you exactly what you need with
>      proper PCI semantics
> 
> Thanks,
> 
>         tglx
diff mbox series

Patch

diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
index 0d744975f815..f770a068e58c 100644
--- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -44,6 +44,7 @@ 
 #include <linux/pci-epc.h>
 #include <linux/pci-epf.h>
 #include <linux/ntb.h>
+#include <linux/msi.h>
 
 static struct workqueue_struct *kpcintb_workqueue;
 
@@ -137,11 +138,14 @@  struct epf_ntb {
 	struct epf_ntb_ctrl *reg;
 
 	u32 *epf_db;
+	phys_addr_t epf_db_phys;
 
 	phys_addr_t vpci_mw_phy[MAX_MW];
 	void __iomem *vpci_mw_addr[MAX_MW];
 
 	struct delayed_work cmd_handler;
+
+	int msi_virqbase;
 };
 
 #define to_epf_ntb(epf_group) container_of((epf_group), struct epf_ntb, group)
@@ -256,11 +260,13 @@  static void epf_ntb_cmd_handler(struct work_struct *work)
 
 	ntb = container_of(work, struct epf_ntb, cmd_handler.work);
 
-	for (i = 1; i < ntb->db_count; i++) {
-		if (ntb->epf_db[i]) {
-			ntb->db |= 1 << (i - 1);
-			ntb_db_event(&ntb->ntb, i);
-			ntb->epf_db[i] = 0;
+	if (!ntb->epf_db_phys) {
+		for (i = 1; i < ntb->db_count; i++) {
+			if (ntb->epf_db[i]) {
+				ntb->db |= 1 << (i - 1);
+				ntb_db_event(&ntb->ntb, i);
+				ntb->epf_db[i] = 0;
+			}
 		}
 	}
 
@@ -518,6 +524,28 @@  static int epf_ntb_configure_interrupt(struct epf_ntb *ntb)
 	return 0;
 }
 
+static int epf_ntb_db_size(struct epf_ntb *ntb)
+{
+	const struct pci_epc_features *epc_features;
+	size_t size = sizeof(u32) * ntb->db_count;
+	u32 align;
+
+	epc_features = pci_epc_get_features(ntb->epf->epc,
+					    ntb->epf->func_no,
+					    ntb->epf->vfunc_no);
+	align = epc_features->align;
+
+	if (size < 128)
+		size = 128;
+
+	if (align)
+		size = ALIGN(size, align);
+	else
+		size = roundup_pow_of_two(size);
+
+	return size;
+}
+
 /**
  * epf_ntb_db_bar_init() - Configure Doorbell window BARs
  * @ntb: NTB device that facilitates communication between HOST and VHOST
@@ -539,27 +567,26 @@  static int epf_ntb_db_bar_init(struct epf_ntb *ntb)
 					    ntb->epf->func_no,
 					    ntb->epf->vfunc_no);
 	align = epc_features->align;
-
-	if (size < 128)
-		size = 128;
-
-	if (align)
-		size = ALIGN(size, align);
-	else
-		size = roundup_pow_of_two(size);
+	size = epf_ntb_db_size(ntb);
 
 	barno = ntb->epf_ntb_bar[BAR_DB];
+	epf_bar = &ntb->epf->bar[barno];
 
-	mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
-	if (!mw_addr) {
-		dev_err(dev, "Failed to allocate OB address\n");
-		return -ENOMEM;
+	if (ntb->epf_db_phys) {
+		mw_addr = NULL;
+		epf_bar->phys_addr = ntb->epf_db_phys;
+		epf_bar->barno = barno;
+		epf_bar->size = size;
+	} else {
+		mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
+		if (!mw_addr) {
+			dev_err(dev, "Failed to allocate doorbell address\n");
+			return -ENOMEM;
+		}
 	}
 
 	ntb->epf_db = mw_addr;
 
-	epf_bar = &ntb->epf->bar[barno];
-
 	ret = pci_epc_set_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no, epf_bar);
 	if (ret) {
 		dev_err(dev, "Doorbell BAR set failed\n");
@@ -728,6 +755,82 @@  static int epf_ntb_init_epc_bar(struct epf_ntb *ntb)
 	return 0;
 }
 
+static irqreturn_t epf_ntb_interrupt_handler(int irq, void *data)
+{
+	struct epf_ntb *ntb = data;
+	int index;
+
+	index = irq - ntb->msi_virqbase;
+	ntb->db |= 1 << (index - 1);
+	ntb_db_event(&ntb->ntb, index);
+
+	return IRQ_HANDLED;
+}
+
+#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
+static void epf_ntb_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+	struct epf_ntb *ntb = dev_get_drvdata(desc->dev);
+	struct epf_ntb_ctrl *reg = ntb->reg;
+	int size = epf_ntb_db_size(ntb);
+	u64 addr;
+
+	addr = msg->address_hi;
+	addr <<= 32;
+	addr |= msg->address_lo;
+
+	reg->db_data[desc->msi_index] = msg->data;
+
+	if (!desc->msi_index)
+		ntb->epf_db_phys = round_down(addr, size);
+
+	reg->db_offset[desc->msi_index] = addr - ntb->epf_db_phys;
+}
+
+static void epf_ntb_epc_msi_init(struct epf_ntb *ntb)
+{
+	struct device *dev = &ntb->epf->dev;
+	struct irq_domain *domain;
+	int virq;
+	int ret;
+	int i;
+
+	domain = dev_get_msi_domain(ntb->epf->epc->dev.parent);
+	if (!domain)
+		return;
+
+	dev_set_msi_domain(dev, domain);
+
+	if (platform_msi_domain_alloc_irqs(&ntb->epf->dev,
+		ntb->db_count,
+		epf_ntb_write_msi_msg)) {
+		dev_err(dev, "Can't allocate MSI, falling back to polling mode\n");
+		return;
+	}
+	dev_dbg(dev, "Using MSI as doorbell\n");
+
+	for (i = 0; i < ntb->db_count; i++) {
+		virq = msi_get_virq(dev, i);
+		ret = devm_request_irq(dev, virq,
+			       epf_ntb_interrupt_handler, 0,
+			       "pci_epf_vntb", ntb);
+
+		if (ret) {
+			dev_err(dev, "Failed to request doorbell IRQ! Falling back to polling mode");
+			ntb->epf_db_phys = 0;
+			platform_msi_domain_free_irqs(&ntb->epf->dev);
+			break;
+		}
+
+		if (!i)
+			ntb->msi_virqbase = virq; /* msi start virq number */
+	}
+}
+#else
+static void epf_ntb_epc_msi_init(struct epf_ntb *ntb)
+{
+}
+#endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
 /**
  * epf_ntb_epc_init() - Initialize NTB interface
  * @ntb: NTB device that facilitates communication between HOST and VHOST
@@ -1336,14 +1439,15 @@  static int epf_ntb_bind(struct pci_epf *epf)
 		goto err_bar_alloc;
 	}
 
+	epf_set_drvdata(epf, ntb);
+	epf_ntb_epc_msi_init(ntb);
+
 	ret = epf_ntb_epc_init(ntb);
 	if (ret) {
 		dev_err(dev, "Failed to initialize EPC\n");
 		goto err_bar_alloc;
 	}
 
-	epf_set_drvdata(epf, ntb);
-
 	pci_space[0] = (ntb->vntb_pid << 16) | ntb->vntb_vid;
 	pci_vntb_table[0].vendor = ntb->vntb_vid;
 	pci_vntb_table[0].device = ntb->vntb_pid;