[RFC,0/8] Qualcomm Cloud AI 100 driver

Message ID 1589465266-20056-1-git-send-email-jhugo@codeaurora.org (mailing list archive)

Message

Jeffrey Hugo May 14, 2020, 2:07 p.m. UTC
Introduction:
Qualcomm Cloud AI 100 is a PCIe adapter card which contains a dedicated
SoC ASIC for the purpose of efficiently running Deep Learning inference
workloads in a data center environment.

The official press release can be found at -
https://www.qualcomm.com/news/releases/2019/04/09/qualcomm-brings-power-efficient-artificial-intelligence-inference

The official product website is -
https://www.qualcomm.com/products/datacenter-artificial-intelligence

At the time of the official press release, numerous technology news sites
also covered the product.  Searching your favorite site is likely to turn
up its coverage.

It is our goal to have the kernel driver for the product fully upstream.
The purpose of this RFC is to start that process.  We are still doing
development (see below), and thus are not looking to gain acceptance quite
yet, but now that we have a working driver we believe we are at the stage
where meaningful conversation with the community can occur.

Design:

+--------------------------------+
|       AI application           |
|       (userspace)              |
+-------------+------------------+
              |
              | Char dev interface
              |
              |
+-------------+------------------+
|       QAIC driver              |
|       (kernel space)           |
|                                |
+----+------------------+--------+
     |                  |
     |                  |
     |                  |
     |                  |
     |Control path      | Data path
     |(MHI bus)         |
     |                  |
     |                  |
     |                  |
     |                  |
+--------------------------------+
| +--------+      +------------+ |
| | MHI HW |      |DMA Bridge  | |
| +--------+      |(DMA engine)| |
|                 +------------+ |
|                                |
|                                |
|                                |
|  Qualcomm Cloud AI 100 device  |
|                                |
|                                |
+--------------------------------+

A Qualcomm Cloud AI 100 device (QAIC device from here on) is a PCIe hardware
accelerator for AI inference workloads.  Over the PCIe bus fabric, a QAIC
device exposes two interfaces via PCI BARs - an MHI hardware region and a
DMA engine hardware region.

Before workloads can be run, a QAIC device needs to be initialized.  Similar
to other Qualcomm products which incorporate MHI, device firmware needs to be
loaded onto the device from the host.  This occurs in two stages.  First,
a secondary bootloader (SBL) needs to be loaded onto the device.  This occurs
via the BHI protocol, and is handled by the MHI bus.  Once the SBL is loaded
and running, it activates the Sahara protocol.  The Sahara protocol is used
with a userspace application to load and initialize the remaining firmware.
The Sahara protocol and associated userspace application are outside the
scope of this series as they have no direct interaction with the QAIC driver.
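
As a rough illustration of the two-stage boot described above, the sequence
can be modelled as a small state machine.  This is only a sketch: the state
and event names are hypothetical and do not come from this series, and the
real sequencing is handled by the MHI bus and the device firmware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states; the real ones live in the MHI bus and the firmware. */
enum boot_state {
	BOOT_RESET,		/* nothing loaded yet */
	BOOT_SBL_RUNNING,	/* SBL loaded via BHI, Sahara active */
	BOOT_READY,		/* remaining firmware loaded via Sahara */
};

/* Hypothetical events that move the device between stages. */
enum boot_event {
	EV_SBL_LOADED,		/* BHI transfer of the SBL finished */
	EV_SAHARA_DONE,		/* userspace finished the Sahara transfer */
	EV_DEVICE_RESET,	/* device was reset or went away */
};

static enum boot_state boot_next(enum boot_state s, enum boot_event ev)
{
	if (ev == EV_DEVICE_RESET)
		return BOOT_RESET;
	if (s == BOOT_RESET && ev == EV_SBL_LOADED)
		return BOOT_SBL_RUNNING;
	if (s == BOOT_SBL_RUNNING && ev == EV_SAHARA_DONE)
		return BOOT_READY;
	return s;	/* out-of-order events leave the state unchanged */
}

static bool boot_can_run_workloads(enum boot_state s)
{
	return s == BOOT_READY;
}

/* Usage: Sahara cannot complete before the SBL has been loaded. */
static void boot_example(void)
{
	enum boot_state s = BOOT_RESET;

	s = boot_next(s, EV_SAHARA_DONE);
	assert(s == BOOT_RESET);
	s = boot_next(s, EV_SBL_LOADED);
	s = boot_next(s, EV_SAHARA_DONE);
	assert(boot_can_run_workloads(s));
}
```

The point of the model is only the ordering: workloads become possible once
both stages have completed, and a reset returns the device to the start.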

Once a QAIC device is fully initialized, workloads can be sent to the device
and run.  This involves a per-device instance char dev that the QAIC driver
exposes to userspace.  Running a workload involves two phases - configuring the
device, and interacting with the workload.

To configure the device, commands are sent via an MHI channel.  This is referred
to as the control path.  A command is a single message.  A message contains
one or more transactions.  Transactions are operations that the device
is requested to perform.  Most commands are opaque to the kernel driver; however,
some are not.  For example, if the user application wishes to DMA data to the
device, it requires the assistance of the kernel driver to translate the data
addresses to an address space that the device can understand.  In this instance
the transaction for DMAing the data is visible to the kernel driver, and the
driver will do the required transformation when encoding the message.
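
The message/transaction relationship above can be sketched as follows.  This
is an illustration under stated assumptions, not the layout used by the
driver: every type, field, and function name here is hypothetical, and
fake_translate() merely stands in for the real address translation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRANS 8	/* hypothetical limit on transactions per message */

enum trans_type {
	TRANS_OPAQUE,	/* passed through to the device untouched */
	TRANS_DMA_XFER,	/* carries an address the driver must translate */
};

struct trans {
	enum trans_type type;
	uint64_t addr;	/* user address in, device address out (DMA only) */
	uint32_t len;
};

struct msg {
	size_t count;	/* one or more transactions */
	struct trans trans[MAX_TRANS];
};

/* Stand-in for the real translation (pinning and DMA-mapping user pages). */
static uint64_t fake_translate(uint64_t user_addr)
{
	return user_addr + 0x100000000ULL;
}

/* Returns the number of transactions translated, or -1 for a bad message. */
static int encode_msg(const struct msg *in, struct msg *out)
{
	int translated = 0;
	size_t i;

	if (in->count == 0 || in->count > MAX_TRANS)
		return -1;

	out->count = in->count;
	for (i = 0; i < in->count; i++) {
		out->trans[i] = in->trans[i];
		if (in->trans[i].type == TRANS_DMA_XFER) {
			out->trans[i].addr = fake_translate(in->trans[i].addr);
			translated++;
		}
	}
	return translated;
}

/* Usage: one opaque transaction and one DMA transaction in a message. */
static void encode_example(void)
{
	struct msg in = {
		.count = 2,
		.trans = {
			{ .type = TRANS_OPAQUE, .addr = 0x1000, .len = 64 },
			{ .type = TRANS_DMA_XFER, .addr = 0x2000, .len = 4096 },
		},
	};
	struct msg out;
	int n = encode_msg(&in, &out);

	assert(n == 1);
	assert(out.trans[0].addr == 0x1000);		/* untouched */
	assert(out.trans[1].addr == 0x100002000ULL);	/* translated */
}
```

Only the DMA transaction is rewritten; the opaque one is copied through
unchanged, which mirrors the split described in the paragraph above.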

To interact with the workload, the workload is assigned a DMA Bridge Channel
(dbc).  This is dedicated hardware within the DMA engine.  Interacting with the
workload consists of sending it input data, and receiving output data.  The
user application requests appropriate buffers from the kernel driver, prepares
the buffers, and directs the kernel driver to queue them to the hardware.
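
One way to picture "queue them to the hardware" is a fixed-depth queue per
DBC.  The sketch below assumes a simple ring with head and tail counters;
the text above does not say how the hardware queue is actually organized,
so the ring, its depth, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DBC_RING_SLOTS 4	/* hypothetical queue depth */

struct dbc_ring {
	uint32_t head;	/* next slot the device would consume */
	uint32_t tail;	/* next slot the driver would fill */
	int buf_id[DBC_RING_SLOTS];
};

static bool dbc_full(const struct dbc_ring *r)
{
	return r->tail - r->head == DBC_RING_SLOTS;
}

static bool dbc_empty(const struct dbc_ring *r)
{
	return r->tail == r->head;
}

/* Queue a prepared buffer; returns 0, or -1 if the ring is full. */
static int dbc_queue(struct dbc_ring *r, int buf_id)
{
	if (dbc_full(r))
		return -1;
	r->buf_id[r->tail % DBC_RING_SLOTS] = buf_id;
	r->tail++;
	return 0;
}

/* Retire the oldest buffer; returns its id, or -1 if nothing is queued. */
static int dbc_complete(struct dbc_ring *r)
{
	int id;

	if (dbc_empty(r))
		return -1;
	id = r->buf_id[r->head % DBC_RING_SLOTS];
	r->head++;
	return id;
}

/* Usage: fill the ring, hit back-pressure, then free a slot. */
static void dbc_example(void)
{
	struct dbc_ring r = { 0 };
	int i, rc;

	for (i = 0; i < DBC_RING_SLOTS; i++) {
		rc = dbc_queue(&r, 10 + i);
		assert(rc == 0);
	}
	rc = dbc_queue(&r, 99);
	assert(rc == -1);	/* ring is full */
	rc = dbc_complete(&r);
	assert(rc == 10);	/* oldest buffer first */
	rc = dbc_queue(&r, 99);
	assert(rc == 0);	/* a slot was freed */
}
```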

The kernel driver is required to support multiple QAIC devices, and also N
users per device.

Status:
This series introduces the driver for QAIC devices, and builds up the minimum
functionality for running workloads.  Several features which have been omitted
or are still planned are indicated in the future work section.

Before exiting the RFC phase, and attempting full acceptance, we wish to
complete two features which are currently under development as we expect there
to be userspace interface changes as a result.

The first feature is a variable length control message between the kernel driver
and the device.  This allows us to support the total number of DMA transactions
we require for certain platforms, while minimizing memory usage.  The interface
impact of this would be to allow us to drop the size of the manage buffer
between userspace and the kernel driver from the current 16k, much of which is
wasted.
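
To make the memory argument concrete, the sketch below compares a fixed
buffer with a message sized to its contents.  It assumes "16k" means 16384
bytes, and the header and per-transaction sizes are invented for
illustration; none of these names or numbers come from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define FIXED_MANAGE_BUF	16384u	/* assumed meaning of the current 16k */
#define HYP_HDR_BYTES		16u	/* hypothetical message header size */
#define HYP_TRANS_BYTES		32u	/* hypothetical per-transaction size */

/* Bytes a variable length message would need for n_trans transactions. */
static size_t var_msg_bytes(size_t n_trans)
{
	return HYP_HDR_BYTES + n_trans * HYP_TRANS_BYTES;
}

/* Bytes of the fixed buffer left unused by such a message. */
static size_t fixed_buf_waste(size_t n_trans)
{
	size_t need = var_msg_bytes(n_trans);

	return need >= FIXED_MANAGE_BUF ? 0 : FIXED_MANAGE_BUF - need;
}

/* Usage: a four-transaction message under the assumed sizes. */
static void sizing_example(void)
{
	assert(var_msg_bytes(4) == 144);
	assert(fixed_buf_waste(4) == 16240);
}
```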

The second feature is an optimization and extension of the data path interface.
We plan to move the bulk of the data in the qaic_execute structure to the
qaic_mem_req structure, which optimizes our critical path processing.  We also
plan to extend the qaic_execute structure to allow for a batch submit of
multiple buffers as an optimization and convenience for userspace.

Future work:
For simplicity, we have omitted work related to the following features, and
intend to submit it in a future series:

-debugfs
-trace points
-hwmon (device telemetry)

We are also investigating what it might mean to support dma_bufs.  We expect
that such support would come as an extension of the interface.

Jeffrey Hugo (8):
  qaic: Add skeleton driver
  qaic: Add and init a basic mhi controller
  qaic: Create char dev
  qaic: Implement control path
  qaic: Implement data path
  qaic: Implement PCI link status error handlers
  qaic: Implement MHI error status handler
  MAINTAINERS: Add entry for QAIC driver

 MAINTAINERS                        |    7 +
 drivers/misc/Kconfig               |    1 +
 drivers/misc/Makefile              |    1 +
 drivers/misc/qaic/Kconfig          |   20 +
 drivers/misc/qaic/Makefile         |   12 +
 drivers/misc/qaic/mhi_controller.c |  538 +++++++++++++++++++
 drivers/misc/qaic/mhi_controller.h |   20 +
 drivers/misc/qaic/qaic.h           |  111 ++++
 drivers/misc/qaic/qaic_control.c   | 1015 ++++++++++++++++++++++++++++++++++++
 drivers/misc/qaic/qaic_data.c      |  952 +++++++++++++++++++++++++++++++++
 drivers/misc/qaic/qaic_drv.c       |  699 +++++++++++++++++++++++++
 include/uapi/misc/qaic.h           |  246 +++++++++
 12 files changed, 3622 insertions(+)
 create mode 100644 drivers/misc/qaic/Kconfig
 create mode 100644 drivers/misc/qaic/Makefile
 create mode 100644 drivers/misc/qaic/mhi_controller.c
 create mode 100644 drivers/misc/qaic/mhi_controller.h
 create mode 100644 drivers/misc/qaic/qaic.h
 create mode 100644 drivers/misc/qaic/qaic_control.c
 create mode 100644 drivers/misc/qaic/qaic_data.c
 create mode 100644 drivers/misc/qaic/qaic_drv.c
 create mode 100644 include/uapi/misc/qaic.h

Comments

Dave Airlie May 19, 2020, 5:08 a.m. UTC | #1

Hi Jeffrey,

Just wondering what the userspace/testing plans are for this driver.

This introduces a new user facing API for a device without pointers to
users or tests for that API.

Although this isn't a graphics driver, and Greg will likely merge
anything to the kernel you throw at him, I do wonder how to validate
the uapi from a security perspective. It's always interesting when
someone wraps a DMA engine with user ioctls, and without enough
information to decide if the DMA engine is secure against userspace
misprogramming it.

Also if we don't understand the programming API on board the device,
we can't tell if the "cores" on the device are able to reprogram the
device engines either.

Figuring this out is difficult at the best of times; it helps if there
is access to the complete device documentation or user space side
drivers in order to facilitate this.

The other area I mentioned is testing the uAPI: how do you envisage
regression testing and long term sustainability of the uAPI?

Thanks,
Dave.

Manivannan Sadhasivam May 19, 2020, 6:57 a.m. UTC | #2
Hi Jeff,

On Thu, May 14, 2020 at 08:07:38AM -0600, Jeffrey Hugo wrote:
> Design:

Can you add documentation in the next revision with all this information (or
more)?  In reStructuredText, of course.  Even though it is an RFC series, adding
documentation doesn't hurt and it will help reviewers understand the hardware
better.

Thanks,
Mani

> -- 
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.

Jeffrey Hugo May 19, 2020, 2:16 p.m. UTC | #3
On 5/19/2020 12:57 AM, Manivannan Sadhasivam wrote:
> Hi Jeff,
> 
> On Thu, May 14, 2020 at 08:07:38AM -0600, Jeffrey Hugo wrote:
>> Design:
> 
> Can you add documentation in the next revision with all this information (or
> more)?  In reStructuredText, of course.  Even though it is an RFC series, adding
> documentation doesn't hurt and it will help reviewers understand the hardware
> better.

Sorry, saw this hit my inbox as I was sending out the next rev.  There 
will be another rev.

Sure.  I'm open to doing that.  Hmm, does Documentation/misc-devices seem good?

Do you have specific additional information you think would be good?

Jeffrey Hugo May 19, 2020, 2:57 p.m. UTC | #4
On 5/18/2020 11:08 PM, Dave Airlie wrote:
> Hi Jeffrey,
> 
> Just wondering what the userspace/testing plans are for this driver.
> 
> This introduces a new user facing API for a device without pointers to
> users or tests for that API.

We have daily internal testing, although I don't expect you to take my 
word for that.

I would like to get one of these devices into the hands of Linaro, so
that it can be put into KernelCI, similar to other Qualcomm products.
I'm trying to convince the powers that be to make this happen.

Regarding what the community could do on its own, everything but the
Linux driver is considered proprietary - that includes the on-device
firmware and the entire userspace stack.  This is a decision above my
pay grade.

I've asked for authorization to develop and publish a simple userspace 
application that might enable the community to do such testing, but 
obtaining that authorization has been slow.

> Although this isn't a graphics driver, and Greg will likely merge
> anything to the kernel you throw at him, I do wonder how to validate
> the uapi from a security perspective. It's always interesting when
> someone wraps a DMA engine with user ioctls, and without enough
> information to decide if the DMA engine is secure against userspace
> misprogramming it.

I'm curious, what information might you be looking for?  Are you 
concerned about the device attacking the host, or the host attacking the 
device?

> Also if we don't understand the programming API on board the device,
> we can't tell if the "cores" on the device are able to reprogram the
> device engines either.

So, you are looking for details about the messaging protocol which are 
considered opaque to the kernel driver?  Or something else?

> Figuring this out is difficult at the best of times; it helps if there
> is access to the complete device documentation or user space side
> drivers in order to facilitate this.

Regarding access to documentation, sadly that isn't going to happen now, 
or in the near future.  Again, above my pay grade.  The only public 
"documentation" is what you can see from my emails.

I understand your position, and if I can "bound" the information you are 
looking for, I can see what I can do about getting you what you want. 
No promises, but I will try.

> The other area I mentioned is testing the uAPI: how do you envisage
> regression testing and long term sustainability of the uAPI?

Can you clarify what you mean by "uAPI"?  Are you referring to the 
interface between the device and the kernel driver?

Greg Kroah-Hartman May 19, 2020, 5:33 p.m. UTC | #5
On Tue, May 19, 2020 at 03:08:42PM +1000, Dave Airlie wrote:
> Hi Jeffrey,
> 
> Just wondering what the userspace/testing plans are for this driver.
> 
> This introduces a new user facing API for a device without pointers to
> users or tests for that API.
> 
> Although this isn't a graphics driver, and Greg will likely merge
> anything to the kernel you throw at him, I do wonder how to validate
> the uapi from a security perspective. It's always interesting when
> someone wraps a DMA engine with user ioctls, and without enough
> information to decide if the DMA engine is secure against userspace
> misprogramming it.

Hey, I'll not merge just anything!

Oh, well, maybe, if it's in staging :)

> Also if we don't understand the programming API on board the device,
> we can't tell if the "cores" on the device are able to reprogram the
> device engines either.
> 
> Figuring this out is difficult at the best of times; it helps if there
> is access to the complete device documentation or user space side
> drivers in order to facilitate this.
> 
> The other area I mentioned is testing the uAPI: how do you envisage
> regression testing and long term sustainability of the uAPI?

I agree with this request, we should have some code that we can run in
order to test that things work properly.

thanks,

greg k-h

Greg Kroah-Hartman May 19, 2020, 5:41 p.m. UTC | #6
On Tue, May 19, 2020 at 08:57:38AM -0600, Jeffrey Hugo wrote:
> On 5/18/2020 11:08 PM, Dave Airlie wrote:
> > Hi Jeffrey,
> > 
> > Just wondering what the userspace/testing plans are for this driver.
> > 
> > This introduces a new user facing API for a device without pointers to
> > users or tests for that API.
> 
> We have daily internal testing, although I don't expect you to take my word
> for that.
> 
> I would like to get one of these devices into the hands of Linaro, so that
> it can be put into KernelCI, similar to other Qualcomm products.  I'm trying
> to convince the powers that be to make this happen.
> 
> Regarding what the community could do on its own, everything but the Linux
> driver is considered proprietary - that includes the on-device firmware and
> the entire userspace stack.  This is a decision above my pay grade.

Ok, that's a decision you are going to have to push upward on, as we
really can't take this without a working, open, userspace.

Especially given the copyright owner of this code, it would be just
crazy and foolish to not have open userspace code as well.  Firmware
would also be wonderful; go poke your lawyers about derivative
work issues and the like for fun conversations :)

So without that changed, I'm not going to take this, and I will object
to anyone else taking it.

I'm not going to be able to review any of this code anymore until that
changes, sorry.

thanks,

greg k-h

Jeffrey Hugo May 19, 2020, 6:07 p.m. UTC | #7
On 5/19/2020 11:41 AM, Greg Kroah-Hartman wrote:
> On Tue, May 19, 2020 at 08:57:38AM -0600, Jeffrey Hugo wrote:
>> On 5/18/2020 11:08 PM, Dave Airlie wrote:
>>> Hi Jeffrey,
>>>
>>> Just wondering what the userspace/testing plans are for this driver.
>>>
>>> This introduces a new user facing API for a device without pointers to
>>> users or tests for that API.
>>
>> We have daily internal testing, although I don't expect you to take my word
>> for that.
>>
>> I would like to get one of these devices into the hands of Linaro, so that
>> it can be put into KernelCI, similar to other Qualcomm products.  I'm trying
>> to convince the powers that be to make this happen.
>>
>> Regarding what the community could do on its own, everything but the Linux
>> driver is considered proprietary - that includes the on-device firmware and
>> the entire userspace stack.  This is a decision above my pay grade.
> 
> Ok, that's a decision you are going to have to push upward on, as we
> really can't take this without a working, open, userspace.

Fair enough.  I hope that your position may have made things easier for me.

I hope this doesn't widen the rift as it were, but what is the "bar" for 
this userspace?

Is a simple test application that adds two numbers on the hardware 
acceptable?

What is the bar for "working"?  I intend to satisfy this request in good
faith, but I wonder, if no one has the hardware besides our customers, 
and possibly KernelCI, can you really say that I've provided a working 
userspace?

> Especially given the copyright owner of this code, it would be just
> crazy and foolish to not have open userspace code as well.  Firmware
> would also be wonderful; go poke your lawyers about derivative
> work issues and the like for fun conversations :)

Those are the kind of conversations I try to avoid  :)

> So without that changed, I'm not going to take this, and I will object
> to anyone else taking it.
> 
> I'm not going to be able to review any of this code anymore until that
> changes, sorry.
> 
> thanks,
> 
> greg k-h
>
Greg Kroah-Hartman May 19, 2020, 6:12 p.m. UTC | #8
On Tue, May 19, 2020 at 12:07:03PM -0600, Jeffrey Hugo wrote:
> On 5/19/2020 11:41 AM, Greg Kroah-Hartman wrote:
> > On Tue, May 19, 2020 at 08:57:38AM -0600, Jeffrey Hugo wrote:
> > > On 5/18/2020 11:08 PM, Dave Airlie wrote:
> > > > On Fri, 15 May 2020 at 00:12, Jeffrey Hugo <jhugo@codeaurora.org> wrote:
> > > > > 
> > > > > Introduction:
> > > > > Qualcomm Cloud AI 100 is a PCIe adapter card which contains a dedicated
> > > > > > SoC ASIC for the purpose of efficiently running Deep Learning inference
> > > > > workloads in a data center environment.
> > > > > 
> > > > > > The official press release can be found at -
> > > > > > https://www.qualcomm.com/news/releases/2019/04/09/qualcomm-brings-power-efficient-artificial-intelligence-inference
> > > > > > 
> > > > > > The official product website is -
> > > > > > https://www.qualcomm.com/products/datacenter-artificial-intelligence
> > > > > > 
> > > > > > At the time of the official press release, numerous technology news sites
> > > > > also covered the product.  Doing a search of your favorite site is likely
> > > > > to find their coverage of it.
> > > > > 
> > > > > It is our goal to have the kernel driver for the product fully upstream.
> > > > > The purpose of this RFC is to start that process.  We are still doing
> > > > > > development (see below), and thus not looking to gain acceptance quite
> > > > > > yet, but now that we have a working driver we believe we are at the stage
> > > > > where meaningful conversation with the community can occur.
> > > > 
> > > > 
> > > > > Hi Jeffrey,
> > > > > 
> > > > > Just wondering what the userspace/testing plans are for this driver.
> > > > 
> > > > This introduces a new user facing API for a device without pointers to
> > > > users or tests for that API.
> > > 
> > > We have daily internal testing, although I don't expect you to take my word
> > > for that.
> > > 
> > > I would like to get one of these devices into the hands of Linaro, so that
> > > > it can be put into KernelCI, similar to other Qualcomm products.  I'm trying
> > > to convince the powers that be to make this happen.
> > > 
> > > Regarding what the community could do on its own, everything but the Linux
> > > > driver is considered proprietary - that includes the on-device firmware and
> > > the entire userspace stack.  This is a decision above my pay grade.
> > 
> > Ok, that's a decision you are going to have to push upward on, as we
> > really can't take this without a working, open, userspace.
> 
> Fair enough.  I hope that your position may have made things easier for me.
> 
> I hope this doesn't widen the rift as it were, but what is the "bar" for
> this userspace?
> 
> Is a simple test application that adds two numbers on the hardware
> acceptable?

Make it the real library that you use for your applications, so that anyone
who has the hardware can use it as well.  Why would you want
something "crippled"?

> What is the bar for "working"?  I intend to satisfy this request in good faith,
> but I wonder, if no one has the hardware besides our customers, and possibly
> KernelCI, can you really say that I've provided a working userspace?

How do you know who your customers really are, or who they sell the
chips to?  I could end up with one of these... :)

> > Especially given the copyright owner of this code, that would be just
> > crazy and foolish to not have open userspace code as well.  Firmware
> > would also be wonderful as well, go poke your lawyers about derivative
> > work issues and the like for fun conversations :)
> 
> Those are the kind of conversations I try to avoid  :)

Sounds like you are going to now have to have them, have fun!

greg k-h
Jeffrey Hugo May 19, 2020, 6:26 p.m. UTC | #9
On 5/19/2020 12:12 PM, Greg Kroah-Hartman wrote:
> On Tue, May 19, 2020 at 12:07:03PM -0600, Jeffrey Hugo wrote:
>> On 5/19/2020 11:41 AM, Greg Kroah-Hartman wrote:
>>> On Tue, May 19, 2020 at 08:57:38AM -0600, Jeffrey Hugo wrote:
>>>> On 5/18/2020 11:08 PM, Dave Airlie wrote:
>>>>> On Fri, 15 May 2020 at 00:12, Jeffrey Hugo <jhugo@codeaurora.org> wrote:
>>>>>>
>>>>>> Introduction:
>>>>>> Qualcomm Cloud AI 100 is a PCIe adapter card which contains a dedicated
>>>>>> SoC ASIC for the purpose of efficiently running Deep Learning inference
>>>>>> workloads in a data center environment.
>>>>>>
>>>>>> The official press release can be found at -
>>>>>> https://www.qualcomm.com/news/releases/2019/04/09/qualcomm-brings-power-efficient-artificial-intelligence-inference
>>>>>>
>>>>>> The official product website is -
>>>>>> https://www.qualcomm.com/products/datacenter-artificial-intelligence
>>>>>>
>>>>>> At the time of the official press release, numerous technology news sites
>>>>>> also covered the product.  Doing a search of your favorite site is likely
>>>>>> to find their coverage of it.
>>>>>>
>>>>>> It is our goal to have the kernel driver for the product fully upstream.
>>>>>> The purpose of this RFC is to start that process.  We are still doing
>>>>>> development (see below), and thus not looking to gain acceptance quite
>>>>>> yet, but now that we have a working driver we believe we are at the stage
>>>>>> where meaningful conversation with the community can occur.
>>>>>
>>>>>
>>>>> Hi Jeffrey,
>>>>>
>>>>> Just wondering what the userspace/testing plans are for this driver.
>>>>>
>>>>> This introduces a new user facing API for a device without pointers to
>>>>> users or tests for that API.
>>>>
>>>> We have daily internal testing, although I don't expect you to take my word
>>>> for that.
>>>>
>>>> I would like to get one of these devices into the hands of Linaro, so that
>>>> it can be put into KernelCI, similar to other Qualcomm products.  I'm trying
>>>> to convince the powers that be to make this happen.
>>>>
>>>> Regarding what the community could do on its own, everything but the Linux
>>>> driver is considered proprietary - that includes the on-device firmware and
>>>> the entire userspace stack.  This is a decision above my pay grade.
>>>
>>> Ok, that's a decision you are going to have to push upward on, as we
>>> really can't take this without a working, open, userspace.
>>
>> Fair enough.  I hope that your position may have made things easier for me.
>>
>> I hope this doesn't widen the rift as it were, but what is the "bar" for
>> this userspace?
>>
>> Is a simple test application that adds two numbers on the hardware
>> acceptable?
> 
> Make it the real library that you use for your applications, so that anyone
> who has the hardware can use it as well.  Why would you want
> something "crippled"?

It makes it easier to dance around real or perceived IP issues, and thus 
I can likely more successfully "push upward" as you put it.

> 
>> What is the bar for "working"?  I intend to satisfy this request in good faith,
>> but I wonder, if no one has the hardware besides our customers, and possibly
>> KernelCI, can you really say that I've provided a working userspace?
> 
> How do you know who your customers really are, or who they sell the
> chips to?  I could end up with one of these... :)

At this time, I don't think that is going to happen, but I would like to 
see it regardless.

>>> Especially given the copyright owner of this code, that would be just
>>> crazy and foolish to not have open userspace code as well.  Firmware
>>> would also be wonderful as well, go poke your lawyers about derivative
>>> work issues and the like for fun conversations :)
>>
>> Those are the kind of conversations I try to avoid  :)
> 
> Sounds like you are going to now have to have them, have fun!

Honestly, I fail to see where you think there is a derivative work, so, 
I'm not really sure what discussions I need to revisit with our lawyers.
Greg Kroah-Hartman May 20, 2020, 5:32 a.m. UTC | #10
On Tue, May 19, 2020 at 12:26:01PM -0600, Jeffrey Hugo wrote:
> On 5/19/2020 12:12 PM, Greg Kroah-Hartman wrote:
> > > > Especially given the copyright owner of this code, that would be just
> > > > crazy and foolish to not have open userspace code as well.  Firmware
> > > > would also be wonderful as well, go poke your lawyers about derivative
> > > > work issues and the like for fun conversations :)
> > > 
> > > Those are the kind of conversations I try to avoid  :)
> > 
> > Sounds like you are going to now have to have them, have fun!
> 
> Honestly, I fail to see where you think there is a derivative work, so, I'm
> not really sure what discussions I need to revisit with our lawyers.

Given that we are not lawyers, why don't we leave those types of
discussions up to the lawyers, and not depend on people like me and you
for that?  :)

If your lawyers think that the code division is fine as-is, that's
great, I'd be glad to review it if they add their signed-off-by: on it
verifying that the API divide is approved by them.

thanks!

greg k-h