diff mbox

[02/15] drm/i915: Embedded microcontroller (uC) firmware loading support

Message ID 1434393394-21002-3-git-send-email-david.s.gordon@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dave Gordon June 15, 2015, 6:36 p.m. UTC
Current devices may contain one or more programmable microcontrollers
that need to have a firmware image (aka "binary blob") loaded from an
external medium and transferred to the device's memory.

This file provides generic support functions for doing this; they can
then be used by each uC-specific loader, thus reducing code duplication
and testing effort.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/Makefile          |    3 +
 drivers/gpu/drm/i915/intel_uc_loader.c |  312 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_uc_loader.h |   82 +++++++++
 3 files changed, 397 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c
 create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h

Comments

Daniel Vetter June 17, 2015, 12:05 p.m. UTC | #1
On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
> Current devices may contain one or more programmable microcontrollers
> that need to have a firmware image (aka "binary blob") loaded from an
> external medium and transferred to the device's memory.
> 
> This file provides generic support functions for doing this; they can
> then be used by each uC-specific loader, thus reducing code duplication
> and testing effort.
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Alex Dai <yu.dai@intel.com>

Given that I'm just shredding the synchronization used by the dmc loader
I'm not convinced this is a good idea. Abstraction has cost, and a bit of
copy-paste for similar sounding but slightly different things doesn't
sound awful to me. And the critical bit in all the firmware loading I've
seen thus far is in synchronizing the loading with other operations,
hiding that isn't a good idea. Worse if we enforce stuff like requiring
dev->struct_mutex.
-Daniel


> ---
>  drivers/gpu/drm/i915/Makefile          |    3 +
>  drivers/gpu/drm/i915/intel_uc_loader.c |  312 ++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_uc_loader.h |   82 +++++++++
>  3 files changed, 397 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c
>  create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index b7ddf48..607fa2a 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -38,6 +38,9 @@ i915-y += i915_cmd_parser.o \
>  	  intel_ringbuffer.o \
>  	  intel_uncore.o
>  
> +# generic ancilliary microcontroller support
> +i915-y += intel_uc_loader.o
> +
>  # autogenerated null render state
>  i915-y += intel_renderstate_gen6.o \
>  	  intel_renderstate_gen7.o \
> diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c
> new file mode 100644
> index 0000000..26f0fbe
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_uc_loader.c
> @@ -0,0 +1,312 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Author:
> + *	Dave Gordon <david.s.gordon@intel.com>
> + */
> +#include <linux/firmware.h>
> +#include "i915_drv.h"
> +#include "intel_uc_loader.h"
> +
> +/**
> + * DOC: Generic embedded microcontroller (uC) firmware loading support
> + *
> + * The functions in this file provide a generic way to load the firmware that
> + * may be required by an embedded microcontroller (uC).
> + *
> + * The function intel_uc_fw_init() should be called early, and will initiate
> + * an asynchronous request to fetch the firmware image (aka "binary blob").
> + * When the image has been fetched into memory, the kernel will call back to
> + * uc_fw_fetch_callback() whose function is simply to record the completion
> + * status, and stash the firmware blob for later.
> + *
> + * At some convenient point after GEM initialisation, the driver should call
> + * intel_uc_fw_check(); this will check whether the asynchronous thread has
> + * completed and wait for it if not, check whether the image was successfully
> + * fetched; and then allow the callback() function (if provided) to validate
> + * the image and/or save the data in a GEM object.
> + *
> + * Thereafter the uC-specific code can transfer the data in the GEM object
> + * to the uC's memory (in some uC-specific way, not handled here).
> + *
> + * During driver shutdown, or if driver load is aborted, intel_uc_fw_fini()
> + * should be called to release any remaining resources.
> + */
> +
> +
> +/*
> + * Called once per uC, late in driver initialisation. GEM is now ready, and so
> + * we can now create a GEM object to hold the uC firmware. But first, we must
> + * synchronise with the firmware-fetching thread that was initiated during
> + * early driver load, in intel_uc_fw_init(), and see whether it successfully
> + * fetched the firmware blob.
> + */
> +static void
> +uc_fw_fetch_wait(struct intel_uc_fw *uc_fw,
> +		 bool callback(struct intel_uc_fw *))
> +{
> +	struct drm_device *dev = uc_fw->uc_dev;
> +	struct drm_i915_gem_object *obj;
> +	const struct firmware *fw;
> +
> +	DRM_DEBUG_DRIVER("before waiting: %s fw fetch status %d, fw %p\n",
> +		uc_fw->uc_name, uc_fw->uc_fw_fetch_status, uc_fw->uc_fw_blob);
> +
> +	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> +	WARN_ON(uc_fw->uc_fw_fetch_status != INTEL_UC_FIRMWARE_PENDING);
> +
> +	wait_for_completion(&uc_fw->uc_fw_fetched);
> +
> +	DRM_DEBUG_DRIVER("after waiting: %s fw fetch status %d, fw %p\n",
> +		uc_fw->uc_name, uc_fw->uc_fw_fetch_status, uc_fw->uc_fw_blob);
> +
> +	fw = uc_fw->uc_fw_blob;
> +	if (!fw) {
> +		/* no firmware found; try again in case FS was not mounted */
> +		DRM_DEBUG_DRIVER("retry fetching %s fw from <%s>\n",
> +			uc_fw->uc_name, uc_fw->uc_fw_path);
> +		if (request_firmware(&fw, uc_fw->uc_fw_path, &dev->pdev->dev))
> +			goto fail;
> +		if (!fw)
> +			goto fail;
> +		DRM_DEBUG_DRIVER("fetch %s fw from <%s> succeeded, fw %p\n",
> +			uc_fw->uc_name, uc_fw->uc_fw_path, fw);
> +		uc_fw->uc_fw_blob = fw;
> +	}
> +
> +	/* Callback to the optional uC-specific function, if supplied */
> +	if (callback && !callback(uc_fw))
> +		goto fail;
> +
> +	/* Callback may have done the object allocation & write itself */
> +	obj = uc_fw->uc_fw_obj;
> +	if (!obj) {
> +		size_t pages = round_up(fw->size, PAGE_SIZE);
> +		obj = i915_gem_alloc_object(dev, pages);
> +		if (!obj)
> +			goto fail;
> +
> +		uc_fw->uc_fw_obj = obj;
> +		uc_fw->uc_fw_size = fw->size;
> +		if (i915_gem_object_write(obj, fw->data, fw->size))
> +			goto fail;
> +	}
> +
> +	DRM_DEBUG_DRIVER("%s fw fetch status SUCCESS\n", uc_fw->uc_name);
> +	release_firmware(fw);
> +	uc_fw->uc_fw_blob = NULL;
> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_SUCCESS;
> +	return;
> +
> +fail:
> +	DRM_DEBUG_DRIVER("%s fw fetch status FAIL; fw %p, obj %p\n",
> +		uc_fw->uc_name, fw, uc_fw->uc_fw_obj);
> +	DRM_ERROR("Failed to fetch %s firmware from <%s>\n",
> +		  uc_fw->uc_name, uc_fw->uc_fw_path);
> +
> +	obj = uc_fw->uc_fw_obj;
> +	if (obj)
> +		drm_gem_object_unreference(&obj->base);
> +	uc_fw->uc_fw_obj = NULL;
> +
> +	release_firmware(fw);		/* OK even if fw is NULL */
> +	uc_fw->uc_fw_blob = NULL;
> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
> +}
> +
> +/**
> + * intel_uc_fw_check() - check the status of the firmware fetching process
> + * @uc_fw:	intel_uc_fw structure
> + * @callback:	optional callback function to validate and/or save the image
> + *
> + * If the fetch is still PENDING, wait for completion first, then check and
> + * return the outcome. Subsequent calls will just return the same outcome
> + * based on the recorded fetch status, without triggering another fetch
> + * and without calling @callback().
> + *
> + * After this call, @uc_fw->uc_fw_fetch_status will show whether the firmware
> + * image was successfully fetched and transferred to a GEM object. If it is
> + * INTEL_UC_FIRMWARE_SUCCESS, @uc_fw->uc_fw_obj will be point to the GEM
> + * object, and the size of the image will be in @uc_fw->uc_fw_size.  For any
> + * other status value, these members are undefined.
> + *
> + * The @callback() parameter allows the uC-specific code to validate the
> + * image before it is saved, and also to override the default save mechanism
> + * if required. When it is called, @uc_fw->uc_fw_blob refers to the fetched
> + * firmware image, and @uc_fw->uc_fw_obj is NULL.
> + *
> + * If @callback() returns FALSE, the fetched image is considered invalid.
> + * The fetch status will be set to FAIL, and this function will return -EIO.
> + *
> + * If @callback() returns TRUE but doesn't set @uc_fw->uc_fw_obj, the image
> + * is considered good; it will be saved in a GEM object as described above.
> + * This is the default if no @callback() is supplied.
> + *
> + * If @callback() returns TRUE after setting @uc_fw->uc_fw_obj, this means
> + * that the image has already been saved by @callback() itself. This allows
> + * @callback() to customise the format of the data in the GEM object, for
> + * example if it needs to save only a portion of the loaded image.
> + *
> + * In all cases the firmware blob is released before this function returns.
> + *
> + * Return:	non-zero code on error
> + */
> +int
> +intel_uc_fw_check(struct intel_uc_fw *uc_fw,
> +		  bool callback(struct intel_uc_fw *))
> +{
> +	WARN_ON(!mutex_is_locked(&uc_fw->uc_dev->struct_mutex));
> +
> +	if (uc_fw->uc_fw_fetch_status == INTEL_UC_FIRMWARE_PENDING) {
> +		/* We only come here once */
> +		uc_fw_fetch_wait(uc_fw, callback);
> +		/* state must now be FAIL or SUCCESS */
> +	}
> +
> +	DRM_DEBUG_DRIVER("%s fw fetch status %d\n",
> +		uc_fw->uc_name, uc_fw->uc_fw_fetch_status);
> +
> +	switch (uc_fw->uc_fw_fetch_status) {
> +	case INTEL_UC_FIRMWARE_FAIL:
> +		/* something went wrong :( */
> +		return -EIO;
> +
> +	case INTEL_UC_FIRMWARE_NONE:
> +		/* no firmware, nothing to do (not an error) */
> +		return 0;
> +
> +	case INTEL_UC_FIRMWARE_PENDING:
> +	default:
> +		/* "can't happen" */
> +		WARN_ONCE(1, "%s fw <%s> invalid uc_fw_fetch_status %d!\n",
> +			uc_fw->uc_name, uc_fw->uc_fw_path,
> +			uc_fw->uc_fw_fetch_status);
> +		return -ENXIO;
> +
> +	case INTEL_UC_FIRMWARE_SUCCESS:
> +		return 0;
> +	}
> +}
> +
> +/*
> + * Callback from the kernel's asynchronous firmware-fetching subsystem.
> + * All we have to do here is stash the blob and signal completion.
> + * Error checking (e.g. no firmware found) is left to mainline code.
> + * We don't have (and don't want or need to acquire) the struct_mutex here.
> + */
> +static void
> +uc_fw_fetch_callback(const struct firmware *fw, void *context)
> +{
> +	struct intel_uc_fw *uc_fw = context;
> +
> +	WARN_ON(uc_fw->uc_fw_fetch_status != INTEL_UC_FIRMWARE_PENDING);
> +	DRM_DEBUG_DRIVER("%s firmware fetch from <%s> status %d, fw %p\n",
> +			uc_fw->uc_name, uc_fw->uc_fw_path,
> +			uc_fw->uc_fw_fetch_status, fw);
> +
> +	uc_fw->uc_fw_blob = fw;
> +	complete(&uc_fw->uc_fw_fetched);
> +}
> +
> +/**
> + * intel_uc_fw_init() - initiate the fetching of firmware
> + * @dev:	drm device
> + * @uc_fw:	intel_uc_fw structure
> + * @name:	human-readable device name (e.g. "GuC") for messages
> + * @fw_path:	(trailing parts of) path to firmware (e.g. "i915/guc_fw.bin")
> + * 		@fw_path == NULL means "no firmware expected" (not an error),
> + * 		@fw_path == "" (empty string) means "firmware unknown" i.e.
> + * 		the uC requires firmware, but the driver doesn't know where
> + * 		to find the proper version. This will be logged as an error.
> + *
> + * This is called just once per uC, during driver loading. It is therefore
> + * automatically single-threaded and does not need to acquire any mutexes
> + * or spinlocks. OTOH, GEM is not yet fully initialised, so we can't do
> + * very much here.
> + *
> + * The main task here is to initiate the fetching of the uC firmware into
> + * memory, using the standard kernel firmware fetching support.  The actual
> + * fetching will then proceed asynchronously and in parallel with the rest
> + * of driver initialisation; later in the loading process we will synchronise
> + * with the firmware-fetching thread before transferring the firmware image
> + * firstly into a GEM object and then into the uC's memory.
> + */
> +void
> +intel_uc_fw_init(struct drm_device *dev, struct intel_uc_fw *uc_fw,
> +		 const char *name, const char *fw_path)
> +{
> +	uc_fw->uc_dev = dev;
> +	uc_fw->uc_name = name;
> +	uc_fw->uc_fw_path = fw_path;
> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_NONE;
> +	uc_fw->uc_fw_load_status = INTEL_UC_FIRMWARE_NONE;
> +	init_completion(&uc_fw->uc_fw_fetched);
> +
> +	if (fw_path == NULL)
> +		return;
> +
> +	if (*fw_path == '\0') {
> +		DRM_ERROR("No %s firmware known for this platform\n", name);
> +		uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
> +		return;
> +	}
> +
> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_PENDING;
> +
> +	if (request_firmware_nowait(THIS_MODULE, true, fw_path,
> +				    &dev->pdev->dev,
> +				    GFP_KERNEL, uc_fw,
> +				    uc_fw_fetch_callback)) {
> +		DRM_ERROR("Failed to request %s firmware from <%s>\n",
> +			  name, fw_path);
> +		uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
> +		return;
> +	}
> +
> +	/* firmware fetch initiated, callback will signal completion */
> +	DRM_DEBUG_DRIVER("initiated fetching %s firmware from <%s>\n",
> +		name, fw_path);
> +}
> +
> +/**
> + * intel_uc_fw_fini() - clean up all uC firmware-related data
> + * @uc_fw:	intel_uc_fw structure
> + */
> +void
> +intel_uc_fw_fini(struct intel_uc_fw *uc_fw)
> +{
> +	WARN_ON(!mutex_is_locked(&uc_fw->uc_dev->struct_mutex));
> +
> +	/*
> +	 * Generally, the blob should have been released earlier, but
> +	 * if the driver load were aborted after the fetch had been
> +	 * initiated but not completed it might still be around
> +	 */
> +	if (uc_fw->uc_fw_fetch_status == INTEL_UC_FIRMWARE_PENDING)
> +		wait_for_completion(&uc_fw->uc_fw_fetched);
> +	release_firmware(uc_fw->uc_fw_blob);	/* OK even if NULL */
> +	uc_fw->uc_fw_blob = NULL;
> +
> +	if (uc_fw->uc_fw_obj)
> +		drm_gem_object_unreference(&uc_fw->uc_fw_obj->base);
> +	uc_fw->uc_fw_obj = NULL;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_uc_loader.h b/drivers/gpu/drm/i915/intel_uc_loader.h
> new file mode 100644
> index 0000000..22502ea
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_uc_loader.h
> @@ -0,0 +1,82 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Author:
> + *	Dave Gordon <david.s.gordon@intel.com>
> + */
> +#ifndef _INTEL_UC_LOADER_H
> +#define _INTEL_UC_LOADER_H
> +
> +/*
> + * Microcontroller (uC) firmware loading support
> + */
> +
> +/*
> + * These values are used to track the stages of getting the required firmware
> + * into an onboard microcontroller. The common code tracks the phases of
> + * fetching the firmware (aka "binary blob") from an external file into a GEM
> + * object in the 'uc_fw_fetch_status' field below; the uC-specific DMA code
> + * uses the 'uc_fw_load_status' field to track the transfer from GEM object
> + * to uC memory.
> + *
> + * For the first (fetch) stage, the interpretation of the values is:
> + * NONE - no firmware is being fetched e.g. because there is no uC
> + * PENDING - firmware fetch initiated; callback will complete 'uc_fw_fetched'
> + * SUCCESS - uC firmware fetched into a GEM object and ready for use
> + * FAIL - something went wrong; uC firmware is not available
> + *
> + * The second (load) stage is simpler as there is no asynchronous handoff:
> + * NONE - no firmware is being loaded e.g. because there is no uC
> + * PENDING - firmware DMA load in progress
> + * SUCCESS - uC firmware loaded into uC memory and ready for use
> + * FAIL - something went wrong; uC firmware is not available
> + */
> +enum intel_uc_fw_status {
> +	INTEL_UC_FIRMWARE_FAIL = -1,
> +	INTEL_UC_FIRMWARE_NONE = 0,
> +	INTEL_UC_FIRMWARE_PENDING,
> +	INTEL_UC_FIRMWARE_SUCCESS
> +};
> +
> +/*
> + * This structure encapsulates all the data needed during the process of
> + * fetching, caching, and loading the firmware image into the uC.
> + */
> +struct intel_uc_fw {
> +	struct drm_device *		uc_dev;
> +	const char *			uc_name;
> +	const char *			uc_fw_path;
> +	const struct firmware *		uc_fw_blob;
> +	struct completion		uc_fw_fetched;
> +	size_t				uc_fw_size;
> +	struct drm_i915_gem_object *	uc_fw_obj;
> +	enum intel_uc_fw_status		uc_fw_fetch_status;
> +	enum intel_uc_fw_status		uc_fw_load_status;
> +};
> +
> +void intel_uc_fw_init(struct drm_device *dev, struct intel_uc_fw *uc_fw,
> +		const char *uc_name, const char *fw_path);
> +int intel_uc_fw_check(struct intel_uc_fw *uc_fw,
> +		bool callback(struct intel_uc_fw *));
> +void intel_uc_fw_fini(struct intel_uc_fw *uc_fw);
> +
> +#endif
> -- 
> 1.7.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Dave Gordon June 18, 2015, 12:11 p.m. UTC | #2
On 17/06/15 13:05, Daniel Vetter wrote:
> On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
>> Current devices may contain one or more programmable microcontrollers
>> that need to have a firmware image (aka "binary blob") loaded from an
>> external medium and transferred to the device's memory.
>>
>> This file provides generic support functions for doing this; they can
>> then be used by each uC-specific loader, thus reducing code duplication
>> and testing effort.
>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> Signed-off-by: Alex Dai <yu.dai@intel.com>
> 
> Given that I'm just shredding the synchronization used by the dmc loader
> I'm not convinced this is a good idea. Abstraction has cost, and a bit of
> copy-paste for similar sounding but slightly different things doesn't
> sound awful to me. And the critical bit in all the firmware loading I've
> seen thus far is in synchronizing the loading with other operations,
> hiding that isn't a good idea. Worse if we enforce stuff like requiring
> dev->struct_mutex.
> -Daniel

It's precisely because it's in some sense "trivial-but-tricky" that we
should write it once, get it right, and use it everywhere. Copypaste
/does/ sound awful; I've seen how the code this was derived from had
already been cloned into three flavours, all different and all wrong.

It's a very simple abstraction: one early call to kick things off as
early as possible, no locking required. One late call with the
struct_mutex held to complete the synchronisation and actually do the
work, thus guaranteeing that the transfer to the target uC is done in a
controlled fashion, at a time of the caller's choice, and by the
driver's mainline thread, NOT by an asynchronous thread racing with
other activity (which was one of the things wrong with the original
version).

We should convert the DMC loader to use this too, so there need be only
one bit of code in the whole driver that needs to understand how to use
completions to get correct handover from a free-running no-locks-held
thread to the properly disciplined environment of driver mainline for
purposes of programming the h/w.

.Dave.

>> ---
>>  drivers/gpu/drm/i915/Makefile          |    3 +
>>  drivers/gpu/drm/i915/intel_uc_loader.c |  312 ++++++++++++++++++++++++++++++++
>>  drivers/gpu/drm/i915/intel_uc_loader.h |   82 +++++++++
>>  3 files changed, 397 insertions(+)
>>  create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c
>>  create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h
>>
>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>> index b7ddf48..607fa2a 100644
>> --- a/drivers/gpu/drm/i915/Makefile
>> +++ b/drivers/gpu/drm/i915/Makefile
>> @@ -38,6 +38,9 @@ i915-y += i915_cmd_parser.o \
>>  	  intel_ringbuffer.o \
>>  	  intel_uncore.o
>>  
>> +# generic ancilliary microcontroller support
>> +i915-y += intel_uc_loader.o
>> +
>>  # autogenerated null render state
>>  i915-y += intel_renderstate_gen6.o \
>>  	  intel_renderstate_gen7.o \
>> diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c
>> new file mode 100644
>> index 0000000..26f0fbe
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/intel_uc_loader.c
>> @@ -0,0 +1,312 @@
>> +/*
>> + * Copyright © 2014 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + * Author:
>> + *	Dave Gordon <david.s.gordon@intel.com>
>> + */
>> +#include <linux/firmware.h>
>> +#include "i915_drv.h"
>> +#include "intel_uc_loader.h"
>> +
>> +/**
>> + * DOC: Generic embedded microcontroller (uC) firmware loading support
>> + *
>> + * The functions in this file provide a generic way to load the firmware that
>> + * may be required by an embedded microcontroller (uC).
>> + *
>> + * The function intel_uc_fw_init() should be called early, and will initiate
>> + * an asynchronous request to fetch the firmware image (aka "binary blob").
>> + * When the image has been fetched into memory, the kernel will call back to
>> + * uc_fw_fetch_callback() whose function is simply to record the completion
>> + * status, and stash the firmware blob for later.
>> + *
>> + * At some convenient point after GEM initialisation, the driver should call
>> + * intel_uc_fw_check(); this will check whether the asynchronous thread has
>> + * completed and wait for it if not, check whether the image was successfully
>> + * fetched; and then allow the callback() function (if provided) to validate
>> + * the image and/or save the data in a GEM object.
>> + *
>> + * Thereafter the uC-specific code can transfer the data in the GEM object
>> + * to the uC's memory (in some uC-specific way, not handled here).
>> + *
>> + * During driver shutdown, or if driver load is aborted, intel_uc_fw_fini()
>> + * should be called to release any remaining resources.
>> + */
>> +
>> +
>> +/*
>> + * Called once per uC, late in driver initialisation. GEM is now ready, and so
>> + * we can now create a GEM object to hold the uC firmware. But first, we must
>> + * synchronise with the firmware-fetching thread that was initiated during
>> + * early driver load, in intel_uc_fw_init(), and see whether it successfully
>> + * fetched the firmware blob.
>> + */
>> +static void
>> +uc_fw_fetch_wait(struct intel_uc_fw *uc_fw,
>> +		 bool callback(struct intel_uc_fw *))
>> +{
>> +	struct drm_device *dev = uc_fw->uc_dev;
>> +	struct drm_i915_gem_object *obj;
>> +	const struct firmware *fw;
>> +
>> +	DRM_DEBUG_DRIVER("before waiting: %s fw fetch status %d, fw %p\n",
>> +		uc_fw->uc_name, uc_fw->uc_fw_fetch_status, uc_fw->uc_fw_blob);
>> +
>> +	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
>> +	WARN_ON(uc_fw->uc_fw_fetch_status != INTEL_UC_FIRMWARE_PENDING);
>> +
>> +	wait_for_completion(&uc_fw->uc_fw_fetched);
>> +
>> +	DRM_DEBUG_DRIVER("after waiting: %s fw fetch status %d, fw %p\n",
>> +		uc_fw->uc_name, uc_fw->uc_fw_fetch_status, uc_fw->uc_fw_blob);
>> +
>> +	fw = uc_fw->uc_fw_blob;
>> +	if (!fw) {
>> +		/* no firmware found; try again in case FS was not mounted */
>> +		DRM_DEBUG_DRIVER("retry fetching %s fw from <%s>\n",
>> +			uc_fw->uc_name, uc_fw->uc_fw_path);
>> +		if (request_firmware(&fw, uc_fw->uc_fw_path, &dev->pdev->dev))
>> +			goto fail;
>> +		if (!fw)
>> +			goto fail;
>> +		DRM_DEBUG_DRIVER("fetch %s fw from <%s> succeeded, fw %p\n",
>> +			uc_fw->uc_name, uc_fw->uc_fw_path, fw);
>> +		uc_fw->uc_fw_blob = fw;
>> +	}
>> +
>> +	/* Callback to the optional uC-specific function, if supplied */
>> +	if (callback && !callback(uc_fw))
>> +		goto fail;
>> +
>> +	/* Callback may have done the object allocation & write itself */
>> +	obj = uc_fw->uc_fw_obj;
>> +	if (!obj) {
>> +		size_t pages = round_up(fw->size, PAGE_SIZE);
>> +		obj = i915_gem_alloc_object(dev, pages);
>> +		if (!obj)
>> +			goto fail;
>> +
>> +		uc_fw->uc_fw_obj = obj;
>> +		uc_fw->uc_fw_size = fw->size;
>> +		if (i915_gem_object_write(obj, fw->data, fw->size))
>> +			goto fail;
>> +	}
>> +
>> +	DRM_DEBUG_DRIVER("%s fw fetch status SUCCESS\n", uc_fw->uc_name);
>> +	release_firmware(fw);
>> +	uc_fw->uc_fw_blob = NULL;
>> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_SUCCESS;
>> +	return;
>> +
>> +fail:
>> +	DRM_DEBUG_DRIVER("%s fw fetch status FAIL; fw %p, obj %p\n",
>> +		uc_fw->uc_name, fw, uc_fw->uc_fw_obj);
>> +	DRM_ERROR("Failed to fetch %s firmware from <%s>\n",
>> +		  uc_fw->uc_name, uc_fw->uc_fw_path);
>> +
>> +	obj = uc_fw->uc_fw_obj;
>> +	if (obj)
>> +		drm_gem_object_unreference(&obj->base);
>> +	uc_fw->uc_fw_obj = NULL;
>> +
>> +	release_firmware(fw);		/* OK even if fw is NULL */
>> +	uc_fw->uc_fw_blob = NULL;
>> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
>> +}
>> +
>> +/**
>> + * intel_uc_fw_check() - check the status of the firmware fetching process
>> + * @uc_fw:	intel_uc_fw structure
>> + * @callback:	optional callback function to validate and/or save the image
>> + *
>> + * If the fetch is still PENDING, wait for completion first, then check and
>> + * return the outcome. Subsequent calls will just return the same outcome
>> + * based on the recorded fetch status, without triggering another fetch
>> + * and without calling @callback().
>> + *
>> + * After this call, @uc_fw->uc_fw_fetch_status will show whether the firmware
>> + * image was successfully fetched and transferred to a GEM object. If it is
>> + * INTEL_UC_FIRMWARE_SUCCESS, @uc_fw->uc_fw_obj will be point to the GEM
>> + * object, and the size of the image will be in @uc_fw->uc_fw_size.  For any
>> + * other status value, these members are undefined.
>> + *
>> + * The @callback() parameter allows the uC-specific code to validate the
>> + * image before it is saved, and also to override the default save mechanism
>> + * if required. When it is called, @uc_fw->uc_fw_blob refers to the fetched
>> + * firmware image, and @uc_fw->uc_fw_obj is NULL.
>> + *
>> + * If @callback() returns FALSE, the fetched image is considered invalid.
>> + * The fetch status will be set to FAIL, and this function will return -EIO.
>> + *
>> + * If @callback() returns TRUE but doesn't set @uc_fw->uc_fw_obj, the image
>> + * is considered good; it will be saved in a GEM object as described above.
>> + * This is the default if no @callback() is supplied.
>> + *
>> + * If @callback() returns TRUE after setting @uc_fw->uc_fw_obj, this means
>> + * that the image has already been saved by @callback() itself. This allows
>> + * @callback() to customise the format of the data in the GEM object, for
>> + * example if it needs to save only a portion of the loaded image.
>> + *
>> + * In all cases the firmware blob is released before this function returns.
>> + *
>> + * Return:	non-zero code on error
>> + */
>> +int
>> +intel_uc_fw_check(struct intel_uc_fw *uc_fw,
>> +		  bool callback(struct intel_uc_fw *))
>> +{
>> +	WARN_ON(!mutex_is_locked(&uc_fw->uc_dev->struct_mutex));
>> +
>> +	if (uc_fw->uc_fw_fetch_status == INTEL_UC_FIRMWARE_PENDING) {
>> +		/* We only come here once */
>> +		uc_fw_fetch_wait(uc_fw, callback);
>> +		/* state must now be FAIL or SUCCESS */
>> +	}
>> +
>> +	DRM_DEBUG_DRIVER("%s fw fetch status %d\n",
>> +		uc_fw->uc_name, uc_fw->uc_fw_fetch_status);
>> +
>> +	switch (uc_fw->uc_fw_fetch_status) {
>> +	case INTEL_UC_FIRMWARE_FAIL:
>> +		/* something went wrong :( */
>> +		return -EIO;
>> +
>> +	case INTEL_UC_FIRMWARE_NONE:
>> +		/* no firmware, nothing to do (not an error) */
>> +		return 0;
>> +
>> +	case INTEL_UC_FIRMWARE_PENDING:
>> +	default:
>> +		/* "can't happen" */
>> +		WARN_ONCE(1, "%s fw <%s> invalid uc_fw_fetch_status %d!\n",
>> +			uc_fw->uc_name, uc_fw->uc_fw_path,
>> +			uc_fw->uc_fw_fetch_status);
>> +		return -ENXIO;
>> +
>> +	case INTEL_UC_FIRMWARE_SUCCESS:
>> +		return 0;
>> +	}
>> +}
>> +
>> +/*
>> + * Callback from the kernel's asynchronous firmware-fetching subsystem.
>> + * All we have to do here is stash the blob and signal completion.
>> + * Error checking (e.g. no firmware found) is left to mainline code.
>> + * We don't have (and don't want or need to acquire) the struct_mutex here.
>> + */
>> +static void
>> +uc_fw_fetch_callback(const struct firmware *fw, void *context)
>> +{
>> +	struct intel_uc_fw *uc_fw = context;
>> +
>> +	WARN_ON(uc_fw->uc_fw_fetch_status != INTEL_UC_FIRMWARE_PENDING);
>> +	DRM_DEBUG_DRIVER("%s firmware fetch from <%s> status %d, fw %p\n",
>> +			uc_fw->uc_name, uc_fw->uc_fw_path,
>> +			uc_fw->uc_fw_fetch_status, fw);
>> +
>> +	uc_fw->uc_fw_blob = fw;
>> +	complete(&uc_fw->uc_fw_fetched);
>> +}
>> +
>> +/**
>> + * intel_uc_fw_init() - initiate the fetching of firmware
>> + * @dev:	drm device
>> + * @uc_fw:	intel_uc_fw structure
>> + * @name:	human-readable device name (e.g. "GuC") for messages
>> + * @fw_path:	(trailing parts of) path to firmware (e.g. "i915/guc_fw.bin")
>> + * 		@fw_path == NULL means "no firmware expected" (not an error),
>> + * 		@fw_path == "" (empty string) means "firmware unknown" i.e.
>> + * 		the uC requires firmware, but the driver doesn't know where
>> + * 		to find the proper version. This will be logged as an error.
>> + *
>> + * This is called just once per uC, during driver loading. It is therefore
>> + * automatically single-threaded and does not need to acquire any mutexes
>> + * or spinlocks. OTOH, GEM is not yet fully initialised, so we can't do
>> + * very much here.
>> + *
>> + * The main task here is to initiate the fetching of the uC firmware into
>> + * memory, using the standard kernel firmware fetching support.  The actual
>> + * fetching will then proceed asynchronously and in parallel with the rest
>> + * of driver initialisation; later in the loading process we will synchronise
>> + * with the firmware-fetching thread before transferring the firmware image
>> + * firstly into a GEM object and then into the uC's memory.
>> + */
>> +void
>> +intel_uc_fw_init(struct drm_device *dev, struct intel_uc_fw *uc_fw,
>> +		 const char *name, const char *fw_path)
>> +{
>> +	uc_fw->uc_dev = dev;
>> +	uc_fw->uc_name = name;
>> +	uc_fw->uc_fw_path = fw_path;
>> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_NONE;
>> +	uc_fw->uc_fw_load_status = INTEL_UC_FIRMWARE_NONE;
>> +	init_completion(&uc_fw->uc_fw_fetched);
>> +
>> +	if (fw_path == NULL)
>> +		return;
>> +
>> +	if (*fw_path == '\0') {
>> +		DRM_ERROR("No %s firmware known for this platform\n", name);
>> +		uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
>> +		return;
>> +	}
>> +
>> +	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_PENDING;
>> +
>> +	if (request_firmware_nowait(THIS_MODULE, true, fw_path,
>> +				    &dev->pdev->dev,
>> +				    GFP_KERNEL, uc_fw,
>> +				    uc_fw_fetch_callback)) {
>> +		DRM_ERROR("Failed to request %s firmware from <%s>\n",
>> +			  name, fw_path);
>> +		uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
>> +		return;
>> +	}
>> +
>> +	/* firmware fetch initiated, callback will signal completion */
>> +	DRM_DEBUG_DRIVER("initiated fetching %s firmware from <%s>\n",
>> +		name, fw_path);
>> +}
>> +
>> +/**
>> + * intel_uc_fw_fini() - clean up all uC firmware-related data
>> + * @uc_fw:	intel_uc_fw structure
>> + */
>> +void
>> +intel_uc_fw_fini(struct intel_uc_fw *uc_fw)
>> +{
>> +	WARN_ON(!mutex_is_locked(&uc_fw->uc_dev->struct_mutex));
>> +
>> +	/*
>> +	 * Generally, the blob should have been released earlier, but
>> +	 * if the driver load were aborted after the fetch had been
>> +	 * initiated but not completed it might still be around
>> +	 */
>> +	if (uc_fw->uc_fw_fetch_status == INTEL_UC_FIRMWARE_PENDING)
>> +		wait_for_completion(&uc_fw->uc_fw_fetched);
>> +	release_firmware(uc_fw->uc_fw_blob);	/* OK even if NULL */
>> +	uc_fw->uc_fw_blob = NULL;
>> +
>> +	if (uc_fw->uc_fw_obj)
>> +		drm_gem_object_unreference(&uc_fw->uc_fw_obj->base);
>> +	uc_fw->uc_fw_obj = NULL;
>> +}
>> diff --git a/drivers/gpu/drm/i915/intel_uc_loader.h b/drivers/gpu/drm/i915/intel_uc_loader.h
>> new file mode 100644
>> index 0000000..22502ea
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/intel_uc_loader.h
>> @@ -0,0 +1,82 @@
>> +/*
>> + * Copyright © 2014 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + * Author:
>> + *	Dave Gordon <david.s.gordon@intel.com>
>> + */
>> +#ifndef _INTEL_UC_LOADER_H
>> +#define _INTEL_UC_LOADER_H
>> +
>> +/*
>> + * Microcontroller (uC) firmware loading support
>> + */
>> +
>> +/*
>> + * These values are used to track the stages of getting the required firmware
>> + * into an onboard microcontroller. The common code tracks the phases of
>> + * fetching the firmware (aka "binary blob") from an external file into a GEM
>> + * object in the 'uc_fw_fetch_status' field below; the uC-specific DMA code
>> + * uses the 'uc_fw_load_status' field to track the transfer from GEM object
>> + * to uC memory.
>> + *
>> + * For the first (fetch) stage, the interpretation of the values is:
>> + * NONE - no firmware is being fetched e.g. because there is no uC
>> + * PENDING - firmware fetch initiated; callback will complete 'uc_fw_fetched'
>> + * SUCCESS - uC firmware fetched into a GEM object and ready for use
>> + * FAIL - something went wrong; uC firmware is not available
>> + *
>> + * The second (load) stage is simpler as there is no asynchronous handoff:
>> + * NONE - no firmware is being loaded e.g. because there is no uC
>> + * PENDING - firmware DMA load in progress
>> + * SUCCESS - uC firmware loaded into uC memory and ready for use
>> + * FAIL - something went wrong; uC firmware is not available
>> + */
>> +enum intel_uc_fw_status {
>> +	INTEL_UC_FIRMWARE_FAIL = -1,
>> +	INTEL_UC_FIRMWARE_NONE = 0,
>> +	INTEL_UC_FIRMWARE_PENDING,
>> +	INTEL_UC_FIRMWARE_SUCCESS
>> +};
>> +
>> +/*
>> + * This structure encapsulates all the data needed during the process of
>> + * fetching, caching, and loading the firmware image into the uC.
>> + */
>> +struct intel_uc_fw {
>> +	struct drm_device *		uc_dev;
>> +	const char *			uc_name;
>> +	const char *			uc_fw_path;
>> +	const struct firmware *		uc_fw_blob;
>> +	struct completion		uc_fw_fetched;
>> +	size_t				uc_fw_size;
>> +	struct drm_i915_gem_object *	uc_fw_obj;
>> +	enum intel_uc_fw_status		uc_fw_fetch_status;
>> +	enum intel_uc_fw_status		uc_fw_load_status;
>> +};
>> +
>> +void intel_uc_fw_init(struct drm_device *dev, struct intel_uc_fw *uc_fw,
>> +		const char *uc_name, const char *fw_path);
>> +int intel_uc_fw_check(struct intel_uc_fw *uc_fw,
>> +		bool callback(struct intel_uc_fw *));
>> +void intel_uc_fw_fini(struct intel_uc_fw *uc_fw);
>> +
>> +#endif
>> -- 
>> 1.7.9.5
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
Daniel Vetter June 18, 2015, 2:49 p.m. UTC | #3
On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote:
> On 17/06/15 13:05, Daniel Vetter wrote:
> > On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
> >> Current devices may contain one or more programmable microcontrollers
> >> that need to have a firmware image (aka "binary blob") loaded from an
> >> external medium and transferred to the device's memory.
> >>
> >> This file provides generic support functions for doing this; they can
> >> then be used by each uC-specific loader, thus reducing code duplication
> >> and testing effort.
> >>
> >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> >> Signed-off-by: Alex Dai <yu.dai@intel.com>
> > 
> > Given that I'm just shredding the synchronization used by the dmc loader
> > I'm not convinced this is a good idea. Abstraction has cost, and a bit of
> > copy-paste for similar sounding but slightly different things doesn't
> > sound awful to me. And the critical bit in all the firmware loading I've
> > seen thus far is in synchronizing the loading with other operations,
> > hiding that isn't a good idea. Worse if we enforce stuff like requiring
> > dev->struct_mutex.
> > -Daniel
> 
> It's precisely because it's in some sense "trivial-but-tricky" that we
> should write it once, get it right, and use it everywhere. Copypaste
> /does/ sound awful; I've seen how the code this was derived from had
> already been cloned into three flavours, all different and all wrong.
> 
> It's a very simple abstraction: one early call to kick things off as
> early as possible, no locking required. One late call with the
> struct_mutex held to complete the synchronisation and actually do the
> work, thus guaranteeing that the transfer to the target uC is done in a
> controlled fashion, at a time of the caller's choice, and by the
> driver's mainline thread, NOT by an asynchronous thread racing with
> other activity (which was one of the things wrong with the original
> version).

Yeah I've seen the origins of this in the display code, and that code gets
the syncing wrong. The only thing that one has do to is grab a runtime pm
reference for the appropriate power well to prevent dc5 entry, and release
it when the firmware is loaded and initialized.

Which means any kind of firmware loader which requires/uses
dev->struct_mutex get stuff wrong and is not appropriate everywhere.

> We should convert the DMC loader to use this too, so there need be only
> one bit of code in the whole driver that needs to understand how to use
> completions to get correct handover from a free-running no-locks-held
> thread to the properly disciplined environment of driver mainline for
> purposes of programming the h/w.

Nack on using this for dmc, since I want them to convert it to the above
synchronization, since that's how all the other async power initialization
is done.

Guc is different since we really must have it ready for execbuf, and for
that usecase a completion at drm_open time sounds like the right thing.

As a rule of thumb for refactoring and share infastructure we use the
following recipe in drm:
- first driver implements things as straightforward as possible
- 2nd user copypastes
- 3rd one has the duty to figure out whether some refactoring is in order
  or not.

Imo that approach leads a really good balance between avoiding
overengineering and having maintainable code.
-Daniel
Chris Wilson June 18, 2015, 3:27 p.m. UTC | #4
On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote:
> Guc is different since we really must have it ready for execbuf, and for
> that usecase a completion at drm_open time sounds like the right thing.

But do we? It would be nice if we had a definite answer that the hw was
ready before we started using it in anger, but I don't see any reason
why we would have to delay userspace for a slow microcode update...

(This presupposes that userspace batches are unaffected by GuC/execlist
setup, which for userspace sanity I hope they are - or at least using
predicate registers and conditional execution.)
-Chris
Daniel Vetter June 18, 2015, 3:35 p.m. UTC | #5
On Thu, Jun 18, 2015 at 04:27:52PM +0100, Chris Wilson wrote:
> On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote:
> > Guc is different since we really must have it ready for execbuf, and for
> > that usecase a completion at drm_open time sounds like the right thing.
> 
> But do we? It would be nice if we had a definite answer that the hw was
> ready before we started using it in anger, but I don't see any reason
> why we would have to delay userspace for a slow microcode update...
> 
> (This presupposes that userspace batches are unaffected by GuC/execlist
> setup, which for userspace sanity I hope they are - or at least using
> predicate registers and conditional execution.)

Well I figured a wait_completion or flush_work unconditionally in execbuf
is not to your liking, and it's better to keep that in open. But I think
we should be able to get away with this at execbuf time. Might even be
better since this wouldn't block sw-rendered boot-splashs.

But either way should be suitable I think.
-Daniel
Chris Wilson June 18, 2015, 3:49 p.m. UTC | #6
On Thu, Jun 18, 2015 at 05:35:29PM +0200, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 04:27:52PM +0100, Chris Wilson wrote:
> > On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote:
> > > Guc is different since we really must have it ready for execbuf, and for
> > > that usecase a completion at drm_open time sounds like the right thing.
> > 
> > But do we? It would be nice if we had a definite answer that the hw was
> > ready before we started using it in anger, but I don't see any reason
> > why we would have to delay userspace for a slow microcode update...
> > 
> > (This presupposes that userspace batches are unaffected by GuC/execlist
> > setup, which for userspace sanity I hope they are - or at least using
> > predicate registers and conditional execution.)
> 
> Well I figured a wait_completion or flush_work unconditionally in execbuf
> is not to your liking, and it's better to keep that in open. But I think
> we should be able to get away with this at execbuf time. Might even be
> better since this wouldn't block sw-rendered boot-splashs.
> 
> But either way should be suitable I think.

I am optimistic that we can make the request interface robust enough to be
able queue up not only the ring initialisation and ppgtt initialisation
requests, but also userspace requests. If it all works out, we only need
to truly worry about microcode completion in hangcheck.
-Chris
Dave Gordon June 19, 2015, 8:43 a.m. UTC | #7
On 18/06/15 15:49, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote:
>> On 17/06/15 13:05, Daniel Vetter wrote:
>>> On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
>>>> Current devices may contain one or more programmable microcontrollers
>>>> that need to have a firmware image (aka "binary blob") loaded from an
>>>> external medium and transferred to the device's memory.
>>>>
>>>> This file provides generic support functions for doing this; they can
>>>> then be used by each uC-specific loader, thus reducing code duplication
>>>> and testing effort.
>>>>
>>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>>>> Signed-off-by: Alex Dai <yu.dai@intel.com>
>>>
>>> Given that I'm just shredding the synchronization used by the dmc loader
>>> I'm not convinced this is a good idea. Abstraction has cost, and a bit of
>>> copy-paste for similar sounding but slightly different things doesn't
>>> sound awful to me. And the critical bit in all the firmware loading I've
>>> seen thus far is in synchronizing the loading with other operations,
>>> hiding that isn't a good idea. Worse if we enforce stuff like requiring
>>> dev->struct_mutex.
>>> -Daniel
>>
>> It's precisely because it's in some sense "trivial-but-tricky" that we
>> should write it once, get it right, and use it everywhere. Copypaste
>> /does/ sound awful; I've seen how the code this was derived from had
>> already been cloned into three flavours, all different and all wrong.
>>
>> It's a very simple abstraction: one early call to kick things off as
>> early as possible, no locking required. One late call with the
>> struct_mutex held to complete the synchronisation and actually do the
>> work, thus guaranteeing that the transfer to the target uC is done in a
>> controlled fashion, at a time of the caller's choice, and by the
>> driver's mainline thread, NOT by an asynchronous thread racing with
>> other activity (which was one of the things wrong with the original
>> version).
> 
> Yeah I've seen the origins of this in the display code, and that code gets
> the syncing wrong. The only thing that one has do to is grab a runtime pm
> reference for the appropriate power well to prevent dc5 entry, and release
> it when the firmware is loaded and initialized.

Agreed.

> Which means any kind of firmware loader which requires/uses
> dev->struct_mutex get stuff wrong and is not appropriate everywhere.

BUT, the loading of the firmware into any uC MUST be done in a
controlled manner i.e. at a time when no other thread is touching the
h/w. Otherwise the f/w load and whatever else is concurrently accessing
the h/w could in some cases interfere disastrously. Examples of
interference might be:

* interleaved accesses to the ELSP (in the case of the GuC)
* incorrect handover of power management (DMC, GuC)
* erroneous management of forcewake state

In general the f/w that is just starting on the uC may have certain
expectations about the initial state of the h/w, which may not be met if
other threads are accessing various bits of h/w while the uC is booting up.

So we absolutely need to guarantee that the f/w load is done by a thread
which has exclusive ownership of any bit of the h/w that the f/w is
going to make assumptions about. With the current locking structure of
the driver, that means holding the struct_mutex (it shouldn't really,
there should be a separate mutex for h/w register access vs.
driver-private data structures, but there isn't).

>> We should convert the DMC loader to use this too, so there need be only
>> one bit of code in the whole driver that needs to understand how to use
>> completions to get correct handover from a free-running no-locks-held
>> thread to the properly disciplined environment of driver mainline for
>> purposes of programming the h/w.
> 
> Nack on using this for dmc, since I want them to convert it to the above
> synchronization, since that's how all the other async power initialization
> is done.
> 
> Guc is different since we really must have it ready for execbuf, and for
> that usecase a completion at drm_open time sounds like the right thing.
> 
> As a rule of thumb for refactoring and share infastructure we use the
> following recipe in drm:
> - first driver implements things as straightforward as possible
> - 2nd user copypastes
> - 3rd one has the duty to figure out whether some refactoring is in order
>   or not.
> Imo that approach leads a really good balance between avoiding
> overengineering and having maintainable code.
> -Daniel

We've already been through these phases; the code has already been
cloned twice (and then changed, but not enough to fix the problems with
the original) and then when I found the issues with the GuC loader and
noticed the hilarious ownership dance it was doing during handover I
realised it was time to fix it in one place rather than several, and
posted a patchset to the internal mailing list on 2015-02-24 with this
commentary:

> The GuC loader uses an asynchronous thread to fetch the firmware image
> (aka "binary blob") from a file and load it into the GuC's memory.
> Unfortunately the GuC loading occurs *after* the internally-generated
> batches used to initialise contexts have already been submitted using
> direct access to the ELSP.  Also, the firmware ends up being loaded at
> an indeterminate time, with consequent potential for confusion in the
> switchover from ELSP- to GuC-based submission.
> 
> This patch series therefore reorganises the GuC loader to ensure that
> the loading process occurs both early enough and at a well-defined
> point in the sequence of operations during driver initialisation,
> specifically *before* any batches are submitted to hardware.
> 
> [PATCH 1/3] GuC: reorganise source before rewriting this code
> [PATCH 2/3] GuC: load firmware image from main thread
> [PATCH 3/3] GuC: update names & comments ("load" => "fetch")

followed by [PATCH 0/2] unify and tidy firmware loading code
on 2015-03-02.

For the DMC module, the basic conversion process is to separate
intel_csr_load_program() from finish_csr_load(). The latter would remain
as the callback in the async thread loading process that has to validate
the loaded image; the former would then become the callback for the
synchronous post-handover transfer of the image to the h/w.

BTW, the existing DMC loader probably won't work on Android :(

.Dave.
Daniel Vetter June 24, 2015, 10:29 a.m. UTC | #8
On Fri, Jun 19, 2015 at 09:43:11AM +0100, Dave Gordon wrote:
> On 18/06/15 15:49, Daniel Vetter wrote:
> > On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote:
> >> On 17/06/15 13:05, Daniel Vetter wrote:
> >>> On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
> >>>> Current devices may contain one or more programmable microcontrollers
> >>>> that need to have a firmware image (aka "binary blob") loaded from an
> >>>> external medium and transferred to the device's memory.
> >>>>
> >>>> This file provides generic support functions for doing this; they can
> >>>> then be used by each uC-specific loader, thus reducing code duplication
> >>>> and testing effort.
> >>>>
> >>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> >>>> Signed-off-by: Alex Dai <yu.dai@intel.com>
> >>>
> >>> Given that I'm just shredding the synchronization used by the dmc loader
> >>> I'm not convinced this is a good idea. Abstraction has cost, and a bit of
> >>> copy-paste for similar sounding but slightly different things doesn't
> >>> sound awful to me. And the critical bit in all the firmware loading I've
> >>> seen thus far is in synchronizing the loading with other operations,
> >>> hiding that isn't a good idea. Worse if we enforce stuff like requiring
> >>> dev->struct_mutex.
> >>> -Daniel
> >>
> >> It's precisely because it's in some sense "trivial-but-tricky" that we
> >> should write it once, get it right, and use it everywhere. Copypaste
> >> /does/ sound awful; I've seen how the code this was derived from had
> >> already been cloned into three flavours, all different and all wrong.
> >>
> >> It's a very simple abstraction: one early call to kick things off as
> >> early as possible, no locking required. One late call with the
> >> struct_mutex held to complete the synchronisation and actually do the
> >> work, thus guaranteeing that the transfer to the target uC is done in a
> >> controlled fashion, at a time of the caller's choice, and by the
> >> driver's mainline thread, NOT by an asynchronous thread racing with
> >> other activity (which was one of the things wrong with the original
> >> version).
> > 
> > Yeah I've seen the origins of this in the display code, and that code gets
> > the syncing wrong. The only thing that one has do to is grab a runtime pm
> > reference for the appropriate power well to prevent dc5 entry, and release
> > it when the firmware is loaded and initialized.
> 
> Agreed.
> 
> > Which means any kind of firmware loader which requires/uses
> > dev->struct_mutex get stuff wrong and is not appropriate everywhere.
> 
> BUT, the loading of the firmware into any uC MUST be done in a
> controlled manner i.e. at a time when no other thread is touching the
> h/w. Otherwise the f/w load and whatever else is concurrently accessing
> the h/w could in some cases interfere disastrously. Examples of
> interference might be:
> 
> * interleaved accesses to the ELSP (in the case of the GuC)
> * incorrect handover of power management (DMC, GuC)
> * erroneous management of forcewake state
> 
> In general the f/w that is just starting on the uC may have certain
> expectations about the initial state of the h/w, which may not be met if
> other threads are accessing various bits of h/w while the uC is booting up.
> 
> So we absolutely need to guarantee that the f/w load is done by a thread
> which has exclusive ownership of any bit of the h/w that the f/w is
> going to make assumptions about. With the current locking structure of
> the driver, that means holding the struct_mutex (it shouldn't really,
> there should be a separate mutex for h/w register access vs.
> driver-private data structures, but there isn't).

If you really need this guarantee (and I seriously hope not) then the only
option is a synchronous firmware load at driver init _before_ we launch
any of the asynchronous setup code. And there is already a lot of that,
and we're adding more all the time.

What I expect we need is synchronization of just the revelant part with
the firmware loading, which necessarily needs to be somewhat async to be
able to support cros/android requirements. And yes that needs to be done
in a controlled manner, but most likely we need very specific solutions
for the problem at hand. Unconditionally holding dev->struct_mutex isn't
that solution.

The other problem with dev->struct_mutex is that it's a giantic lock with
ill defined coverage and semantics. It's imo the biggest piece of
technical debt we carry around in i915.ko, and we pay the price for that
dearly&daily. Which means that since a few years any kind of code
which extended dev->struct_mutex to anything not clearly core gem data
structures was rejected.

> >> We should convert the DMC loader to use this too, so there need be only
> >> one bit of code in the whole driver that needs to understand how to use
> >> completions to get correct handover from a free-running no-locks-held
> >> thread to the properly disciplined environment of driver mainline for
> >> purposes of programming the h/w.
> > 
> > Nack on using this for dmc, since I want them to convert it to the above
> > synchronization, since that's how all the other async power initialization
> > is done.
> > 
> > Guc is different since we really must have it ready for execbuf, and for
> > that usecase a completion at drm_open time sounds like the right thing.
> > 
> > As a rule of thumb for refactoring and share infastructure we use the
> > following recipe in drm:
> > - first driver implements things as straightforward as possible
> > - 2nd user copypastes
> > - 3rd one has the duty to figure out whether some refactoring is in order
> >   or not.
> > Imo that approach leads a really good balance between avoiding
> > overengineering and having maintainable code.
> 
> We've already been through these phases; the code has already been
> cloned twice (and then changed, but not enough to fix the problems with
> the original) and then when I found the issues with the GuC loader and
> noticed the hilarious ownership dance it was doing during handover I
> realised it was time to fix it in one place rather than several, and
> posted a patchset to the internal mailing list on 2015-02-24 with this
> commentary:
> 
> > The GuC loader uses an asynchronous thread to fetch the firmware image
> > (aka "binary blob") from a file and load it into the GuC's memory.
> > Unfortunately the GuC loading occurs *after* the internally-generated
> > batches used to initialise contexts have already been submitted using
> > direct access to the ELSP.  Also, the firmware ends up being loaded at
> > an indeterminate time, with consequent potential for confusion in the
> > switchover from ELSP- to GuC-based submission.
> > 
> > This patch series therefore reorganises the GuC loader to ensure that
> > the loading process occurs both early enough and at a well-defined
> > point in the sequence of operations during driver initialisation,
> > specifically *before* any batches are submitted to hardware.
> > 
> > [PATCH 1/3] GuC: reorganise source before rewriting this code
> > [PATCH 2/3] GuC: load firmware image from main thread
> > [PATCH 3/3] GuC: update names & comments ("load" => "fetch")
> 
> followed by [PATCH 0/2] unify and tidy firmware loading code
> on 2015-03-02.
> 
> For the DMC module, the basic conversion process is to separate
> intel_csr_load_program() from finish_csr_load(). The latter would remain
> as the callback in the async thread loading process that has to validate
> the loaded image; the former would then become the callback for the
> synchronous post-handover transfer of the image to the h/w.
> 
> BTW, the existing DMC loader probably won't work on Android :(

Yeah I completely missed out on this fun since I presumed that firmware
loading is easy and simple. And if you look around on other drm drivers it
indeed is, they all use a synchronous request_firmware and if the firmware
isn't there, they just fall over (fully in the case of radeon.ko,
partially in the case of nouveau.ko since they have all the support in
place for handling a kms-only accel-less gpu in userspace anyway, like we
do). But for a bunch of reasons (afaik it's "you can't include a blob in a
gpl-ed kernel image" we need async firmware loading for cros&android).

That leaves us with a situation where we should have done a special design
discussion about asynchronous firmware, but somehow failed do to that.
Which leaves us in a very ugly position.

I talked with a bunch of people over the past few days to figure out how
this is supposed to work and also figure out why it's being done like that
today. I think I have a reasonable good plan for moving forward too. I'll
start a new top-level thread here to discuss this.

Thanks, Daneil
Dave Gordon July 6, 2015, 12:44 p.m. UTC | #9
On 24/06/15 11:29, Daniel Vetter wrote:
> On Fri, Jun 19, 2015 at 09:43:11AM +0100, Dave Gordon wrote:
>> On 18/06/15 15:49, Daniel Vetter wrote:
>>> On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote:
>>>> On 17/06/15 13:05, Daniel Vetter wrote:
>>>>> On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
>>>>>> Current devices may contain one or more programmable microcontrollers
>>>>>> that need to have a firmware image (aka "binary blob") loaded from an
>>>>>> external medium and transferred to the device's memory.
>>>>>>
>>>>>> This file provides generic support functions for doing this; they can
>>>>>> then be used by each uC-specific loader, thus reducing code duplication
>>>>>> and testing effort.
>>>>>>
>>>>>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>>>>>> Signed-off-by: Alex Dai <yu.dai@intel.com>
>>>>>
>>>>> Given that I'm just shredding the synchronization used by the dmc loader
>>>>> I'm not convinced this is a good idea. Abstraction has cost, and a bit of
>>>>> copy-paste for similar sounding but slightly different things doesn't
>>>>> sound awful to me. And the critical bit in all the firmware loading I've
>>>>> seen thus far is in synchronizing the loading with other operations,
>>>>> hiding that isn't a good idea. Worse if we enforce stuff like requiring
>>>>> dev->struct_mutex.
>>>>> -Daniel
>>>>
>>>> It's precisely because it's in some sense "trivial-but-tricky" that we
>>>> should write it once, get it right, and use it everywhere. Copypaste
>>>> /does/ sound awful; I've seen how the code this was derived from had
>>>> already been cloned into three flavours, all different and all wrong.
>>>>
>>>> It's a very simple abstraction: one early call to kick things off as
>>>> early as possible, no locking required. One late call with the
>>>> struct_mutex held to complete the synchronisation and actually do the
>>>> work, thus guaranteeing that the transfer to the target uC is done in a
>>>> controlled fashion, at a time of the caller's choice, and by the
>>>> driver's mainline thread, NOT by an asynchronous thread racing with
>>>> other activity (which was one of the things wrong with the original
>>>> version).
>>>
>>> Yeah I've seen the origins of this in the display code, and that code gets
>>> the syncing wrong. The only thing that one has do to is grab a runtime pm
>>> reference for the appropriate power well to prevent dc5 entry, and release
>>> it when the firmware is loaded and initialized.
>>
>> Agreed.
>>
>>> Which means any kind of firmware loader which requires/uses
>>> dev->struct_mutex get stuff wrong and is not appropriate everywhere.
>>
>> BUT, the loading of the firmware into any uC MUST be done in a
>> controlled manner i.e. at a time when no other thread is touching the
>> h/w. Otherwise the f/w load and whatever else is concurrently accessing
>> the h/w could in some cases interfere disastrously. Examples of
>> interference might be:
>>
>> * interleaved accesses to the ELSP (in the case of the GuC)
>> * incorrect handover of power management (DMC, GuC)
>> * erroneous management of forcewake state
>>
>> In general the f/w that is just starting on the uC may have certain
>> expectations about the initial state of the h/w, which may not be met if
>> other threads are accessing various bits of h/w while the uC is booting up.
>>
>> So we absolutely need to guarantee that the f/w load is done by a thread
>> which has exclusive ownership of any bit of the h/w that the f/w is
>> going to make assumptions about. With the current locking structure of
>> the driver, that means holding the struct_mutex (it shouldn't really,
>> there should be a separate mutex for h/w register access vs.
>> driver-private data structures, but there isn't).
>
> If you really need this guarantee (and I seriously hope not) then the only
> option is a synchronous firmware load at driver init _before_ we launch
> any of the asynchronous setup code. And there is already a lot of that,
> and we're adding more all the time.
>
> What I expect we need is synchronization of just the revelant part with
> the firmware loading, which necessarily needs to be somewhat async to be
> able to support cros/android requirements. And yes that needs to be done
> in a controlled manner, but most likely we need very specific solutions
> for the problem at hand. Unconditionally holding dev->struct_mutex isn't
> that solution.
>
> The other problem with dev->struct_mutex is that it's a giantic lock with
> ill defined coverage and semantics. It's imo the biggest piece of
> technical debt we carry around in i915.ko, and we pay the price for that
> dearly&daily. Which means that since a few years any kind of code
> which extended dev->struct_mutex to anything not clearly core gem data
> structures was rejected.

Oh, I quite agree that the struct_mutex is an abomination and would 
certainly like to eliminate it. But at the moment it's the only 
sufficiently large-scale synchronisation operation available to ensure 
that (for example) we don't try to load the f/w at the same time that 
another thread is trying to reset the h/w.

None of this loader code really needs the struct_mutex specifically; the 
WARN_ON macros were just there to help callers know what degree of 
synchronisation they need to organise before calling these functions.

[snip]

>> BTW, the existing DMC loader probably won't work on Android :(
>
> Yeah I completely missed out on this fun since I presumed that firmware
> loading is easy and simple. And if you look around on other drm drivers it
> indeed is, they all use a synchronous request_firmware and if the firmware
> isn't there, they just fall over (fully in the case of radeon.ko,
> partially in the case of nouveau.ko since they have all the support in
> place for handling a kms-only accel-less gpu in userspace anyway, like we
> do). But for a bunch of reasons (afaik it's "you can't include a blob in a
> gpl-ed kernel image" we need async firmware loading for cros&android).
>
> That leaves us with a situation where we should have done a special design
> discussion about asynchronous firmware, but somehow failed do to that.
> Which leaves us in a very ugly position.
>
> I talked with a bunch of people over the past few days to figure out how
> this is supposed to work and also figure out why it's being done like that
> today. I think I have a reasonable good plan for moving forward too. I'll
> start a new top-level thread here to discuss this.
>
> Thanks, Daniel

It really isn't "asynchronous", it's just "deferred" -- but implying 
that everything that relies on having the firmware available also has to 
be deferred. For the DMC, that means we can't have full PM; and for the 
GuC, we can't submit any batches at all to any engine until the f/w load 
is done.

Really, it would be simpler if we didn't support automatic firmware 
loading in the kernel at all, and had a userland startup process whose 
job was to locate and transfer the required firmware before the device 
could be used. But that would just give us the opposite problem, because 
the display device is one of the things that /should/ be usable even 
before the rootfs is mounted.

Anyway, I've posted a simplified v3 which only supports synchronous 
fetch, without the prefetch capability. See separate thread ...

.Dave.
Daniel Vetter July 6, 2015, 1:24 p.m. UTC | #10
On Mon, Jul 06, 2015 at 01:44:10PM +0100, Dave Gordon wrote:
> On 24/06/15 11:29, Daniel Vetter wrote:
> >On Fri, Jun 19, 2015 at 09:43:11AM +0100, Dave Gordon wrote:
> >>On 18/06/15 15:49, Daniel Vetter wrote:
> >>>On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote:
> >>>>On 17/06/15 13:05, Daniel Vetter wrote:
> >>>>>On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote:
> >>>>>>Current devices may contain one or more programmable microcontrollers
> >>>>>>that need to have a firmware image (aka "binary blob") loaded from an
> >>>>>>external medium and transferred to the device's memory.
> >>>>>>
> >>>>>>This file provides generic support functions for doing this; they can
> >>>>>>then be used by each uC-specific loader, thus reducing code duplication
> >>>>>>and testing effort.
> >>>>>>
> >>>>>>Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> >>>>>>Signed-off-by: Alex Dai <yu.dai@intel.com>
> >>>>>
> >>>>>Given that I'm just shredding the synchronization used by the dmc loader
> >>>>>I'm not convinced this is a good idea. Abstraction has cost, and a bit of
> >>>>>copy-paste for similar sounding but slightly different things doesn't
> >>>>>sound awful to me. And the critical bit in all the firmware loading I've
> >>>>>seen thus far is in synchronizing the loading with other operations,
> >>>>>hiding that isn't a good idea. Worse if we enforce stuff like requiring
> >>>>>dev->struct_mutex.
> >>>>>-Daniel
> >>>>
> >>>>It's precisely because it's in some sense "trivial-but-tricky" that we
> >>>>should write it once, get it right, and use it everywhere. Copypaste
> >>>>/does/ sound awful; I've seen how the code this was derived from had
> >>>>already been cloned into three flavours, all different and all wrong.
> >>>>
> >>>>It's a very simple abstraction: one early call to kick things off as
> >>>>early as possible, no locking required. One late call with the
> >>>>struct_mutex held to complete the synchronisation and actually do the
> >>>>work, thus guaranteeing that the transfer to the target uC is done in a
> >>>>controlled fashion, at a time of the caller's choice, and by the
> >>>>driver's mainline thread, NOT by an asynchronous thread racing with
> >>>>other activity (which was one of the things wrong with the original
> >>>>version).
> >>>
> >>>Yeah I've seen the origins of this in the display code, and that code gets
> >>>the syncing wrong. The only thing that one has do to is grab a runtime pm
> >>>reference for the appropriate power well to prevent dc5 entry, and release
> >>>it when the firmware is loaded and initialized.
> >>
> >>Agreed.
> >>
> >>>Which means any kind of firmware loader which requires/uses
> >>>dev->struct_mutex get stuff wrong and is not appropriate everywhere.
> >>
> >>BUT, the loading of the firmware into any uC MUST be done in a
> >>controlled manner i.e. at a time when no other thread is touching the
> >>h/w. Otherwise the f/w load and whatever else is concurrently accessing
> >>the h/w could in some cases interfere disastrously. Examples of
> >>interference might be:
> >>
> >>* interleaved accesses to the ELSP (in the case of the GuC)
> >>* incorrect handover of power management (DMC, GuC)
> >>* erroneous management of forcewake state
> >>
> >>In general the f/w that is just starting on the uC may have certain
> >>expectations about the initial state of the h/w, which may not be met if
> >>other threads are accessing various bits of h/w while the uC is booting up.
> >>
> >>So we absolutely need to guarantee that the f/w load is done by a thread
> >>which has exclusive ownership of any bit of the h/w that the f/w is
> >>going to make assumptions about. With the current locking structure of
> >>the driver, that means holding the struct_mutex (it shouldn't really,
> >>there should be a separate mutex for h/w register access vs.
> >>driver-private data structures, but there isn't).
> >
> >If you really need this guarantee (and I seriously hope not) then the only
> >option is a synchronous firmware load at driver init _before_ we launch
> >any of the asynchronous setup code. And there is already a lot of that,
> >and we're adding more all the time.
> >
> >What I expect we need is synchronization of just the revelant part with
> >the firmware loading, which necessarily needs to be somewhat async to be
> >able to support cros/android requirements. And yes that needs to be done
> >in a controlled manner, but most likely we need very specific solutions
> >for the problem at hand. Unconditionally holding dev->struct_mutex isn't
> >that solution.
> >
> >The other problem with dev->struct_mutex is that it's a giantic lock with
> >ill defined coverage and semantics. It's imo the biggest piece of
> >technical debt we carry around in i915.ko, and we pay the price for that
> >dearly&daily. Which means that since a few years any kind of code
> >which extended dev->struct_mutex to anything not clearly core gem data
> >structures was rejected.
> 
> Oh, I quite agree that the struct_mutex is an abomination and would
> certainly like to eliminate it. But at the moment it's the only sufficiently
> large-scale synchronisation operation available to ensure that (for example)
> we don't try to load the f/w at the same time that another thread is trying
> to reset the h/w.

I guess this is the crux here - for me part of the big problems around
dev->struct_mutex is that it doesn't just protect data structures to
ensure they're consistent, it also intermingles a lot of lifetime rules
(e.g. holding struct_mutex synchronizes against any final gem bo unref
too). And my experience from fixing up a few horribly misguided locking
designs in drm subsystem is that in general using locks to do
synchronization leads to serious maintainance headaches a few years down
the road. So if you state that only struct_mutex is a big enough lock to
synchronize async guc init then that kills your design for me already. Yes
there's a grey area like always in design topics, but if givin how massive
(and massively undocumented) struct_mutex is I'm against going into that
grey area with extreme prejudice.

I guess I need to write a whitepaper or something about this (it's
commonly accepted wisdom in the drm subsystem), but the summary is
relative simple: Don't use locks to ensure ordering of concurrent
operations or solve lifetime problems. For ordering use waitqueues,
completions or the work/timer specific things to order stuff. For lifetime
problems use refcounting.

Ofc because struct_mutex is such a sprawling beast that's a bit tricky,
but the simple ordering problem should be solved with a flush_work of the
async guc loader or similar. You still need to grab the struct_mutex for
the async loader (because that touches shared gem state, i.e. it's about
data consistency), but not because of ordering issues.

> None of this loader code really needs the struct_mutex specifically; the
> WARN_ON macros were just there to help callers know what degree of
> synchronisation they need to organise before calling these functions.
> 
> [snip]
> 
> >>BTW, the existing DMC loader probably won't work on Android :(
> >
> >Yeah I completely missed out on this fun since I presumed that firmware
> >loading is easy and simple. And if you look around on other drm drivers it
> >indeed is, they all use a synchronous request_firmware and if the firmware
> >isn't there, they just fall over (fully in the case of radeon.ko,
> >partially in the case of nouveau.ko since they have all the support in
> >place for handling a kms-only accel-less gpu in userspace anyway, like we
> >do). But for a bunch of reasons (afaik it's "you can't include a blob in a
> >gpl-ed kernel image" we need async firmware loading for cros&android).
> >
> >That leaves us with a situation where we should have done a special design
> >discussion about asynchronous firmware, but somehow failed do to that.
> >Which leaves us in a very ugly position.
> >
> >I talked with a bunch of people over the past few days to figure out how
> >this is supposed to work and also figure out why it's being done like that
> >today. I think I have a reasonable good plan for moving forward too. I'll
> >start a new top-level thread here to discuss this.
> >
> >Thanks, Daniel
> 
> It really isn't "asynchronous", it's just "deferred" -- but implying that
> everything that relies on having the firmware available also has to be
> deferred. For the DMC, that means we can't have full PM; and for the GuC, we
> can't submit any batches at all to any engine until the f/w load is done.

Yeah I'm ok with just calling it "deferred", as long as we agree on the
meaning of "not run synchronously from driver load". And since there's no
reason to wait for userspace to do anything we might as well run anything
we defer in an async worker (and insert any necessary waits there), which
is why I just call all deferred driver load work "async".

> Really, it would be simpler if we didn't support automatic firmware loading
> in the kernel at all, and had a userland startup process whose job was to
> locate and transfer the required firmware before the device could be used.
> But that would just give us the opposite problem, because the display device
> is one of the things that /should/ be usable even before the rootfs is
> mounted.

request_firmware supports this userland process you're describing above,
if you enable CONFIG_FW_LOADER_USER_HELPER_FALLBACK. Which is what I
suggested Android should use (cros might run into a nack from google
upstream). Unfortunately (special snowflake driver that no one wanted to
convert) request_firmware_nowait does _not_ support this. That's why we
need to use request_firmware + our own worker (otherwise request_firmware
would just block everyone).

Wrt using kms apis before we have all the firmware loaded: That's
definitely I use case we should try to support, unfortunately the current
proposal of blocking for guc loading in gem_open will break that - there's
only one device node for both kms and rendering. Some of the other
proposals for async^Wdeferred gem init (both stalling in add_request or
reusing the reset-in-progress logic) would solve that, but have their own
sets of downsides. These kind of problems are exactly why I want a proper
design discussion about async^Wdeferred gem init: We need a clear set of
requirements, then weigh the different options against each another and
decide upon one of the 3-4 proposed thus far.

> Anyway, I've posted a simplified v3 which only supports synchronous fetch,
> without the prefetch capability. See separate thread ...

Will take a look, thanks.
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b7ddf48..607fa2a 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -38,6 +38,9 @@  i915-y += i915_cmd_parser.o \
 	  intel_ringbuffer.o \
 	  intel_uncore.o
 
+# generic ancilliary microcontroller support
+i915-y += intel_uc_loader.o
+
 # autogenerated null render state
 i915-y += intel_renderstate_gen6.o \
 	  intel_renderstate_gen7.o \
diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c
new file mode 100644
index 0000000..26f0fbe
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_uc_loader.c
@@ -0,0 +1,312 @@ 
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Author:
+ *	Dave Gordon <david.s.gordon@intel.com>
+ */
+#include <linux/firmware.h>
+#include "i915_drv.h"
+#include "intel_uc_loader.h"
+
+/**
+ * DOC: Generic embedded microcontroller (uC) firmware loading support
+ *
+ * The functions in this file provide a generic way to load the firmware that
+ * may be required by an embedded microcontroller (uC).
+ *
+ * The function intel_uc_fw_init() should be called early, and will initiate
+ * an asynchronous request to fetch the firmware image (aka "binary blob").
+ * When the image has been fetched into memory, the kernel will call back to
+ * uc_fw_fetch_callback() whose function is simply to record the completion
+ * status, and stash the firmware blob for later.
+ *
+ * At some convenient point after GEM initialisation, the driver should call
+ * intel_uc_fw_check(); this will check whether the asynchronous thread has
+ * completed and wait for it if not, check whether the image was successfully
+ * fetched; and then allow the callback() function (if provided) to validate
+ * the image and/or save the data in a GEM object.
+ *
+ * Thereafter the uC-specific code can transfer the data in the GEM object
+ * to the uC's memory (in some uC-specific way, not handled here).
+ *
+ * During driver shutdown, or if driver load is aborted, intel_uc_fw_fini()
+ * should be called to release any remaining resources.
+ */
+
+
+/*
+ * Called once per uC, late in driver initialisation. GEM is now ready, and so
+ * we can now create a GEM object to hold the uC firmware. But first, we must
+ * synchronise with the firmware-fetching thread that was initiated during
+ * early driver load, in intel_uc_fw_init(), and see whether it successfully
+ * fetched the firmware blob.
+ */
+static void
+uc_fw_fetch_wait(struct intel_uc_fw *uc_fw,
+		 bool callback(struct intel_uc_fw *))
+{
+	struct drm_device *dev = uc_fw->uc_dev;
+	struct drm_i915_gem_object *obj;
+	const struct firmware *fw;
+
+	DRM_DEBUG_DRIVER("before waiting: %s fw fetch status %d, fw %p\n",
+		uc_fw->uc_name, uc_fw->uc_fw_fetch_status, uc_fw->uc_fw_blob);
+
+	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
+	WARN_ON(uc_fw->uc_fw_fetch_status != INTEL_UC_FIRMWARE_PENDING);
+
+	wait_for_completion(&uc_fw->uc_fw_fetched);
+
+	DRM_DEBUG_DRIVER("after waiting: %s fw fetch status %d, fw %p\n",
+		uc_fw->uc_name, uc_fw->uc_fw_fetch_status, uc_fw->uc_fw_blob);
+
+	fw = uc_fw->uc_fw_blob;
+	if (!fw) {
+		/* no firmware found; try again in case FS was not mounted */
+		DRM_DEBUG_DRIVER("retry fetching %s fw from <%s>\n",
+			uc_fw->uc_name, uc_fw->uc_fw_path);
+		if (request_firmware(&fw, uc_fw->uc_fw_path, &dev->pdev->dev))
+			goto fail;
+		if (!fw)
+			goto fail;
+		DRM_DEBUG_DRIVER("fetch %s fw from <%s> succeeded, fw %p\n",
+			uc_fw->uc_name, uc_fw->uc_fw_path, fw);
+		uc_fw->uc_fw_blob = fw;
+	}
+
+	/* Callback to the optional uC-specific function, if supplied */
+	if (callback && !callback(uc_fw))
+		goto fail;
+
+	/* Callback may have done the object allocation & write itself */
+	obj = uc_fw->uc_fw_obj;
+	if (!obj) {
+		size_t pages = round_up(fw->size, PAGE_SIZE);
+		obj = i915_gem_alloc_object(dev, pages);
+		if (!obj)
+			goto fail;
+
+		uc_fw->uc_fw_obj = obj;
+		uc_fw->uc_fw_size = fw->size;
+		if (i915_gem_object_write(obj, fw->data, fw->size))
+			goto fail;
+	}
+
+	DRM_DEBUG_DRIVER("%s fw fetch status SUCCESS\n", uc_fw->uc_name);
+	release_firmware(fw);
+	uc_fw->uc_fw_blob = NULL;
+	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_SUCCESS;
+	return;
+
+fail:
+	DRM_DEBUG_DRIVER("%s fw fetch status FAIL; fw %p, obj %p\n",
+		uc_fw->uc_name, fw, uc_fw->uc_fw_obj);
+	DRM_ERROR("Failed to fetch %s firmware from <%s>\n",
+		  uc_fw->uc_name, uc_fw->uc_fw_path);
+
+	obj = uc_fw->uc_fw_obj;
+	if (obj)
+		drm_gem_object_unreference(&obj->base);
+	uc_fw->uc_fw_obj = NULL;
+
+	release_firmware(fw);		/* OK even if fw is NULL */
+	uc_fw->uc_fw_blob = NULL;
+	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
+}
+
+/**
+ * intel_uc_fw_check() - check the status of the firmware fetching process
+ * @uc_fw:	intel_uc_fw structure
+ * @callback:	optional callback function to validate and/or save the image
+ *
+ * If the fetch is still PENDING, wait for completion first, then check and
+ * return the outcome. Subsequent calls will just return the same outcome
+ * based on the recorded fetch status, without triggering another fetch
+ * and without calling @callback().
+ *
+ * After this call, @uc_fw->uc_fw_fetch_status will show whether the firmware
+ * image was successfully fetched and transferred to a GEM object. If it is
+ * INTEL_UC_FIRMWARE_SUCCESS, @uc_fw->uc_fw_obj will be point to the GEM
+ * object, and the size of the image will be in @uc_fw->uc_fw_size.  For any
+ * other status value, these members are undefined.
+ *
+ * The @callback() parameter allows the uC-specific code to validate the
+ * image before it is saved, and also to override the default save mechanism
+ * if required. When it is called, @uc_fw->uc_fw_blob refers to the fetched
+ * firmware image, and @uc_fw->uc_fw_obj is NULL.
+ *
+ * If @callback() returns FALSE, the fetched image is considered invalid.
+ * The fetch status will be set to FAIL, and this function will return -EIO.
+ *
+ * If @callback() returns TRUE but doesn't set @uc_fw->uc_fw_obj, the image
+ * is considered good; it will be saved in a GEM object as described above.
+ * This is the default if no @callback() is supplied.
+ *
+ * If @callback() returns TRUE after setting @uc_fw->uc_fw_obj, this means
+ * that the image has already been saved by @callback() itself. This allows
+ * @callback() to customise the format of the data in the GEM object, for
+ * example if it needs to save only a portion of the loaded image.
+ *
+ * In all cases the firmware blob is released before this function returns.
+ *
+ * Return:	non-zero code on error
+ */
+int
+intel_uc_fw_check(struct intel_uc_fw *uc_fw,
+		  bool callback(struct intel_uc_fw *))
+{
+	WARN_ON(!mutex_is_locked(&uc_fw->uc_dev->struct_mutex));
+
+	if (uc_fw->uc_fw_fetch_status == INTEL_UC_FIRMWARE_PENDING) {
+		/* We only come here once */
+		uc_fw_fetch_wait(uc_fw, callback);
+		/* state must now be FAIL or SUCCESS */
+	}
+
+	DRM_DEBUG_DRIVER("%s fw fetch status %d\n",
+		uc_fw->uc_name, uc_fw->uc_fw_fetch_status);
+
+	switch (uc_fw->uc_fw_fetch_status) {
+	case INTEL_UC_FIRMWARE_FAIL:
+		/* something went wrong :( */
+		return -EIO;
+
+	case INTEL_UC_FIRMWARE_NONE:
+		/* no firmware, nothing to do (not an error) */
+		return 0;
+
+	case INTEL_UC_FIRMWARE_PENDING:
+	default:
+		/* "can't happen" */
+		WARN_ONCE(1, "%s fw <%s> invalid uc_fw_fetch_status %d!\n",
+			uc_fw->uc_name, uc_fw->uc_fw_path,
+			uc_fw->uc_fw_fetch_status);
+		return -ENXIO;
+
+	case INTEL_UC_FIRMWARE_SUCCESS:
+		return 0;
+	}
+}
+
+/*
+ * Callback from the kernel's asynchronous firmware-fetching subsystem.
+ * All we have to do here is stash the blob and signal completion.
+ * Error checking (e.g. no firmware found) is left to mainline code.
+ * We don't have (and don't want or need to acquire) the struct_mutex here.
+ */
+static void
+uc_fw_fetch_callback(const struct firmware *fw, void *context)
+{
+	struct intel_uc_fw *uc_fw = context;
+
+	WARN_ON(uc_fw->uc_fw_fetch_status != INTEL_UC_FIRMWARE_PENDING);
+	DRM_DEBUG_DRIVER("%s firmware fetch from <%s> status %d, fw %p\n",
+			uc_fw->uc_name, uc_fw->uc_fw_path,
+			uc_fw->uc_fw_fetch_status, fw);
+
+	uc_fw->uc_fw_blob = fw;
+	complete(&uc_fw->uc_fw_fetched);
+}
+
+/**
+ * intel_uc_fw_init() - initiate the fetching of firmware
+ * @dev:	drm device
+ * @uc_fw:	intel_uc_fw structure
+ * @name:	human-readable device name (e.g. "GuC") for messages
+ * @fw_path:	(trailing parts of) path to firmware (e.g. "i915/guc_fw.bin")
+ * 		@fw_path == NULL means "no firmware expected" (not an error),
+ * 		@fw_path == "" (empty string) means "firmware unknown" i.e.
+ * 		the uC requires firmware, but the driver doesn't know where
+ * 		to find the proper version. This will be logged as an error.
+ *
+ * This is called just once per uC, during driver loading. It is therefore
+ * automatically single-threaded and does not need to acquire any mutexes
+ * or spinlocks. OTOH, GEM is not yet fully initialised, so we can't do
+ * very much here.
+ *
+ * The main task here is to initiate the fetching of the uC firmware into
+ * memory, using the standard kernel firmware fetching support.  The actual
+ * fetching will then proceed asynchronously and in parallel with the rest
+ * of driver initialisation; later in the loading process we will synchronise
+ * with the firmware-fetching thread before transferring the firmware image
+ * firstly into a GEM object and then into the uC's memory.
+ */
+void
+intel_uc_fw_init(struct drm_device *dev, struct intel_uc_fw *uc_fw,
+		 const char *name, const char *fw_path)
+{
+	uc_fw->uc_dev = dev;
+	uc_fw->uc_name = name;
+	uc_fw->uc_fw_path = fw_path;
+	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_NONE;
+	uc_fw->uc_fw_load_status = INTEL_UC_FIRMWARE_NONE;
+	init_completion(&uc_fw->uc_fw_fetched);
+
+	if (fw_path == NULL)
+		return;
+
+	if (*fw_path == '\0') {
+		DRM_ERROR("No %s firmware known for this platform\n", name);
+		uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
+		return;
+	}
+
+	uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_PENDING;
+
+	if (request_firmware_nowait(THIS_MODULE, true, fw_path,
+				    &dev->pdev->dev,
+				    GFP_KERNEL, uc_fw,
+				    uc_fw_fetch_callback)) {
+		DRM_ERROR("Failed to request %s firmware from <%s>\n",
+			  name, fw_path);
+		uc_fw->uc_fw_fetch_status = INTEL_UC_FIRMWARE_FAIL;
+		return;
+	}
+
+	/* firmware fetch initiated, callback will signal completion */
+	DRM_DEBUG_DRIVER("initiated fetching %s firmware from <%s>\n",
+		name, fw_path);
+}
+
+/**
+ * intel_uc_fw_fini() - clean up all uC firmware-related data
+ * @uc_fw:	intel_uc_fw structure
+ */
+void
+intel_uc_fw_fini(struct intel_uc_fw *uc_fw)
+{
+	WARN_ON(!mutex_is_locked(&uc_fw->uc_dev->struct_mutex));
+
+	/*
+	 * Generally, the blob should have been released earlier, but
+	 * if the driver load were aborted after the fetch had been
+	 * initiated but not completed it might still be around
+	 */
+	if (uc_fw->uc_fw_fetch_status == INTEL_UC_FIRMWARE_PENDING)
+		wait_for_completion(&uc_fw->uc_fw_fetched);
+	release_firmware(uc_fw->uc_fw_blob);	/* OK even if NULL */
+	uc_fw->uc_fw_blob = NULL;
+
+	if (uc_fw->uc_fw_obj)
+		drm_gem_object_unreference(&uc_fw->uc_fw_obj->base);
+	uc_fw->uc_fw_obj = NULL;
+}
diff --git a/drivers/gpu/drm/i915/intel_uc_loader.h b/drivers/gpu/drm/i915/intel_uc_loader.h
new file mode 100644
index 0000000..22502ea
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_uc_loader.h
@@ -0,0 +1,82 @@ 
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Author:
+ *	Dave Gordon <david.s.gordon@intel.com>
+ */
+#ifndef _INTEL_UC_LOADER_H
+#define _INTEL_UC_LOADER_H
+
+/*
+ * Microcontroller (uC) firmware loading support
+ */
+
+/*
+ * These values are used to track the stages of getting the required firmware
+ * into an onboard microcontroller. The common code tracks the phases of
+ * fetching the firmware (aka "binary blob") from an external file into a GEM
+ * object in the 'uc_fw_fetch_status' field below; the uC-specific DMA code
+ * uses the 'uc_fw_load_status' field to track the transfer from GEM object
+ * to uC memory.
+ *
+ * For the first (fetch) stage, the interpretation of the values is:
+ * NONE - no firmware is being fetched e.g. because there is no uC
+ * PENDING - firmware fetch initiated; callback will complete 'uc_fw_fetched'
+ * SUCCESS - uC firmware fetched into a GEM object and ready for use
+ * FAIL - something went wrong; uC firmware is not available
+ *
+ * The second (load) stage is simpler as there is no asynchronous handoff:
+ * NONE - no firmware is being loaded e.g. because there is no uC
+ * PENDING - firmware DMA load in progress
+ * SUCCESS - uC firmware loaded into uC memory and ready for use
+ * FAIL - something went wrong; uC firmware is not available
+ */
+enum intel_uc_fw_status {
+	INTEL_UC_FIRMWARE_FAIL = -1,
+	INTEL_UC_FIRMWARE_NONE = 0,
+	INTEL_UC_FIRMWARE_PENDING,
+	INTEL_UC_FIRMWARE_SUCCESS
+};
+
+/*
+ * This structure encapsulates all the data needed during the process of
+ * fetching, caching, and loading the firmware image into the uC.
+ */
+struct intel_uc_fw {
+	struct drm_device *		uc_dev;
+	const char *			uc_name;
+	const char *			uc_fw_path;
+	const struct firmware *		uc_fw_blob;
+	struct completion		uc_fw_fetched;
+	size_t				uc_fw_size;
+	struct drm_i915_gem_object *	uc_fw_obj;
+	enum intel_uc_fw_status		uc_fw_fetch_status;
+	enum intel_uc_fw_status		uc_fw_load_status;
+};
+
+void intel_uc_fw_init(struct drm_device *dev, struct intel_uc_fw *uc_fw,
+		const char *uc_name, const char *fw_path);
+int intel_uc_fw_check(struct intel_uc_fw *uc_fw,
+		bool callback(struct intel_uc_fw *));
+void intel_uc_fw_fini(struct intel_uc_fw *uc_fw);
+
+#endif