[8/8] ASoC: AMD: add AMD ASoC ACP-I2S driver

Message ID 1444320760-21936-8-git-send-email-alexander.deucher@amd.com
State New

Commit Message

Alex Deucher Oct. 8, 2015, 4:12 p.m. UTC
From: Maruthi Srinivas Bayyavarapu <Maruthi.Bayyavarapu@amd.com>

ACP IP block consists of dedicated DMA and I2S blocks. The PCM driver
provides the platform DMA component to ALSA core.

Signed-off-by: Maruthi Bayyavarapu <maruthi.bayyavarapu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Murali Krishna Vemuri <murali-krishna.vemuri@amd.com>
---

v2: squash in Kconfig fix
v3: squash additional commits, convert to mfd, drop rt286 changes
v4: add major changes as below:
    1. remove i2s specific changes and add them to dwc i2s driver.
    2. add ACP DMA logic to PCM driver.

 sound/soc/Kconfig           |   1 +
 sound/soc/Makefile          |   1 +
 sound/soc/amd/Kconfig       |   4 +
 sound/soc/amd/Makefile      |   3 +
 sound/soc/amd/acp-pcm-dma.c | 518 +++++++++++++++++++++++++++++++
 sound/soc/amd/acp.c         | 736 ++++++++++++++++++++++++++++++++++++++++++++
 sound/soc/amd/acp.h         | 147 +++++++++
 7 files changed, 1410 insertions(+)
 create mode 100644 sound/soc/amd/Kconfig
 create mode 100644 sound/soc/amd/Makefile
 create mode 100644 sound/soc/amd/acp-pcm-dma.c
 create mode 100644 sound/soc/amd/acp.c
 create mode 100644 sound/soc/amd/acp.h

Comments

Mark Brown Oct. 22, 2015, 4:14 p.m. UTC | #1
On Thu, Oct 08, 2015 at 12:12:40PM -0400, Alex Deucher wrote:

> ACP IP block consists of dedicated DMA and I2S blocks. The PCM driver
> provides the platform DMA component to ALSA core.

Overall my main comment on a lot of this code is that it feels like we
have created a lot of infrastructure that parallels standard Linux
subsystems and interfaces without something that clearly shows why we're
doing that.  There may be good reasons but they've not been articulated
and it's making the code a lot more complex to follow and review.  We
end up with multiple layers of abstraction and indirection that aren't
explained.

This patch is also rather large and appears to contain multiple
components which could be split, there's at least the DMA driver and
this abstraction layer that the DMA driver builds on.

> +/* ACP DMA irq handler routine for playback, capture usecases */
> +int dma_irq_handler(struct device *dev)
> +{
> +	u16 dscr_idx;
> +	u32 intr_flag;

This says it's an interrupt handler but it's using some custom,
non-genirq interface?

> +		/* Let ACP know the Allocated memory */
> +		num_of_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +
> +		/* Fill the page table entries in ACP SRAM */
> +		rtd->pg = pg;
> +		rtd->size = size;
> +		rtd->num_of_pages = num_of_pages;
> +		rtd->direction = substream->stream;

We never reference num_of_pages other than to assign it into the page
table entry?

> +static int acp_dma_close(struct snd_pcm_substream *substream)
> +{
> +	struct snd_pcm_runtime *runtime = substream->runtime;
> +	struct audio_substream_data *rtd = runtime->private_data;
> +	struct snd_soc_pcm_runtime *prtd = substream->private_data;
> +
> +	kfree(rtd);
> +
> +	pm_runtime_mark_last_busy(prtd->platform->dev);

Why the _mark_last_busy() here?

> +	/* The following members gets populated in device 'open'
> +	 * function. Till then interrupts are disabled in 'acp_hw_init'
> +	 * and device doesn't generate any interrupts.
> +	 */
> +
> +	audio_drv_data->play_stream = NULL;
> +	audio_drv_data->capture_stream = NULL;
> +
> +	audio_drv_data->iprv->dev = &pdev->dev;
> +	audio_drv_data->iprv->acp_mmio = audio_drv_data->acp_mmio;
> +	audio_drv_data->iprv->enable_intr = acp_enable_external_interrupts;
> +	audio_drv_data->iprv->irq_handler = dma_irq_handler;

I do note that we never seem to reset any of these in teardown paths and
I am slightly worried about races with interrupt handling in teardown.

> +static int acp_pcm_suspend(struct device *dev)
> +{
> +	bool pm_rts;
> +	struct audio_drv_data *adata = dev_get_drvdata(dev);
> +
> +	pm_rts = pm_runtime_status_suspended(dev);
> +	if (pm_rts == false)
> +		acp_suspend(adata->acp_mmio);
> +
> +	return 0;
> +}

This appears to merely call into the parent/core device (at least it
looks like this is the intention, there's a bunch of infrastructure for
the core device which appears to replicate standard infrastructure).
Isn't whatever this eventually ends up doing handled by the parent
device in the normal PM callbacks?

This parallel infrastructure seems like it needs some motivation,
especially given that when I look at the implementation of functions
like amd_acp_suspend() and amd_acp_resume() in the preceding patch they
are empty and therefore do nothing (they're also not exported so I
expect we get build errors if this is a module and the core isn't).
The easiest thing is probably to remove the code until there is an
implementation and then review at that time.

> +static int acp_pcm_resume(struct device *dev)
> +{
> +	bool pm_rts;
> +	struct snd_pcm_substream *stream;
> +	struct snd_pcm_runtime *rtd;
> +	struct audio_substream_data *sdata;
> +	struct audio_drv_data *adata = dev_get_drvdata(dev);
> +
> +	pm_rts = pm_runtime_status_suspended(dev);
> +	if (pm_rts == true) {
> +		/* Resumed from system wide suspend and there is
> +		 * no pending audio activity to resume. */
> +		pm_runtime_disable(dev);
> +		pm_runtime_set_active(dev);
> +		pm_runtime_enable(dev);

The above looks very strange - why are we bouncing runtime PM like this?

> +/* Initialize the dma descriptors location in SRAM and page size */
> +static void acp_dma_descr_init(void __iomem *acp_mmio)
> +{
> +	u32 sram_pte_offset = 0;
> +
> +	/* SRAM starts at 0x04000000. From that offset one page (4KB) left for
> +	 * filling DMA descriptors.sram_pte_offset = 0x04001000 , used for

This is a device relative address rather than an absolute address?  A
lot of these numbers seem kind of large...

> +u16 get_dscr_idx(void __iomem *acp_mmio, int direction)
> +{
> +	u16 dscr_idx;
> +
> +	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
> +		dscr_idx = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_13);
> +		dscr_idx = (dscr_idx == PLAYBACK_START_DMA_DESCR_CH13) ?
> +				PLAYBACK_START_DMA_DESCR_CH12 :
> +				PLAYBACK_END_DMA_DESCR_CH12;

Please write normal if statements rather than using the ternary
operator.
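
A minimal userspace sketch of the requested form, for illustration only.
The descriptor values below are hypothetical stand-ins (the real
constants come from acp.h), and cur_dscr stands in for the value read
from mmACP_DMA_CUR_DSCR_13:

```c
#include <stdint.h>

/* Hypothetical descriptor indices; the real values are defined in acp.h. */
enum {
	PLAYBACK_START_DMA_DESCR_CH12 = 0,
	PLAYBACK_END_DMA_DESCR_CH12 = 1,
	PLAYBACK_START_DMA_DESCR_CH13 = 2,
	PLAYBACK_END_DMA_DESCR_CH13 = 3,
};

/* Map the current CH13 descriptor to the CH12 descriptor to program. */
static uint16_t playback_dscr_idx(uint16_t cur_dscr)
{
	if (cur_dscr == PLAYBACK_START_DMA_DESCR_CH13)
		return PLAYBACK_START_DMA_DESCR_CH12;

	return PLAYBACK_END_DMA_DESCR_CH12;
}
```

The behaviour is the same as the ternary version; only the control
flow is spelled out.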

> +/* Check whether ACP DMA interrupt (IOC) is generated or not */
> +u32 acp_get_intr_flag(void __iomem *acp_mmio)
> +{
> +	u32 ext_intr_status;
> +	u32 intr_gen;
> +
> +	ext_intr_status = acp_reg_read(acp_mmio, mmACP_EXTERNAL_INTR_STAT);
> +	intr_gen = (((ext_intr_status &
> +		      ACP_EXTERNAL_INTR_STAT__DMAIOCStat_MASK) >>
> +		     ACP_EXTERNAL_INTR_STAT__DMAIOCStat__SHIFT));
> +
> +	return intr_gen;
> +}

Looking at a lot of the interrupt code I can't help but think there's a
genirq interrupt controller lurking in here somewhere.

> +	/*Invalidating the DAGB cache */
> +	acp_reg_write(ENABLE, acp_mmio, mmACP_DAGB_ATU_CTRL);

/* spaces around comments please */

> +	if ((ch_num == ACP_TO_I2S_DMA_CH_NUM) ||
> +	    (ch_num == ACP_TO_SYSRAM_CH_NUM) ||
> +	    (ch_num == I2S_TO_ACP_DMA_CH_NUM))
> +		dma_ctrl |= ACP_DMA_CNTL_0__DMAChIOCEn_MASK;
> +	else
> +		dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChIOCEn_MASK;

switch statement please.
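
As a rough illustration of the switch form, here is a self-contained
sketch. The channel numbers and the mask value are assumptions made up
for this example (DMA_CH_IOC_EN_MASK stands in for
ACP_DMA_CNTL_0__DMAChIOCEn_MASK), not the values from the patch:

```c
#include <stdint.h>

/* Hypothetical channel numbers; the real ones are defined in acp.h. */
enum {
	SYSRAM_TO_ACP_CH_NUM = 12,
	ACP_TO_I2S_DMA_CH_NUM = 13,
	I2S_TO_ACP_DMA_CH_NUM = 14,
	ACP_TO_SYSRAM_CH_NUM = 15,
};

/* Stand-in for ACP_DMA_CNTL_0__DMAChIOCEn_MASK (value is an assumption). */
#define DMA_CH_IOC_EN_MASK 0x2u

/* Enable the IOC interrupt only on channels that should raise it. */
static uint32_t dma_ctrl_set_ioc(uint32_t dma_ctrl, unsigned int ch_num)
{
	switch (ch_num) {
	case ACP_TO_I2S_DMA_CH_NUM:
	case ACP_TO_SYSRAM_CH_NUM:
	case I2S_TO_ACP_DMA_CH_NUM:
		dma_ctrl |= DMA_CH_IOC_EN_MASK;
		break;
	default:
		dma_ctrl &= ~DMA_CH_IOC_EN_MASK;
		break;
	}

	return dma_ctrl;
}
```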

> +	u32 delay_time = ACP_DMA_RESET_TIME;

> +	/* check the channel status bit for some time and return the status */
> +	while (0 < delay_time) {
> +		dma_ch_sts = acp_reg_read(acp_mmio, mmACP_DMA_CH_STS);
> +		if (!(dma_ch_sts & BIT(ch_num))) {
> +			/* clear the reset flag after successfully stopping
> +			   the dma transfer and break from the loop */
> +			dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChRst_MASK;
> +
> +			acp_reg_write(dma_ctrl, acp_mmio, mmACP_DMA_CNTL_0
> +								+ ch_num);
> +			break;
> +		}
> +		delay_time--;
> +	}
> +}

This isn't really a time, it's a number of spins (the amount of time
involved presumably depending on clock speed).  If this were a time I'd
expect to see a delay or sleep in here.

We're also falling off the end of the loop silently if the hardware
fails to respond, if it's worth waiting for the hardware to do its
thing I'd expect it's also worth displaying an error if that happens.
This is a common pattern in much of the rest of the driver.
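
One possible shape for such a loop, sketched in userspace C so it can
be compiled standalone. The names wait_for_bit_clear(), reg_read_fn and
fake_status() are hypothetical and exist only for this example; in the
driver the read would be acp_reg_read() and the message would go
through dev_err():

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for a register read; a driver would read MMIO here. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until the bits in mask clear, at most max_polls times.
 * Returns 0 on success, or -ETIMEDOUT after logging an error.
 */
static int wait_for_bit_clear(reg_read_fn read_sts, void *ctx,
			      uint32_t mask, unsigned int max_polls)
{
	while (max_polls--) {
		if (!(read_sts(ctx) & mask))
			return 0;
		/* a kernel driver would udelay() or cpu_relax() here */
	}

	fprintf(stderr, "status bits 0x%x did not clear\n",
		(unsigned int)mask);
	return -ETIMEDOUT;
}

/* Fake hardware: reports busy for *ctx more reads, then idle. */
static uint32_t fake_status(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return 0;

	(*remaining)--;
	return 1u;
}
```

Returning an error code lets the caller decide how to react instead of
carrying on as if the reset had completed. In the kernel, helpers such
as readl_poll_timeout() from <linux/iopoll.h> cover the same pattern
with a real time bound.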

> +/* power off a tile/block within ACP */
> +static void acp_suspend_tile(void __iomem *acp_mmio, int tile)
> +{
> +	u32 val = 0;
> +	u32 timeout = 0;
> +
> +	if ((tile  < ACP_TILE_P1) || (tile > ACP_TILE_DSP2))
> +		return;
> +
> +	val = acp_reg_read(acp_mmio, mmACP_PGFSM_READ_REG_0 + tile);
> +	val &= ACP_TILE_ON_MASK;

This is definitely looking like a SoC that could benefit from the
standard kernel power management infrastructure and/or DAPM.  There's a
lot of code here that looks like it's following very common SoC design
patterns and could benefit from using infrastructure more.

> +static void acp_resume_tile(void __iomem *acp_mmio, int tile)
> +{
> +	u32 val = 0;
> +	u32 timeout = 0;
> +
> +	if ((tile  < ACP_TILE_P1) || (tile > ACP_TILE_DSP2))
> +		return;

Not worth printing an error if the user passed in something invalid?

> +/* Shutdown unused SRAM memory banks in ACP IP */
> +static void acp_turnoff_sram_banks(void __iomem *acp_mmio)
> +{
> +	/* Bank 0 : used for DMA descriptors
> +	 * Bank 1 to 4 : used for playback
> +	 * Bank 5 to 8 : used for capture
> +	 * Each bank is 8kB and max size allocated for playback/ capture is
> +	 * 16kB(max period size) * 2(max periods) reserved for playback/capture
> +	 * in ALSA driver
> +	 * Turn off all SRAM banks except above banks during playback/capture
> +	 */
> +	u32 val, bank;

I didn't see any runtime management of the other SRAM bank power, seems
like that'd be a good idea?

> +	/* initiailizing Garlic Control DAGB register */
> +	acp_reg_write(ONION_CNTL_DEFAULT, acp_mmio, mmACP_AXI2DAGB_ONION_CNTL);
> +
> +	/* initiailizing Onion Control DAGB registers */
> +	acp_reg_write(GARLIC_CNTL_DEFAULT, acp_mmio,
> +			mmACP_AXI2DAGB_GARLIC_CNTL);

The comments don't match the code...

> +/* Update DMA postion in audio ring buffer at period level granularity.
> + * This will be used by ALSA PCM driver
> + */
> +u32 acp_update_dma_pointer(void __iomem *acp_mmio, int direction,
> +				  u32 period_size)
> +{
> +	u32 pos;
> +	u16 dscr;
> +	u32 mul;
> +	u32 dma_config;
> +
> +	pos = 0;
> +
> +	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
> +		dscr = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_13);
> +
> +		mul = (dscr == PLAYBACK_START_DMA_DESCR_CH13) ? 0 : 1;
> +		pos =  (mul * period_size);

Don't limit the accuracy to period level if the hardware can do better,
report the current position as accurately as possible please.  This is
also feeling like we've got an unneeded abstraction here - why was this
not just directly the pointer operation?

> +/* Wait for initial buffering to complete in HOST to SRAM DMA channel
> + * for plaback usecase
> + */
> +void prebuffer_audio(void __iomem *acp_mmio)
> +{
> +	u32 dma_ch_sts;
> +	u32 channel_mask = BIT(SYSRAM_TO_ACP_CH_NUM);
> +
> +	do {
> +		/* Read the channel status to poll dma transfer completion
> +		 * (System RAM to SRAM)
> +		 * In this case, it will be runtime->start_threshold
> +		 * (2 ALSA periods) of transfer. Rendering starts after this
> +		 * threshold is met.
> +		 */
> +		dma_ch_sts = acp_reg_read(acp_mmio, mmACP_DMA_CH_STS);
> +		udelay(20);
> +	} while (dma_ch_sts & channel_mask);

This will hang hard if the hardware fails to respond for some reason,
please have a timeout.  A cpu_relax() would also be friendly.

> +#define DISABLE					0
> +#define ENABLE					1

Please don't do this :(

> +#define STATUS_SUCCESS 0
> +#define STATUS_UNSUCCESSFUL -1

Please use normal Linux error codes.

maruthi srinivas Oct. 23, 2015, 6:50 p.m. UTC | #2
On Thu, Oct 22, 2015 at 9:44 PM, Mark Brown <broonie@kernel.org> wrote:
> On Thu, Oct 08, 2015 at 12:12:40PM -0400, Alex Deucher wrote:
>
>> ACP IP block consists of dedicated DMA and I2S blocks. The PCM driver
>> provides the platform DMA component to ALSA core.
>
> Overall my main comment on a lot of this code is that it feels like we
> have created a lot of infrastructure that parallels standard Linux
> subsystems and interfaces without something that clearly shows why we're
> doing that.  There may be good reasons but they've not been articulated
> and it's making the code a lot more complex to follow and review.  We
> end up with multiple layers of abstraction and indirection that aren't
> explained.
>
> This patch is also rather large and appears to contain multiple
> components which could be split, there's at least the DMA driver and
> this abstraction layer that the DMA driver builds on.
>
>> +/* ACP DMA irq handler routine for playback, capture usecases */
>> +int dma_irq_handler(struct device *dev)
>> +{
>> +     u16 dscr_idx;
>> +     u32 intr_flag;
>
> This says it's an interrupt handler but it's using some custom,
> non-genirq interface?
>
IRQ handling is based on interfaces provided by the parent device
(which is part of another subsystem). I will coordinate with others on this.

Do you mean using virtual IRQ assignment for MFD devices
(ACP is an MFD device) and registering an IRQ handler for it?

>> +             /* Let ACP know the Allocated memory */
>> +             num_of_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>> +
>> +             /* Fill the page table entries in ACP SRAM */
>> +             rtd->pg = pg;
>> +             rtd->size = size;
>> +             rtd->num_of_pages = num_of_pages;
>> +             rtd->direction = substream->stream;
>
> We never reference num_of_pages other than to assign it into the page
> table entry?
>
Sorry, I didn't understand your comment.
I used 'num_of_pages' to configure ACP audio device for accessing system
memory. The implementation is in 'acp_pte_config' function in the patch.

>> +static int acp_dma_close(struct snd_pcm_substream *substream)
>> +{
>> +     struct snd_pcm_runtime *runtime = substream->runtime;
>> +     struct audio_substream_data *rtd = runtime->private_data;
>> +     struct snd_soc_pcm_runtime *prtd = substream->private_data;
>> +
>> +     kfree(rtd);
>> +
>> +     pm_runtime_mark_last_busy(prtd->platform->dev);
>
> Why the _mark_last_busy() here?

I want to power off the ACP audio IP when the audio use case has not been
active for some time (runtime PM). I felt 'close' is the correct place to
mark this.
>
>> +     /* The following members gets populated in device 'open'
>> +      * function. Till then interrupts are disabled in 'acp_hw_init'
>> +      * and device doesn't generate any interrupts.
>> +      */
>> +
>> +     audio_drv_data->play_stream = NULL;
>> +     audio_drv_data->capture_stream = NULL;
>> +
>> +     audio_drv_data->iprv->dev = &pdev->dev;
>> +     audio_drv_data->iprv->acp_mmio = audio_drv_data->acp_mmio;
>> +     audio_drv_data->iprv->enable_intr = acp_enable_external_interrupts;
>> +     audio_drv_data->iprv->irq_handler = dma_irq_handler;
>
> I do note that we never seem to reset any of these in teardown paths and
> I am slightly worried about races with interrupt handling in teardown.
>
I will recheck this.

>> +static int acp_pcm_suspend(struct device *dev)
>> +{
>> +     bool pm_rts;
>> +     struct audio_drv_data *adata = dev_get_drvdata(dev);
>> +
>> +     pm_rts = pm_runtime_status_suspended(dev);
>> +     if (pm_rts == false)
>> +             acp_suspend(adata->acp_mmio);
>> +
>> +     return 0;
>> +}
>
> This appears to merely call into the parent/core device (at least it
> looks like this is the intention, there's a bunch of infrastructure for
> the core device which appears to replicate standard infrastructure).
> Isn't whatever this eventually ends up doing handled by the parent
> device in the normal PM callbacks?
>

ACP device (child) can power off itself, when it receives suspend
request. So, the intention is to call 'acp_suspend' (defined in same patch)
from the 'suspend' callback of ACP device.

> This parallel infrastructure seems like it needs some motivation,
> especially given that when I look at the implementation of functions
> like amd_acp_suspend() and amd_acp_resume() in the preceding patch they
> are empty and therefore do nothing (they're also not exported so I
> expect we get build errors if this is a module and the core isn't).
> The easiest thing is probably to remove the code until there is an
> implementation and then review at that time.
>

There were two different functions with the same name in the two drivers
interacting here. That might have introduced some confusion.
Sorry, I will modify that.

Also, amd_acp_*() can be removed later, as they are not expected to add any
functionality in future. The ACP device can be suspended/resumed using
acp_*_tile().

>> +static int acp_pcm_resume(struct device *dev)
>> +{
>> +     bool pm_rts;
>> +     struct snd_pcm_substream *stream;
>> +     struct snd_pcm_runtime *rtd;
>> +     struct audio_substream_data *sdata;
>> +     struct audio_drv_data *adata = dev_get_drvdata(dev);
>> +
>> +     pm_rts = pm_runtime_status_suspended(dev);
>> +     if (pm_rts == true) {
>> +             /* Resumed from system wide suspend and there is
>> +              * no pending audio activity to resume. */
>> +             pm_runtime_disable(dev);
>> +             pm_runtime_set_active(dev);
>> +             pm_runtime_enable(dev);
>
> The above looks very strange - why are we bouncing runtime PM like this?

Sorry, I didn't understand your comment. I felt the steps mentioned in
the kernel documentation:
http://lxr.free-electrons.com/source/Documentation/power/runtime_pm.txt#L634
are applicable in this scenario. I may be wrong, but felt they apply here.

>
>> +/* Initialize the dma descriptors location in SRAM and page size */
>> +static void acp_dma_descr_init(void __iomem *acp_mmio)
>> +{
>> +     u32 sram_pte_offset = 0;
>> +
>> +     /* SRAM starts at 0x04000000. From that offset one page (4KB) left for
>> +      * filling DMA descriptors.sram_pte_offset = 0x04001000 , used for
>
> This is a device relative address rather than an absolute address?  A
> lot of these numbers seem kind of large...

That is SRAM block's offset address. Sorry. I didn't understand the
expected change, here.

>
>> +u16 get_dscr_idx(void __iomem *acp_mmio, int direction)
>> +{
>> +     u16 dscr_idx;
>> +
>> +     if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
>> +             dscr_idx = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_13);
>> +             dscr_idx = (dscr_idx == PLAYBACK_START_DMA_DESCR_CH13) ?
>> +                             PLAYBACK_START_DMA_DESCR_CH12 :
>> +                             PLAYBACK_END_DMA_DESCR_CH12;
>
> Please write normal if statements rather than using the ternary
> operator.
>
Ok.

>> +/* Check whether ACP DMA interrupt (IOC) is generated or not */
>> +u32 acp_get_intr_flag(void __iomem *acp_mmio)
>> +{
>> +     u32 ext_intr_status;
>> +     u32 intr_gen;
>> +
>> +     ext_intr_status = acp_reg_read(acp_mmio, mmACP_EXTERNAL_INTR_STAT);
>> +     intr_gen = (((ext_intr_status &
>> +                   ACP_EXTERNAL_INTR_STAT__DMAIOCStat_MASK) >>
>> +                  ACP_EXTERNAL_INTR_STAT__DMAIOCStat__SHIFT));
>> +
>> +     return intr_gen;
>> +}
>
> Looking at a lot of the interrupt code I can't help but think there's a
> genirq interrupt controller lurking in here somewhere.
>
I will coordinate with other subsystem driver author, that this driver
is dependent on.

>> +     /*Invalidating the DAGB cache */
>> +     acp_reg_write(ENABLE, acp_mmio, mmACP_DAGB_ATU_CTRL);
>
> /* spaces around comments please */
>
>> +     if ((ch_num == ACP_TO_I2S_DMA_CH_NUM) ||
>> +         (ch_num == ACP_TO_SYSRAM_CH_NUM) ||
>> +         (ch_num == I2S_TO_ACP_DMA_CH_NUM))
>> +             dma_ctrl |= ACP_DMA_CNTL_0__DMAChIOCEn_MASK;
>> +     else
>> +             dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChIOCEn_MASK;
>
> switch statement please.

Ok.
>
>> +     u32 delay_time = ACP_DMA_RESET_TIME;
>
>> +     /* check the channel status bit for some time and return the status */
>> +     while (0 < delay_time) {
>> +             dma_ch_sts = acp_reg_read(acp_mmio, mmACP_DMA_CH_STS);
>> +             if (!(dma_ch_sts & BIT(ch_num))) {
>> +                     /* clear the reset flag after successfully stopping
>> +                        the dma transfer and break from the loop */
>> +                     dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChRst_MASK;
>> +
>> +                     acp_reg_write(dma_ctrl, acp_mmio, mmACP_DMA_CNTL_0
>> +                                                             + ch_num);
>> +                     break;
>> +             }
>> +             delay_time--;
>> +     }
>> +}
>
> This isn't really a time, it's a number of spins (the amount of time
> involved presumably depending on clock speed).  If this were a time I'd
> expect to see a delay or sleep in here.
>
> We're also falling off the end of the loop silently if the hardware
> fails to respond, if it's worth waiting for the hardware to do its
> thing I'd expect it's also worth displaying an error if that happens.
> This is a common pattern in much of the rest of the driver.
>
Ok, will modify.

>> +/* power off a tile/block within ACP */
>> +static void acp_suspend_tile(void __iomem *acp_mmio, int tile)
>> +{
>> +     u32 val = 0;
>> +     u32 timeout = 0;
>> +
>> +     if ((tile  < ACP_TILE_P1) || (tile > ACP_TILE_DSP2))
>> +             return;
>> +
>> +     val = acp_reg_read(acp_mmio, mmACP_PGFSM_READ_REG_0 + tile);
>> +     val &= ACP_TILE_ON_MASK;
>
> This is definitely looking like a SoC that could benefit from the
> standard kernel power management infrastructure and/or DAPM.  There's a
> lot of code here that looks like it's following very common SoC design
> patterns and could benefit from using infrastructure more.
>
Sorry, I didn't understand. Could you give some more pointers on this?
The device can be powered off with this helper function, which can
be called from the device 'suspend' callback.

>> +static void acp_resume_tile(void __iomem *acp_mmio, int tile)
>> +{
>> +     u32 val = 0;
>> +     u32 timeout = 0;
>> +
>> +     if ((tile  < ACP_TILE_P1) || (tile > ACP_TILE_DSP2))
>> +             return;
>
> Not worth printing an error if the user passed in something invalid?

Ok.

>
>> +/* Shutdown unused SRAM memory banks in ACP IP */
>> +static void acp_turnoff_sram_banks(void __iomem *acp_mmio)
>> +{
>> +     /* Bank 0 : used for DMA descriptors
>> +      * Bank 1 to 4 : used for playback
>> +      * Bank 5 to 8 : used for capture
>> +      * Each bank is 8kB and max size allocated for playback/ capture is
>> +      * 16kB(max period size) * 2(max periods) reserved for playback/capture
>> +      * in ALSA driver
>> +      * Turn off all SRAM banks except above banks during playback/capture
>> +      */
>> +     u32 val, bank;
>
> I didn't see any runtime management of the other SRAM bank power, seems
> like that'd be a good idea?

SRAM banks are part of ACP IP. With ACP's runtime PM handling, all blocks
within ACP IP can be powered-off and on.

>
>> +     /* initiailizing Garlic Control DAGB register */
>> +     acp_reg_write(ONION_CNTL_DEFAULT, acp_mmio, mmACP_AXI2DAGB_ONION_CNTL);
>> +
>> +     /* initiailizing Onion Control DAGB registers */
>> +     acp_reg_write(GARLIC_CNTL_DEFAULT, acp_mmio,
>> +                     mmACP_AXI2DAGB_GARLIC_CNTL);
>
> The comments don't match the code...

Oops, I will correct it.

>
>> +/* Update DMA postion in audio ring buffer at period level granularity.
>> + * This will be used by ALSA PCM driver
>> + */
>> +u32 acp_update_dma_pointer(void __iomem *acp_mmio, int direction,
>> +                               u32 period_size)
>> +{
>> +     u32 pos;
>> +     u16 dscr;
>> +     u32 mul;
>> +     u32 dma_config;
>> +
>> +     pos = 0;
>> +
>> +     if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
>> +             dscr = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_13);
>> +
>> +             mul = (dscr == PLAYBACK_START_DMA_DESCR_CH13) ? 0 : 1;
>> +             pos =  (mul * period_size);
>
> Don't limit the accuracy to period level if the hardware can do better,
> report the current position as accurately as possible please.  This is
> also feeling like we've got an unneeded abstraction here - why was this
> not just directly the pointer operation?
>
The current version of hardware has the limitation of accuracy reporting.
I will remove abstraction, if suggested.

>> +/* Wait for initial buffering to complete in HOST to SRAM DMA channel
>> + * for plaback usecase
>> + */
>> +void prebuffer_audio(void __iomem *acp_mmio)
>> +{
>> +     u32 dma_ch_sts;
>> +     u32 channel_mask = BIT(SYSRAM_TO_ACP_CH_NUM);
>> +
>> +     do {
>> +             /* Read the channel status to poll dma transfer completion
>> +              * (System RAM to SRAM)
>> +              * In this case, it will be runtime->start_threshold
>> +              * (2 ALSA periods) of transfer. Rendering starts after this
>> +              * threshold is met.
>> +              */
>> +             dma_ch_sts = acp_reg_read(acp_mmio, mmACP_DMA_CH_STS);
>> +             udelay(20);
>> +     } while (dma_ch_sts & channel_mask);
>
> This will hang hard if the hardware fails to respond for some reason,
> please have a timeout.  A cpu_relax() would also be friendly.
>
I will modify this.

>> +#define DISABLE                                      0
>> +#define ENABLE                                       1
>
> Please don't do this :(

Ok.
>
>> +#define STATUS_SUCCESS 0
>> +#define STATUS_UNSUCCESSFUL -1
>
> Please use normal Linux error codes.
>

Ok.

Mark Brown Oct. 23, 2015, 7:31 p.m. UTC | #3
On Sat, Oct 24, 2015 at 12:20:09AM +0530, maruthi srinivas wrote:
> On Thu, Oct 22, 2015 at 9:44 PM, Mark Brown <broonie@kernel.org> wrote:
> > On Thu, Oct 08, 2015 at 12:12:40PM -0400, Alex Deucher wrote:

> >> +/* ACP DMA irq handler routine for playback, capture usecases */
> >> +int dma_irq_handler(struct device *dev)
> >> +{
> >> +     u16 dscr_idx;
> >> +     u32 intr_flag;
> >
> > This says it's an interrupt handler but it's using some custom,
> > non-genirq interface?

> IRQ handling is based on interfaces provided by the parent device
> (which is part of another subsystem). I will coordinate with others on this.

> Do you mean using virtual IRQ assignment for MFD devices
> (ACP is an MFD device) and registering an IRQ handler for it?

Well, I'd expect that if we're exporting interrupts around the system
we'd be doing so using genirq rather than open coding something.

> >> +             /* Let ACP know the Allocated memory */
> >> +             num_of_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> >> +
> >> +             /* Fill the page table entries in ACP SRAM */
> >> +             rtd->pg = pg;
> >> +             rtd->size = size;
> >> +             rtd->num_of_pages = num_of_pages;
> >> +             rtd->direction = substream->stream;

> > We never reference num_of_pages other than to assign it into the page
> > table entry?

> Sorry, I didn't understand your comment.
> I used 'num_of_pages' to configure ACP audio device for accessing system
> memory. The implementation is in 'acp_pte_config' function in the patch.

In the above code we have two blocks of code, one doing an assignment to
a local variable and the other initialising the struct but the local
variable in the first block is only ever referenced in the second block.
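
One way to read this is that the intermediate local can simply go away.
A small userspace sketch of that reading follows; the DEMO_* macros
assume 4 KiB pages and mirror the kernel's PAGE_ALIGN()/PAGE_SHIFT, and
struct substream_data is a hypothetical, trimmed-down stand-in for
audio_substream_data:

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE ((size_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_ALIGN(x) \
	(((x) + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1))

/* Hypothetical subset of the per-substream data. */
struct substream_data {
	size_t size;
	size_t num_of_pages;
};

static void fill_substream(struct substream_data *rtd, size_t size)
{
	rtd->size = size;
	/* Round up to whole pages and store directly, no local needed. */
	rtd->num_of_pages = DEMO_PAGE_ALIGN(size) >> DEMO_PAGE_SHIFT;
}
```

For example, a 4097 byte buffer rounds up to 8192 bytes and therefore
occupies two pages.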

> >> +static int acp_dma_close(struct snd_pcm_substream *substream)
> >> +{
> >> +     struct snd_pcm_runtime *runtime = substream->runtime;
> >> +     struct audio_substream_data *rtd = runtime->private_data;
> >> +     struct snd_soc_pcm_runtime *prtd = substream->private_data;

> >> +     kfree(rtd);

> >> +     pm_runtime_mark_last_busy(prtd->platform->dev);

> > Why the _mark_last_busy() here?

> I want to power off the ACP audio IP when the audio use case has not been
> active for some time (runtime PM). I felt 'close' is the correct place to
> mark this.

That's not what _mark_last_busy() does...  the core already takes
runtime PM references for you.

> >> +static int acp_pcm_suspend(struct device *dev)
> >> +{
> >> +     bool pm_rts;
> >> +     struct audio_drv_data *adata = dev_get_drvdata(dev);
> >> +
> >> +     pm_rts = pm_runtime_status_suspended(dev);
> >> +     if (pm_rts == false)
> >> +             acp_suspend(adata->acp_mmio);
> >> +
> >> +     return 0;
> >> +}

> > This appears to merely call into the parent/core device (at least it
> > looks like this is the intention, there's a bunch of infrastructure for
> > the core device which appears to replicate standard infrastructure).
> > Isn't whatever this eventually ends up doing handled by the parent
> > device in the normal PM callbacks?

> ACP device (child) can power off itself, when it receives suspend
> request. So, the intention is to call 'acp_suspend' (defined in same patch)
> from the 'suspend' callback of ACP device.

This doesn't address why you're open coding this rather than using
standard infrastructure.

> >> +     pm_rts = pm_runtime_status_suspended(dev);
> >> +     if (pm_rts == true) {
> >> +             /* Resumed from system wide suspend and there is
> >> +              * no pending audio activity to resume. */
> >> +             pm_runtime_disable(dev);
> >> +             pm_runtime_set_active(dev);
> >> +             pm_runtime_enable(dev);
> >
> > The above looks very strange - why are we bouncing runtime PM like this?
> 
> Sorry, I didn't understand your comment. I felt the steps mentioned in
> the kernel documentation:
> http://lxr.free-electrons.com/source/Documentation/power/runtime_pm.txt#L634
> are applicable in this scenario. I may be wrong, but felt they apply here.

Please document this clearly - your comment doesn't appear to relate to
the case where system resume powers things on at all.

> >> +/* Initialize the dma descriptors location in SRAM and page size */
> >> +static void acp_dma_descr_init(void __iomem *acp_mmio)
> >> +{
> >> +     u32 sram_pte_offset = 0;
> >> +
> >> +     /* SRAM starts at 0x04000000. From that offset one page (4KB) left for
> >> +      * filling DMA descriptors.sram_pte_offset = 0x04001000 , used for

> > This is a device relative address rather than an absolute address?  A
> > lot of these numbers seem kind of large...

> That is SRAM block's offset address. Sorry. I didn't understand the
> expected change, here.

Offset with regard to what?  I'm asking if these addresses are within
the IP or global.

> >> +/* power off a tile/block within ACP */
> >> +static void acp_suspend_tile(void __iomem *acp_mmio, int tile)
> >> +{
> >> +     u32 val = 0;
> >> +     u32 timeout = 0;
> >> +
> >> +     if ((tile  < ACP_TILE_P1) || (tile > ACP_TILE_DSP2))
> >> +             return;
> >> +
> >> +     val = acp_reg_read(acp_mmio, mmACP_PGFSM_READ_REG_0 + tile);
> >> +     val &= ACP_TILE_ON_MASK;

> > This is definitely looking like a SoC that could benefit from the
> > standard kernel power management infrastructure and/or DAPM.  There's a
> > lot of code here that looks like it's following very common SoC design
> > patterns and could benefit from using infrastructure more.

> Sorry, I didn't understand. Could you give some more pointers on this?
> The device can be powered off with this helper function, which can
> be called from the device 'suspend' callback.

Power domains are implemented using the interface in include/linux/pm_domain.h
and DAPM is sound/soc/soc-dapm.c.  The point here is that it looks very
much like you are open coding implementations of concepts that we
already have generic support for which adds greatly to both the size and
complexity of this code.

> >> +static void acp_turnoff_sram_banks(void __iomem *acp_mmio)
> >> +{
> >> +     /* Bank 0 : used for DMA descriptors
> >> +      * Bank 1 to 4 : used for playback
> >> +      * Bank 5 to 8 : used for capture
> >> +      * Each bank is 8kB and max size allocated for playback/ capture is
> >> +      * 16kB(max period size) * 2(max periods) reserved for playback/capture
> >> +      * in ALSA driver
> >> +      * Turn off all SRAM banks except above banks during playback/capture
> >> +      */
> >> +     u32 val, bank;

> > I didn't see any runtime management of the other SRAM bank power, seems
> > like that'd be a good idea?

> SRAM banks are part of ACP IP. With ACP's runtime PM handling, all blocks
> within ACP IP can be powered-off and on.

So why can't that cope with these banks then?

> >> +/* Update DMA position in audio ring buffer at period level granularity.
> >> + * This will be used by ALSA PCM driver
> >> + */
> >> +u32 acp_update_dma_pointer(void __iomem *acp_mmio, int direction,
> >> +                               u32 period_size)
> >> +{

> > Don't limit the accuracy to period level if the hardware can do better,
> > report the current position as accurately as possible please.  This is
> > also feeling like we've got an unneeded abstraction here - why was this
> > not just directly the pointer operation?

> The current version of the hardware is limited in how accurately it can
> report the position.

If the hardware is limited why does the comment suggest that we are
limiting based on periods?

> I will remove abstraction, if suggested.

Yes, it's just adding to the complexity of the code.
Alex Deucher Oct. 23, 2015, 8:39 p.m. UTC | #4
On Fri, Oct 23, 2015 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote:
> On Sat, Oct 24, 2015 at 12:20:09AM +0530, maruthi srinivas wrote:
>> On Thu, Oct 22, 2015 at 9:44 PM, Mark Brown <broonie@kernel.org> wrote:
>> > On Thu, Oct 08, 2015 at 12:12:40PM -0400, Alex Deucher wrote:
>
>> >> +/* ACP DMA irq handler routine for playback, capture usecases */
>> >> +int dma_irq_handler(struct device *dev)
>> >> +{
>> >> +     u16 dscr_idx;
>> >> +     u32 intr_flag;
>> >
>> > This says it's an interrupt handler but it's using some custom,
>> > non-genirq interface?
>
>> Irq handling is based on parent device's (part of other subsystem)
>> provided interfaces. I will coordinate with others for this.
>
>> Do you mean, using virtual irq assignment for MFD devices
>> (ACP is a MFD device) and registering irq handler for it ?
>
> Well, I'd expect that if we're exporting interrupts around the system
> we'd be doing so using genirq rather than open coding something.

I think the problem is that, in a lot of cases, it's not readily
clear that there is existing infrastructure to do something.  In this
particular case, most of the common infrastructure that should be
utilized for this patch set comes from non-traditional x86 platforms,
so those of us who come from a more traditional x86 background are
not as familiar with it.  Searching for information on how to solve
these problems does not always produce particularly useful results
(e.g., genirq).  So we end up open coding a solution, not out of
malice, but out of ignorance.  If there is infrastructure we should
be using, please continue to point it out during code review and
we'll do our best to take advantage of it.

In the case of this hardware, audio interrupts are triggered on the
GPU.  The GPU driver's interrupt handler checks the interrupt source
and calls the handler registered to handle that source.  Until the
ACP audio block was added, all the GPU interrupt sources were
handled directly by the GPU driver (vblanks, display hotplug, command
submission fences, etc.).
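
As an illustration of the dispatch scheme described here, a minimal
userspace C sketch follows. It is not the GPU driver's code: the table
size and the names src_register(), src_dispatch() and count_irq() are
made up for this example. Only the source ID value 0xa2 is taken from
the VISLANDS30_IV_SRCID_ACP define in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SRC_ID 256   /* assumed table size, for illustration only */
#define SRCID_ACP  0xa2  /* value of VISLANDS30_IV_SRCID_ACP in the patch */

typedef int (*src_handler_t)(void *data);

struct src_entry {
	src_handler_t handler;
	void *data;
};

static struct src_entry src_table[MAX_SRC_ID];

/* Register a handler for one interrupt source; returns 0 on success. */
int src_register(unsigned int src_id, src_handler_t handler, void *data)
{
	if (src_id >= MAX_SRC_ID || handler == NULL)
		return -1;
	if (src_table[src_id].handler != NULL)
		return -1;	/* source already claimed */
	src_table[src_id].handler = handler;
	src_table[src_id].data = data;
	return 0;
}

/* Called by the top-level handler once the source ID has been decoded. */
int src_dispatch(unsigned int src_id)
{
	if (src_id >= MAX_SRC_ID || src_table[src_id].handler == NULL)
		return -1;	/* nobody registered for this source */
	return src_table[src_id].handler(src_table[src_id].data);
}

/* Example handler standing in for the audio DMA handler: counts calls. */
int count_irq(void *data)
{
	int *count = data;

	assert(count != NULL);
	(*count)++;
	return 0;
}
```

This hand-rolled table is the kind of thing genirq already provides,
which is the point being made in the review.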

Thanks,

Alex


maruthi srinivas Oct. 23, 2015, 8:51 p.m. UTC | #5
On Sat, Oct 24, 2015 at 1:01 AM, Mark Brown <broonie@kernel.org> wrote:
> On Sat, Oct 24, 2015 at 12:20:09AM +0530, maruthi srinivas wrote:
>> On Thu, Oct 22, 2015 at 9:44 PM, Mark Brown <broonie@kernel.org> wrote:
>> > On Thu, Oct 08, 2015 at 12:12:40PM -0400, Alex Deucher wrote:
>
> Please document this clearly - your comment doesn't appear to relate to
> the case where system resume powers things on at all.
>
ok.

>> >> +/* Initialize the dma descriptors location in SRAM and page size */
>> >> +static void acp_dma_descr_init(void __iomem *acp_mmio)
>> >> +{
>> >> +     u32 sram_pte_offset = 0;
>> >> +
>> >> +     /* SRAM starts at 0x04000000. From that offset one page (4KB) left for
>> >> +      * filling DMA descriptors.sram_pte_offset = 0x04001000 , used for
>
>> > This is a device relative address rather than an absolute address?  A
>> > lot of these numbers seem kind of large...
>
>> That is SRAM block's offset address. Sorry. I didn't understand the
>> expected change, here.
>
> Offset with regard to what?  I'm asking if these addresses are within
> the IP or global.

The addresses mentioned are offsets within the ACP IP.
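
To make the numbers in the quoted comment concrete: with the SRAM block
at offset 0x04000000 inside the ACP and the first 4KB page left for DMA
descriptors, the next usable location is 0x04001000. A small userspace
sketch of that arithmetic follows. The macro and function names are
hypothetical and chosen for illustration only; the driver itself works
through the ioremapped acp_mmio pointer.

```c
#include <assert.h>
#include <stdint.h>

#define ACP_SRAM_OFFSET  0x04000000u	/* SRAM start, relative to the ACP IP */
#define ACP_SRAM_PAGE    0x1000u	/* 4KB page */
#define ACP_DESCR_PAGES  1u		/* pages left for DMA descriptors */

/* Offset (within the ACP IP) of the n-th 4KB page of SRAM. */
uint32_t acp_sram_page_offset(uint32_t page)
{
	/* guard against 32-bit wrap-around */
	assert(page <= (UINT32_MAX - ACP_SRAM_OFFSET) / ACP_SRAM_PAGE);
	return ACP_SRAM_OFFSET + page * ACP_SRAM_PAGE;
}

/* Offset of the first location after the descriptor area. */
uint32_t acp_sram_pte_offset(void)
{
	return acp_sram_page_offset(ACP_DESCR_PAGES);
}
```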
>

>> >> +static void acp_turnoff_sram_banks(void __iomem *acp_mmio)
>> >> +{
>> >> +     /* Bank 0 : used for DMA descriptors
>> >> +      * Bank 1 to 4 : used for playback
>> >> +      * Bank 5 to 8 : used for capture
>> >> +      * Each bank is 8kB and max size allocated for playback/ capture is
>> >> +      * 16kB(max period size) * 2(max periods) reserved for playback/capture
>> >> +      * in ALSA driver
>> >> +      * Turn off all SRAM banks except above banks during playback/capture
>> >> +      */
>> >> +     u32 val, bank;
>
>> > I didn't see any runtime management of the other SRAM bank power, seems
>> > like that'd be a good idea?
>
>> SRAM banks are part of ACP IP. With ACP's runtime PM handling, all blocks
>> within ACP IP can be powered-off and on.
>
> So why can't that cope with these banks then?

Maybe I was not clear before. I mean that memory banks which won't be
needed at all are turned off permanently. Using runtime PM, when the
complete ACP IP gets powered off, all the banks (including the ones used
for playback/capture) within the IP are turned off. When the IP is runtime
resumed, all banks get turned on, and the unused banks are then turned
off again. With this, I am trying to achieve runtime management.

>
>> >> +/* Update DMA position in audio ring buffer at period level granularity.
>> >> + * This will be used by ALSA PCM driver
>> >> + */
>> >> +u32 acp_update_dma_pointer(void __iomem *acp_mmio, int direction,
>> >> +                               u32 period_size)
>> >> +{
>
>> > Don't limit the accuracy to period level if the hardware can do better,
>> > report the current position as accurately as possible please.  This is
>> > also feeling like we've got an unneeded abstraction here - why was this
>> > not just directly the pointer operation?
>
>> The current version of the hardware is limited in how accurately it can
>> report the position.
>
> If the hardware is limited why does the comment suggest that we are
> limiting based on periods?

I will modify accordingly.
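
For reference, a minimal sketch of what period-level granularity means
for the value returned by the pointer callback, assuming (as stated
above) that the hardware only reports which period has completed. The
function name and parameters are hypothetical and do not correspond to
acp_update_dma_pointer() in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte position in the ring buffer when only the number of completed
 * periods is known. The result is always a multiple of period_bytes and
 * wraps at the end of the buffer (num_periods * period_bytes).
 */
uint32_t period_granular_pos(uint32_t periods_done, uint32_t period_bytes,
			     uint32_t num_periods)
{
	assert(num_periods > 0);
	return (periods_done % num_periods) * period_bytes;
}
```

With the limits in the patch (2 periods of at most 16384 bytes) such a
position can only ever be 0 or 16384, which is why a finer-grained
hardware position would be preferable if one were available.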
Mark Brown Oct. 23, 2015, 9:17 p.m. UTC | #6
On Fri, Oct 23, 2015 at 04:39:07PM -0400, Alex Deucher wrote:

> something.  In this particular case, most of the common infrastructure
> that should be utilized for this particular patch set comes from
> non-traditional x86 platforms so most of us that come from a more
> traditional x86 background are not as familiar with them.  Searching

A lot of this infrastructure (like genirq) comes as much from the x86
side as anywhere else :(
Mark Brown Oct. 23, 2015, 9:18 p.m. UTC | #7
On Sat, Oct 24, 2015 at 02:21:04AM +0530, maruthi srinivas wrote:
> On Sat, Oct 24, 2015 at 1:01 AM, Mark Brown <broonie@kernel.org> wrote:

> >> >> +static void acp_turnoff_sram_banks(void __iomem *acp_mmio)
> >> >> +{
> >> >> +     /* Bank 0 : used for DMA descriptors
> >> >> +      * Bank 1 to 4 : used for playback
> >> >> +      * Bank 5 to 8 : used for capture

> >> SRAM banks are part of ACP IP. With ACP's runtime PM handling, all blocks
> >> within ACP IP can be powered-off and on.

> > So why can't that cope with these banks then?

> Maybe I was not clear before. I mean that memory banks which won't be
> needed at all are turned off permanently. Using runtime PM, when the
> complete ACP IP gets powered off, all the banks (including the ones used
> for playback/capture) within the IP are turned off. When the IP is runtime
> resumed, all banks get turned on, and the unused banks are then turned
> off again. With this, I am trying to achieve runtime management.

So the initialisation that's done to power off the unused memory banks
somehow gets preserved when the block is powered off during runtime
power management?  It's really not clear, this looks like it's only
called once on init.

Obviously it's also less power efficient than it could be too since it's
going to (so far as I can tell) keep the playback and capture areas
powered up even when only one is in use.
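
One way to picture the finer-grained scheme suggested here is to derive
the set of banks that must stay powered from the streams that are
actually in use. The sketch below is a userspace model only. The bank
numbering follows the comment in acp_turnoff_sram_banks() (bank 0 for
descriptors, 1 to 4 for playback, 5 to 8 for capture); the function
names and the one-bit-per-bank mask are assumptions for illustration
and say nothing about how the hardware registers are laid out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BANK_SIZE        (8u * 1024u)	/* each bank is 8kB */
#define STREAM_BUF_SIZE  (16384u * 2u)	/* max period size * max periods */
#define BANKS_PER_STREAM (STREAM_BUF_SIZE / BANK_SIZE)

#define DESCR_BANK       0u
#define PLAYBACK_BANK0   1u
#define CAPTURE_BANK0    (PLAYBACK_BANK0 + BANKS_PER_STREAM)

/* Bitmask with one bit set per bank in first .. first + count - 1. */
static uint32_t bank_range(uint32_t first, uint32_t count)
{
	assert(first + count <= 32u);
	return (count == 32u ? UINT32_MAX : ((1u << count) - 1u)) << first;
}

/* Banks that must stay powered for the given set of active streams. */
uint32_t banks_needed(bool playback, bool capture)
{
	uint32_t mask = 0;

	if (playback || capture)
		mask |= bank_range(DESCR_BANK, 1u);
	if (playback)
		mask |= bank_range(PLAYBACK_BANK0, BANKS_PER_STREAM);
	if (capture)
		mask |= bank_range(CAPTURE_BANK0, BANKS_PER_STREAM);
	return mask;
}
```

In this model a playback-only use case keeps banks 0 to 4 powered and
leaves the capture banks 5 to 8 off.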

Patch

diff --git a/sound/soc/Kconfig b/sound/soc/Kconfig
index 225bfda..a278840 100644
--- a/sound/soc/Kconfig
+++ b/sound/soc/Kconfig
@@ -35,6 +35,7 @@  config SND_SOC_TOPOLOGY
 
 # All the supported SoCs
 source "sound/soc/adi/Kconfig"
+source "sound/soc/amd/Kconfig"
 source "sound/soc/atmel/Kconfig"
 source "sound/soc/au1x/Kconfig"
 source "sound/soc/bcm/Kconfig"
diff --git a/sound/soc/Makefile b/sound/soc/Makefile
index 134aca1..5927544 100644
--- a/sound/soc/Makefile
+++ b/sound/soc/Makefile
@@ -17,6 +17,7 @@  obj-$(CONFIG_SND_SOC)	+= snd-soc-core.o
 obj-$(CONFIG_SND_SOC)	+= codecs/
 obj-$(CONFIG_SND_SOC)	+= generic/
 obj-$(CONFIG_SND_SOC)	+= adi/
+obj-$(CONFIG_SND_SOC)	+= amd/
 obj-$(CONFIG_SND_SOC)	+= atmel/
 obj-$(CONFIG_SND_SOC)	+= au1x/
 obj-$(CONFIG_SND_SOC)	+= bcm/
diff --git a/sound/soc/amd/Kconfig b/sound/soc/amd/Kconfig
new file mode 100644
index 0000000..78187eb
--- /dev/null
+++ b/sound/soc/amd/Kconfig
@@ -0,0 +1,4 @@ 
+config SND_SOC_AMD_ACP
+	tristate "AMD Audio Coprocessor support"
+	help
+	 This option enables ACP DMA support on AMD platform.
diff --git a/sound/soc/amd/Makefile b/sound/soc/amd/Makefile
new file mode 100644
index 0000000..62648cb
--- /dev/null
+++ b/sound/soc/amd/Makefile
@@ -0,0 +1,3 @@ 
+snd-soc-acp-pcm-objs	:= acp-pcm-dma.o acp.o
+
+obj-$(CONFIG_SND_SOC_AMD_ACP) += snd-soc-acp-pcm.o
diff --git a/sound/soc/amd/acp-pcm-dma.c b/sound/soc/amd/acp-pcm-dma.c
new file mode 100644
index 0000000..5044188
--- /dev/null
+++ b/sound/soc/amd/acp-pcm-dma.c
@@ -0,0 +1,518 @@ 
+/*
+ * AMD ALSA SoC PCM Driver
+ *
+ * Copyright 2014-2015 Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/platform_device.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/pci.h>
+#include <linux/pm_runtime.h>
+#include <linux/mfd/amd_acp.h>
+
+#include <sound/pcm.h>
+#include <sound/pcm_params.h>
+#include <sound/soc.h>
+
+#include "acp.h"
+
+#define PLAYBACK_MIN_NUM_PERIODS    2
+#define PLAYBACK_MAX_NUM_PERIODS    2
+#define PLAYBACK_MAX_PERIOD_SIZE    16384
+#define PLAYBACK_MIN_PERIOD_SIZE    1024
+#define CAPTURE_MIN_NUM_PERIODS     2
+#define CAPTURE_MAX_NUM_PERIODS     2
+#define CAPTURE_MAX_PERIOD_SIZE     16384
+#define CAPTURE_MIN_PERIOD_SIZE     1024
+
+#define NUM_DSCRS_PER_CHANNEL 2
+
+#define MAX_BUFFER (PLAYBACK_MAX_PERIOD_SIZE * PLAYBACK_MAX_NUM_PERIODS)
+#define MIN_BUFFER MAX_BUFFER
+
+static const struct snd_pcm_hardware acp_pcm_hardware_playback = {
+	.info = SNDRV_PCM_INFO_INTERLEAVED |
+		SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP |
+		SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_BATCH |
+		SNDRV_PCM_INFO_PAUSE | SNDRV_PCM_INFO_RESUME,
+	.formats = SNDRV_PCM_FMTBIT_S16_LE |
+		SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE,
+	.channels_min = 1,
+	.channels_max = 8,
+	.rates = SNDRV_PCM_RATE_8000_96000,
+	.rate_min = 8000,
+	.rate_max = 96000,
+	.buffer_bytes_max = PLAYBACK_MAX_NUM_PERIODS * PLAYBACK_MAX_PERIOD_SIZE,
+	.period_bytes_min = PLAYBACK_MIN_PERIOD_SIZE,
+	.period_bytes_max = PLAYBACK_MAX_PERIOD_SIZE,
+	.periods_min = PLAYBACK_MIN_NUM_PERIODS,
+	.periods_max = PLAYBACK_MAX_NUM_PERIODS,
+};
+
+static const struct snd_pcm_hardware acp_pcm_hardware_capture = {
+	.info = SNDRV_PCM_INFO_INTERLEAVED |
+		SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP |
+		SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_BATCH |
+		SNDRV_PCM_INFO_PAUSE | SNDRV_PCM_INFO_RESUME,
+	.formats = SNDRV_PCM_FMTBIT_S16_LE |
+		SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE,
+	.channels_min = 1,
+	.channels_max = 2,
+	.rates = SNDRV_PCM_RATE_8000_48000,
+	.rate_min = 8000,
+	.rate_max = 48000,
+	.buffer_bytes_max = CAPTURE_MAX_NUM_PERIODS * CAPTURE_MAX_PERIOD_SIZE,
+	.period_bytes_min = CAPTURE_MIN_PERIOD_SIZE,
+	.period_bytes_max = CAPTURE_MAX_PERIOD_SIZE,
+	.periods_min = CAPTURE_MIN_NUM_PERIODS,
+	.periods_max = CAPTURE_MAX_NUM_PERIODS,
+};
+
+struct audio_drv_data {
+	struct snd_pcm_substream *play_stream;
+	struct snd_pcm_substream *capture_stream;
+	struct acp_irq_prv *iprv;
+	void __iomem *acp_mmio;
+};
+
+/* ACP DMA irq handler routine for playback, capture usecases */
+int dma_irq_handler(struct device *dev)
+{
+	u16 dscr_idx;
+	u32 intr_flag;
+
+	int priority_level = 0;
+	struct audio_drv_data *irq_data = dev_get_drvdata(dev);
+	void __iomem *acp_mmio = irq_data->acp_mmio;
+
+	intr_flag = acp_get_intr_flag(acp_mmio);
+
+	if ((intr_flag & BIT(ACP_TO_I2S_DMA_CH_NUM)) != 0) {
+		dscr_idx = get_dscr_idx(acp_mmio, SNDRV_PCM_STREAM_PLAYBACK);
+		config_acp_dma_channel(acp_mmio, SYSRAM_TO_ACP_CH_NUM, dscr_idx,
+				       1, priority_level);
+		acp_dma_start(acp_mmio, SYSRAM_TO_ACP_CH_NUM, false);
+
+		snd_pcm_period_elapsed(irq_data->play_stream);
+		acp_ext_stat_clear_dmaioc(acp_mmio, ACP_TO_I2S_DMA_CH_NUM);
+	}
+
+	if ((intr_flag & BIT(I2S_TO_ACP_DMA_CH_NUM)) != 0) {
+		dscr_idx = get_dscr_idx(acp_mmio, SNDRV_PCM_STREAM_CAPTURE);
+		config_acp_dma_channel(acp_mmio, ACP_TO_SYSRAM_CH_NUM, dscr_idx,
+				       1, priority_level);
+		acp_dma_start(acp_mmio, ACP_TO_SYSRAM_CH_NUM, false);
+		acp_ext_stat_clear_dmaioc(acp_mmio, I2S_TO_ACP_DMA_CH_NUM);
+	}
+
+	if ((intr_flag & BIT(ACP_TO_SYSRAM_CH_NUM)) != 0) {
+		snd_pcm_period_elapsed(irq_data->capture_stream);
+		acp_ext_stat_clear_dmaioc(acp_mmio, ACP_TO_SYSRAM_CH_NUM);
+	}
+
+	return 0;
+}
+
+static int acp_dma_open(struct snd_pcm_substream *substream)
+{
+	int ret = 0;
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	struct snd_soc_pcm_runtime *prtd = substream->private_data;
+	struct audio_drv_data *intr_data = dev_get_drvdata(prtd->platform->dev);
+
+	struct audio_substream_data *adata =
+		kzalloc(sizeof(struct audio_substream_data), GFP_KERNEL);
+	if (adata == NULL)
+		return -ENOMEM;
+
+	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
+		runtime->hw = acp_pcm_hardware_playback;
+	else
+		runtime->hw = acp_pcm_hardware_capture;
+
+	ret = snd_pcm_hw_constraint_integer(runtime,
+					    SNDRV_PCM_HW_PARAM_PERIODS);
+	if (ret < 0) {
+		dev_err(prtd->platform->dev, "set integer constraint failed\n");
+		return ret;
+	}
+
+	adata->acp_mmio = intr_data->acp_mmio;
+	runtime->private_data = adata;
+
+	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
+		intr_data->play_stream = substream;
+	else
+		intr_data->capture_stream = substream;
+
+	return 0;
+}
+
+static int acp_dma_hw_params(struct snd_pcm_substream *substream,
+			     struct snd_pcm_hw_params *params)
+{
+	int status;
+	uint64_t size;
+	struct snd_dma_buffer *dma_buffer;
+	struct page *pg;
+	u16 num_of_pages;
+	struct snd_pcm_runtime *runtime;
+	struct audio_substream_data *rtd;
+
+	dma_buffer = &substream->dma_buffer;
+
+	runtime = substream->runtime;
+	rtd = runtime->private_data;
+
+	if (WARN_ON(!rtd))
+		return -EINVAL;
+
+	size = params_buffer_bytes(params);
+	status = snd_pcm_lib_malloc_pages(substream, size);
+	if (status < 0)
+		return status;
+
+	memset(substream->runtime->dma_area, 0, params_buffer_bytes(params));
+	pg = virt_to_page(substream->dma_buffer.area);
+
+	if (pg != NULL) {
+		/* Save for runtime private data */
+		rtd->pg = pg;
+		rtd->order = get_order(size);
+
+		/* Let ACP know the Allocated memory */
+		num_of_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+		/* Fill the page table entries in ACP SRAM */
+		rtd->pg = pg;
+		rtd->size = size;
+		rtd->num_of_pages = num_of_pages;
+		rtd->direction = substream->stream;
+
+		config_acp_dma(rtd->acp_mmio, rtd);
+		status = 0;
+	} else {
+		status = -ENOMEM;
+	}
+	return status;
+}
+
+static int acp_dma_hw_free(struct snd_pcm_substream *substream)
+{
+	return snd_pcm_lib_free_pages(substream);
+}
+
+static snd_pcm_uframes_t acp_dma_pointer(struct snd_pcm_substream *substream)
+{
+	u32 pos = 0;
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	struct audio_substream_data *rtd = runtime->private_data;
+
+	pos = acp_update_dma_pointer(rtd->acp_mmio, substream->stream,
+				frames_to_bytes(runtime, runtime->period_size));
+	return bytes_to_frames(runtime, pos);
+
+}
+
+static int acp_dma_mmap(struct snd_pcm_substream *substream,
+			struct vm_area_struct *vma)
+{
+	return snd_pcm_lib_default_mmap(substream, vma);
+}
+
+static int acp_dma_prepare(struct snd_pcm_substream *substream)
+{
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	struct audio_substream_data *rtd = runtime->private_data;
+
+	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+		config_acp_dma_channel(rtd->acp_mmio, SYSRAM_TO_ACP_CH_NUM,
+					PLAYBACK_START_DMA_DESCR_CH12,
+					NUM_DSCRS_PER_CHANNEL, 0);
+		config_acp_dma_channel(rtd->acp_mmio, ACP_TO_I2S_DMA_CH_NUM,
+					PLAYBACK_START_DMA_DESCR_CH13,
+					NUM_DSCRS_PER_CHANNEL, 0);
+		/* Fill ACP SRAM (2 periods) with zeros from System RAM
+		 * which is zero-ed in hw_params */
+		acp_dma_start(rtd->acp_mmio, SYSRAM_TO_ACP_CH_NUM, false);
+
+		/* ACP SRAM (2 periods of buffer size) is initially filled with
+		 * zeros. Before rendering starts, 2nd half of SRAM will be
+		 * filled with valid audio data DMA'ed from first half of system
+		 * RAM and 1st half of SRAM will be filled with Zeros. This is
+		 * the initial scenario when rendering starts from SRAM. Later
+		 * on, 2nd half of system memory will be DMA'ed to 1st half of
+		 * SRAM, 1st half of system memory will be DMA'ed to 2nd half of
+		 * SRAM in ping-pong way till rendering stops. */
+		config_acp_dma_channel(rtd->acp_mmio, SYSRAM_TO_ACP_CH_NUM,
+					PLAYBACK_START_DMA_DESCR_CH12,
+					1, 0);
+	} else {
+		config_acp_dma_channel(rtd->acp_mmio, ACP_TO_SYSRAM_CH_NUM,
+					CAPTURE_START_DMA_DESCR_CH14,
+					NUM_DSCRS_PER_CHANNEL, 0);
+		config_acp_dma_channel(rtd->acp_mmio, I2S_TO_ACP_DMA_CH_NUM,
+					CAPTURE_START_DMA_DESCR_CH15,
+					NUM_DSCRS_PER_CHANNEL, 0);
+	}
+	return 0;
+}
+
+static int acp_dma_trigger(struct snd_pcm_substream *substream, int cmd)
+{
+	int ret;
+
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	struct audio_substream_data *rtd = runtime->private_data;
+	struct snd_soc_pcm_runtime *prtd = substream->private_data;
+	struct amd_acp_device *acp_dev = dev_get_platdata(prtd->platform->dev);
+
+	if (!rtd)
+		return -EINVAL;
+	switch (cmd) {
+	case SNDRV_PCM_TRIGGER_START:
+	case SNDRV_PCM_TRIGGER_RESUME:
+	case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
+		if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+			acp_dma_start(rtd->acp_mmio,
+						SYSRAM_TO_ACP_CH_NUM, false);
+			prebuffer_audio(rtd->acp_mmio);
+			acp_dma_start(rtd->acp_mmio,
+					ACP_TO_I2S_DMA_CH_NUM, true);
+			acp_dev->irq_get(acp_dev);
+
+		} else {
+			acp_dma_start(rtd->acp_mmio,
+					    I2S_TO_ACP_DMA_CH_NUM, true);
+			acp_dev->irq_get(acp_dev);
+		}
+		ret = 0;
+		break;
+	case SNDRV_PCM_TRIGGER_STOP:
+	case SNDRV_PCM_TRIGGER_SUSPEND:
+	case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
+		if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+			acp_dma_stop(rtd->acp_mmio, SYSRAM_TO_ACP_CH_NUM);
+			acp_dma_stop(rtd->acp_mmio, ACP_TO_I2S_DMA_CH_NUM);
+			acp_dev->irq_put(acp_dev);
+		} else {
+			acp_dma_stop(rtd->acp_mmio, I2S_TO_ACP_DMA_CH_NUM);
+			acp_dma_stop(rtd->acp_mmio, ACP_TO_SYSRAM_CH_NUM);
+			acp_dev->irq_put(acp_dev);
+		}
+		ret = 0;
+		break;
+	default:
+		ret = -EINVAL;
+
+	}
+	return ret;
+}
+
+static int acp_dma_new(struct snd_soc_pcm_runtime *rtd)
+{
+	return snd_pcm_lib_preallocate_pages_for_all(rtd->pcm,
+							SNDRV_DMA_TYPE_DEV,
+							NULL, MIN_BUFFER,
+							MAX_BUFFER);
+}
+
+static int acp_dma_close(struct snd_pcm_substream *substream)
+{
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	struct audio_substream_data *rtd = runtime->private_data;
+	struct snd_soc_pcm_runtime *prtd = substream->private_data;
+
+	kfree(rtd);
+
+	pm_runtime_mark_last_busy(prtd->platform->dev);
+	return 0;
+}
+
+static struct snd_pcm_ops acp_dma_ops = {
+	.open = acp_dma_open,
+	.close = acp_dma_close,
+	.ioctl = snd_pcm_lib_ioctl,
+	.hw_params = acp_dma_hw_params,
+	.hw_free = acp_dma_hw_free,
+	.trigger = acp_dma_trigger,
+	.pointer = acp_dma_pointer,
+	.mmap = acp_dma_mmap,
+	.prepare = acp_dma_prepare,
+};
+
+static struct snd_soc_platform_driver acp_asoc_platform = {
+	.ops = &acp_dma_ops,
+	.pcm_new = acp_dma_new,
+};
+
+static int acp_audio_probe(struct platform_device *pdev)
+{
+	int status;
+	struct audio_drv_data *audio_drv_data;
+	struct resource *res;
+	struct amd_acp_device *acp_dev = dev_get_platdata(&pdev->dev);
+
+	audio_drv_data = devm_kzalloc(&pdev->dev, sizeof(struct audio_drv_data),
+					GFP_KERNEL);
+	if (audio_drv_data == NULL)
+		return -ENOMEM;
+
+	audio_drv_data->iprv = devm_kzalloc(&pdev->dev,
+						sizeof(struct acp_irq_prv),
+						GFP_KERNEL);
+	if (audio_drv_data->iprv == NULL)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	audio_drv_data->acp_mmio = devm_ioremap_resource(&pdev->dev, res);
+
+	/* The following members get populated in device 'open'
+	 * function. Till then interrupts are disabled in 'acp_hw_init'
+	 * and device doesn't generate any interrupts.
+	 */
+
+	audio_drv_data->play_stream = NULL;
+	audio_drv_data->capture_stream = NULL;
+
+	audio_drv_data->iprv->dev = &pdev->dev;
+	audio_drv_data->iprv->acp_mmio = audio_drv_data->acp_mmio;
+	audio_drv_data->iprv->enable_intr = acp_enable_external_interrupts;
+	audio_drv_data->iprv->irq_handler = dma_irq_handler;
+
+	dev_set_drvdata(&pdev->dev, audio_drv_data);
+
+	/* Initialize the ACP */
+	acp_hw_init(audio_drv_data->acp_mmio);
+
+	acp_dev->irq_register(acp_dev, audio_drv_data->iprv);
+
+	status = snd_soc_register_platform(&pdev->dev, &acp_asoc_platform);
+	if (0 != status) {
+		dev_err(&pdev->dev, "Fail to register ALSA platform device\n");
+		return status;
+	}
+
+	pm_runtime_set_autosuspend_delay(&pdev->dev, 10000);
+	pm_runtime_use_autosuspend(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+
+	return status;
+}
+
+static int acp_audio_remove(struct platform_device *pdev)
+{
+	struct audio_drv_data *adata = dev_get_drvdata(&pdev->dev);
+
+	acp_hw_deinit(adata->acp_mmio);
+	snd_soc_unregister_platform(&pdev->dev);
+	pm_runtime_disable(&pdev->dev);
+
+	return 0;
+}
+
+static int acp_pcm_suspend(struct device *dev)
+{
+	bool pm_rts;
+	struct audio_drv_data *adata = dev_get_drvdata(dev);
+
+	pm_rts = pm_runtime_status_suspended(dev);
+	if (pm_rts == false)
+		acp_suspend(adata->acp_mmio);
+
+	return 0;
+}
+
+static int acp_pcm_resume(struct device *dev)
+{
+	bool pm_rts;
+	struct snd_pcm_substream *stream;
+	struct snd_pcm_runtime *rtd;
+	struct audio_substream_data *sdata;
+	struct audio_drv_data *adata = dev_get_drvdata(dev);
+
+	pm_rts = pm_runtime_status_suspended(dev);
+	if (pm_rts == true) {
+		/* Resumed from system wide suspend and there is
+		 * no pending audio activity to resume. */
+		pm_runtime_disable(dev);
+		pm_runtime_set_active(dev);
+		pm_runtime_enable(dev);
+
+		goto out;
+	}
+
+	acp_resume(adata->acp_mmio);
+
+	stream = adata->play_stream;
+	rtd = stream ? stream->runtime : NULL;
+	if (rtd != NULL) {
+		/* Resume playback stream from a suspended state */
+		sdata = rtd->private_data;
+		config_acp_dma(adata->acp_mmio, sdata);
+	}
+
+	stream = adata->capture_stream;
+	rtd =  stream ? stream->runtime : NULL;
+	if (rtd != NULL) {
+		/* Resume capture stream from a suspended state */
+		sdata = rtd->private_data;
+		config_acp_dma(adata->acp_mmio, sdata);
+	}
+out:
+	return 0;
+}
+
+static int acp_pcm_runtime_suspend(struct device *dev)
+{
+	struct audio_drv_data *adata = dev_get_drvdata(dev);
+
+	acp_suspend(adata->acp_mmio);
+	return 0;
+}
+
+static int acp_pcm_runtime_resume(struct device *dev)
+{
+	struct audio_drv_data *adata = dev_get_drvdata(dev);
+
+	acp_resume(adata->acp_mmio);
+	return 0;
+}
+
+static const struct dev_pm_ops acp_pm_ops = {
+	.suspend = acp_pcm_suspend,
+	.resume = acp_pcm_resume,
+	.runtime_suspend = acp_pcm_runtime_suspend,
+	.runtime_resume = acp_pcm_runtime_resume,
+};
+
+static struct platform_driver acp_dma_driver = {
+	.probe = acp_audio_probe,
+	.remove = acp_audio_remove,
+	.driver = {
+		.name = "acp_audio_dma",
+		.pm = &acp_pm_ops,
+	},
+};
+
+module_platform_driver(acp_dma_driver);
+
+MODULE_AUTHOR("Maruthi.Bayyavarapu@amd.com");
+MODULE_DESCRIPTION("AMD ACP PCM Driver");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:acp-dma-audio");
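For readers following the PCM side: acp_update_dma_pointer() in acp.c (its comment says it is meant for the ALSA PCM driver) reports position at period granularity from the current descriptor index. A minimal standalone sketch of that mapping follows. The helper name acp_pos_from_dscr and the capture_dma_active flag (standing in for the "ACP_DMA_CNTL_14 != 0" check) are hypothetical; the descriptor indices are copied from acp.h.

```c
#include <stdint.h>

/* Mirrors SNDRV_PCM_STREAM_PLAYBACK / SNDRV_PCM_STREAM_CAPTURE. */
enum { DIR_PLAYBACK = 0, DIR_CAPTURE = 1 };

/* Descriptor indices as defined in acp.h of this patch. */
#define PLAYBACK_START_DMA_DESCR_CH13	2
#define CAPTURE_START_DMA_DESCR_CH14	4

/* Hypothetical pure-function version of the position logic:
 * playback reports 0 while the first descriptor is active and one
 * period otherwise; capture reports one period while the first
 * descriptor is active and wraps to 0 otherwise (or when idle).
 */
static uint32_t acp_pos_from_dscr(int direction, uint16_t cur_dscr,
				  int capture_dma_active,
				  uint32_t period_size)
{
	if (direction == DIR_PLAYBACK)
		return (cur_dscr == PLAYBACK_START_DMA_DESCR_CH13) ?
			0 : period_size;

	if (!capture_dma_active)
		return 0;

	return (cur_dscr == CAPTURE_START_DMA_DESCR_CH14) ?
		period_size : 0;
}
```

This only restates the arithmetic; it performs no register access and is not part of the patch.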
diff --git a/sound/soc/amd/acp.c b/sound/soc/amd/acp.c
new file mode 100644
index 0000000..59ec312
--- /dev/null
+++ b/sound/soc/amd/acp.c
@@ -0,0 +1,736 @@ 
+/*
+ * AMD ACP module
+ *
+ * Copyright 2015 Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+*/
+
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/io.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/errno.h>
+#include <sound/asound.h>
+#include "acp.h"
+
+#include "include/acp_2_2_d.h"
+#include "include/acp_2_2_sh_mask.h"
+
+#define VISLANDS30_IV_SRCID_ACP 0x000000a2
+
+static u32 acp_reg_read(void __iomem *acp_mmio, u32 reg)
+{
+	return readl(acp_mmio + (reg * 4));
+}
+
+static void acp_reg_write(u32 val, void __iomem *acp_mmio, u32 reg)
+{
+	writel(val, acp_mmio + (reg * 4));
+}
+
+/* Configure a given DMA channel's parameters - enable/disable,
+ * number of descriptors, priority */
+void config_acp_dma_channel(void __iomem *acp_mmio, u8 ch_num,
+				   u16 dscr_strt_idx, u16 num_dscrs,
+				   enum acp_dma_priority_level priority_level)
+{
+	u32 dma_ctrl;
+
+	/* disable the channel run field */
+	dma_ctrl = acp_reg_read(acp_mmio, mmACP_DMA_CNTL_0 + ch_num);
+	dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChRun_MASK;
+	acp_reg_write(dma_ctrl, acp_mmio, mmACP_DMA_CNTL_0 + ch_num);
+
+	/* program a DMA channel with first descriptor to be processed. */
+	acp_reg_write((ACP_DMA_DSCR_STRT_IDX_0__DMAChDscrStrtIdx_MASK
+			& dscr_strt_idx),
+			acp_mmio, mmACP_DMA_DSCR_STRT_IDX_0 + ch_num);
+
+	/* program a DMA channel with the number of descriptors to be
+	 * processed in the transfer */
+	acp_reg_write(ACP_DMA_DSCR_CNT_0__DMAChDscrCnt_MASK & num_dscrs,
+		acp_mmio, mmACP_DMA_DSCR_CNT_0 + ch_num);
+
+	/* set DMA channel priority */
+	acp_reg_write(priority_level, acp_mmio, mmACP_DMA_PRIO_0 + ch_num);
+}
+
+/* Initialize the dma descriptors location in SRAM and page size */
+static void acp_dma_descr_init(void __iomem *acp_mmio)
+{
+	u32 sram_pte_offset = 0;
+
+	/* SRAM starts at 0x04000000. The first page (4KB) from that offset is
+	 * reserved for DMA descriptors. sram_pte_offset = 0x04001000 is used
+	 * for the page table entries of system RAM's physical pages.
+	 * This becomes the ALSA ring buffer start address.
+	 */
+	sram_pte_offset = ACP_DAGB_GRP_SRAM_BASE_ADDRESS;
+
+	/* snoopable */
+	sram_pte_offset |= ACP_DAGB_BASE_ADDR_GRP_1__AXI2DAGBSnoopSel_MASK;
+	/* Memory is system memory */
+	sram_pte_offset |= ACP_DAGB_BASE_ADDR_GRP_1__AXI2DAGBTargetMemSel_MASK;
+	/* Page Enabled */
+	sram_pte_offset |= ACP_DAGB_BASE_ADDR_GRP_1__AXI2DAGBGrpEnable_MASK;
+
+	acp_reg_write(sram_pte_offset,	acp_mmio, mmACP_DAGB_BASE_ADDR_GRP_1);
+	acp_reg_write(PAGE_SIZE_4K_ENABLE, acp_mmio,
+						mmACP_DAGB_PAGE_SIZE_GRP_1);
+}
+
+/* Initialize a dma descriptor in SRAM based on descriptor information passed */
+static void config_dma_descriptor_in_sram(void __iomem *acp_mmio,
+					  u16 descr_idx,
+					  acp_dma_dscr_transfer_t *descr_info)
+{
+	u32 sram_offset;
+
+	sram_offset = (descr_idx * sizeof(acp_dma_dscr_transfer_t));
+
+	/* program the source base address. */
+	acp_reg_write(sram_offset, acp_mmio, mmACP_SRBM_Targ_Idx_Addr);
+	acp_reg_write(descr_info->src,	acp_mmio, mmACP_SRBM_Targ_Idx_Data);
+	/* program the destination base address. */
+	acp_reg_write(sram_offset + 4,	acp_mmio, mmACP_SRBM_Targ_Idx_Addr);
+	acp_reg_write(descr_info->dest, acp_mmio, mmACP_SRBM_Targ_Idx_Data);
+
+	/* program the number of bytes to be transferred for this descriptor. */
+	acp_reg_write(sram_offset + 8,	acp_mmio, mmACP_SRBM_Targ_Idx_Addr);
+	acp_reg_write(descr_info->xfer_val, acp_mmio, mmACP_SRBM_Targ_Idx_Data);
+}
+
+/* Initialize the DMA descriptor information for transfer between
+ * system memory <-> ACP SRAM
+ */
+static void set_acp_sysmem_dma_descriptors(void __iomem *acp_mmio,
+					   u32 size, int direction,
+					   u32 pte_offset)
+{
+	u16 num_descr;
+	u16 dma_dscr_idx = PLAYBACK_START_DMA_DESCR_CH12;
+	acp_dma_dscr_transfer_t dmadscr[2];
+
+	num_descr = 2;
+
+	dmadscr[0].xfer_val = 0;
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		dma_dscr_idx = PLAYBACK_START_DMA_DESCR_CH12;
+		dmadscr[0].dest = ACP_SHARED_RAM_BANK_1_ADDRESS + (size / 2);
+		dmadscr[0].src = ACP_INTERNAL_APERTURE_WINDOW_0_ADDRESS +
+			(pte_offset * PAGE_SIZE_4K);
+		dmadscr[0].xfer_val |= (DISABLE << 22) |
+			(ACP_DMA_ATTRIBUTES_DAGB_ONION_TO_SHAREDMEM << 16) |
+			(size / 2);
+	} else {
+		dma_dscr_idx = CAPTURE_START_DMA_DESCR_CH14;
+		dmadscr[0].src = ACP_SHARED_RAM_BANK_5_ADDRESS;
+		dmadscr[0].dest = ACP_INTERNAL_APERTURE_WINDOW_0_ADDRESS +
+			(pte_offset * PAGE_SIZE_4K);
+		dmadscr[0].xfer_val |=
+			(ENABLE << 22) |
+			(ACP_DMA_ATTRIBUTES_SHAREDMEM_TO_DAGB_ONION << 16) |
+			(size / 2);
+	}
+
+	config_dma_descriptor_in_sram(acp_mmio, dma_dscr_idx, &dmadscr[0]);
+
+	dmadscr[1].xfer_val = 0;
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		dma_dscr_idx = PLAYBACK_END_DMA_DESCR_CH12;
+		dmadscr[1].dest = ACP_SHARED_RAM_BANK_1_ADDRESS;
+		dmadscr[1].src = ACP_INTERNAL_APERTURE_WINDOW_0_ADDRESS +
+			(pte_offset * PAGE_SIZE_4K) + (size / 2);
+		dmadscr[1].xfer_val |= (DISABLE << 22) |
+			(ACP_DMA_ATTRIBUTES_DAGB_ONION_TO_SHAREDMEM << 16) |
+			(size / 2);
+	} else {
+		dma_dscr_idx = CAPTURE_END_DMA_DESCR_CH14;
+		dmadscr[1].dest = dmadscr[0].dest + (size / 2);
+		dmadscr[1].src = dmadscr[0].src + (size / 2);
+		dmadscr[1].xfer_val |= (ENABLE << 22) |
+			(ACP_DMA_ATTRIBUTES_SHAREDMEM_TO_DAGB_ONION << 16) |
+			(size / 2);
+	}
+
+	config_dma_descriptor_in_sram(acp_mmio, dma_dscr_idx, &dmadscr[1]);
+
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		/* starting descriptor for this channel */
+		dma_dscr_idx = PLAYBACK_START_DMA_DESCR_CH12;
+		config_acp_dma_channel(acp_mmio, SYSRAM_TO_ACP_CH_NUM,
+					dma_dscr_idx, num_descr,
+					ACP_DMA_PRIORITY_LEVEL_NORMAL);
+	} else {
+		/* starting descriptor for this channel */
+		dma_dscr_idx = CAPTURE_START_DMA_DESCR_CH14;
+		config_acp_dma_channel(acp_mmio, ACP_TO_SYSRAM_CH_NUM,
+					dma_dscr_idx, num_descr,
+					ACP_DMA_PRIORITY_LEVEL_NORMAL);
+	}
+}
+
+/* Initialize the DMA descriptor information for transfer between
+ * ACP SRAM <-> I2S
+ */
+static void set_acp_to_i2s_dma_descriptors(void __iomem *acp_mmio,
+					   u32 size, int direction)
+{
+
+	u16 num_descr;
+	u16 dma_dscr_idx = PLAYBACK_START_DMA_DESCR_CH13;
+	acp_dma_dscr_transfer_t dmadscr[2];
+
+	num_descr = 2;
+
+	dmadscr[0].xfer_val = 0;
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		dma_dscr_idx = PLAYBACK_START_DMA_DESCR_CH13;
+		dmadscr[0].src = ACP_SHARED_RAM_BANK_1_ADDRESS;
+		/* dmadscr[0].dest is unused by hardware. Assigned to 0 to
+		 * remove compiler warning */
+		dmadscr[0].dest = 0;
+		dmadscr[0].xfer_val |= (ENABLE << 22) | (TO_ACP_I2S_1 << 16) |
+					(size / 2);
+	} else {
+		dma_dscr_idx = CAPTURE_START_DMA_DESCR_CH15;
+		/* dmadscr[0].src is unused by hardware. Assigned to 0 to
+		 * remove compiler warning */
+		dmadscr[0].src = 0;
+		dmadscr[0].dest = ACP_SHARED_RAM_BANK_5_ADDRESS;
+		dmadscr[0].xfer_val |= (ENABLE << 22) |
+					(FROM_ACP_I2S_1 << 16) | (size / 2);
+	}
+
+	config_dma_descriptor_in_sram(acp_mmio, dma_dscr_idx, &dmadscr[0]);
+
+	dmadscr[1].xfer_val = 0;
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		dma_dscr_idx = PLAYBACK_END_DMA_DESCR_CH13;
+		dmadscr[1].src = dmadscr[0].src + (size / 2);
+		/* dmadscr[1].dest is unused by hardware. Assigned to 0 to
+		 * remove compiler warning */
+		dmadscr[1].dest = 0;
+		dmadscr[1].xfer_val |= (ENABLE << 22) | (TO_ACP_I2S_1 << 16) |
+					(size / 2);
+	} else {
+		dma_dscr_idx = CAPTURE_END_DMA_DESCR_CH15;
+		/* dmadscr[1].src is unused by hardware. Assigned to 0 to
+		 * remove compiler warning */
+		dmadscr[1].src = 0;
+		dmadscr[1].dest = dmadscr[0].dest + (size / 2);
+		dmadscr[1].xfer_val |= (ENABLE << 22) |
+					(FROM_ACP_I2S_1 << 16) | (size / 2);
+	}
+
+	config_dma_descriptor_in_sram(acp_mmio, dma_dscr_idx, &dmadscr[1]);
+
+	/* Configure the DMA channel with the above descriptors */
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		/* starting descriptor for this channel */
+		dma_dscr_idx = PLAYBACK_START_DMA_DESCR_CH13;
+		config_acp_dma_channel(acp_mmio, ACP_TO_I2S_DMA_CH_NUM,
+					dma_dscr_idx, num_descr,
+					ACP_DMA_PRIORITY_LEVEL_NORMAL);
+	} else {
+		/* starting descriptor for this channel */
+		dma_dscr_idx = CAPTURE_START_DMA_DESCR_CH15;
+		config_acp_dma_channel(acp_mmio, I2S_TO_ACP_DMA_CH_NUM,
+					dma_dscr_idx, num_descr,
+					ACP_DMA_PRIORITY_LEVEL_NORMAL);
+	}
+
+}
+
+u16 get_dscr_idx(void __iomem *acp_mmio, int direction)
+{
+	u16 dscr_idx;
+
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		dscr_idx = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_13);
+		dscr_idx = (dscr_idx == PLAYBACK_START_DMA_DESCR_CH13) ?
+				PLAYBACK_START_DMA_DESCR_CH12 :
+				PLAYBACK_END_DMA_DESCR_CH12;
+	} else {
+		dscr_idx = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_15);
+		dscr_idx = (dscr_idx == CAPTURE_START_DMA_DESCR_CH15) ?
+				CAPTURE_END_DMA_DESCR_CH14 :
+				CAPTURE_START_DMA_DESCR_CH14;
+	}
+
+	return dscr_idx;
+}
+
+/* Create page table entries in ACP SRAM for the allocated memory */
+static void acp_pte_config(void __iomem *acp_mmio, struct page *pg,
+			   u16 num_of_pages, u32 pte_offset)
+{
+	u16 page_idx;
+	u64 addr;
+	u32 low;
+	u32 high;
+	u32 offset;
+
+	offset	= ACP_DAGB_GRP_SRBM_SRAM_BASE_OFFSET + (pte_offset * 8);
+	for (page_idx = 0; page_idx < (num_of_pages); page_idx++) {
+		/* Load the low address of page into ACP SRAM through SRBM */
+		acp_reg_write((offset + (page_idx * 8)),
+			acp_mmio, mmACP_SRBM_Targ_Idx_Addr);
+		addr = page_to_phys(pg);
+
+		low = lower_32_bits(addr);
+		high = upper_32_bits(addr);
+
+		acp_reg_write(low, acp_mmio, mmACP_SRBM_Targ_Idx_Data);
+
+		/* Load the high address of page into ACP SRAM through SRBM */
+		acp_reg_write((offset + (page_idx * 8) + 4),
+			acp_mmio, mmACP_SRBM_Targ_Idx_Addr);
+
+		/* page enable in ACP */
+		high |= BIT(31);
+		acp_reg_write(high, acp_mmio, mmACP_SRBM_Targ_Idx_Data);
+
+		/* Move to next physically contiguous page */
+		pg++;
+	}
+}
+
+/* enables/disables ACP's external interrupt */
+void acp_enable_external_interrupts(void __iomem *acp_mmio,
+					   int enable)
+{
+	u32 acp_ext_intr_enb;
+
+	acp_ext_intr_enb = enable ?
+				ACP_EXTERNAL_INTR_ENB__ACPExtIntrEnb_MASK : 0;
+
+	/* Write the Software External Interrupt Enable register */
+	acp_reg_write(acp_ext_intr_enb, acp_mmio, mmACP_EXTERNAL_INTR_ENB);
+}
+
+/* Clear (acknowledge) DMA 'Interrupt on Complete' (IOC) in ACP
+ * external interrupt status register
+ */
+void acp_ext_stat_clear_dmaioc(void __iomem *acp_mmio, u8 ch_num)
+{
+	u32 ext_intr_stat;
+	u32 chmask = BIT(ch_num);
+
+	ext_intr_stat = acp_reg_read(acp_mmio, mmACP_EXTERNAL_INTR_STAT);
+	if (ext_intr_stat & (chmask <<
+			     ACP_EXTERNAL_INTR_STAT__DMAIOCStat__SHIFT)) {
+
+		ext_intr_stat &= (chmask <<
+				  ACP_EXTERNAL_INTR_STAT__DMAIOCAck__SHIFT);
+		acp_reg_write(ext_intr_stat, acp_mmio,
+						mmACP_EXTERNAL_INTR_STAT);
+	}
+}
+
+/* Check whether ACP DMA interrupt (IOC) is generated or not */
+u32 acp_get_intr_flag(void __iomem *acp_mmio)
+{
+	u32 ext_intr_status;
+	u32 intr_gen;
+
+	ext_intr_status = acp_reg_read(acp_mmio, mmACP_EXTERNAL_INTR_STAT);
+	intr_gen = (((ext_intr_status &
+		      ACP_EXTERNAL_INTR_STAT__DMAIOCStat_MASK) >>
+		     ACP_EXTERNAL_INTR_STAT__DMAIOCStat__SHIFT));
+
+	return intr_gen;
+}
+
+void config_acp_dma(void __iomem *acp_mmio,
+			   struct audio_substream_data *audio_config)
+{
+	u32 pte_offset;
+
+	if (audio_config->direction == SNDRV_PCM_STREAM_PLAYBACK)
+		pte_offset = PLAYBACK_PTE_OFFSET;
+	else
+		pte_offset = CAPTURE_PTE_OFFSET;
+
+	acp_pte_config(acp_mmio, audio_config->pg, audio_config->num_of_pages,
+			pte_offset);
+
+	/* Configure System memory <-> ACP SRAM DMA descriptors */
+	set_acp_sysmem_dma_descriptors(acp_mmio, audio_config->size,
+				       audio_config->direction, pte_offset);
+
+	/* Configure ACP SRAM <-> I2S DMA descriptors */
+	set_acp_to_i2s_dma_descriptors(acp_mmio, audio_config->size,
+					audio_config->direction);
+}
+
+/* Start a given DMA channel transfer */
+void acp_dma_start(void __iomem *acp_mmio,
+			 u16 ch_num, bool is_circular)
+{
+	u32 dma_ctrl;
+
+	/* read the dma control register */
+	dma_ctrl = acp_reg_read(acp_mmio, mmACP_DMA_CNTL_0 + ch_num);
+
+	/* Invalidate the DAGB cache */
+	acp_reg_write(ENABLE, acp_mmio, mmACP_DAGB_ATU_CTRL);
+
+	/* configure the DMA channel and start the DMA transfer
+	 * set dmachrun bit to start the transfer and enable the
+	 * interrupt on completion of the dma transfer
+	 */
+	dma_ctrl |= ACP_DMA_CNTL_0__DMAChRun_MASK;
+
+	if ((ch_num == ACP_TO_I2S_DMA_CH_NUM) ||
+	    (ch_num == ACP_TO_SYSRAM_CH_NUM) ||
+	    (ch_num == I2S_TO_ACP_DMA_CH_NUM))
+		dma_ctrl |= ACP_DMA_CNTL_0__DMAChIOCEn_MASK;
+	else
+		dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChIOCEn_MASK;
+
+	/* enable circular mode for ACP SRAM to/from I2S DMA channel */
+	if (is_circular == true)
+		dma_ctrl |= ACP_DMA_CNTL_0__Circular_DMA_En_MASK;
+	else
+		dma_ctrl &= ~ACP_DMA_CNTL_0__Circular_DMA_En_MASK;
+
+	acp_reg_write(dma_ctrl, acp_mmio, mmACP_DMA_CNTL_0 + ch_num);
+}
+
+/* Stop a given DMA channel transfer */
+void acp_dma_stop(void __iomem *acp_mmio, u8 ch_num)
+{
+	u32 dma_ctrl;
+	u32 dma_ch_sts;
+	u32 delay_time = ACP_DMA_RESET_TIME;
+
+	dma_ctrl = acp_reg_read(acp_mmio, mmACP_DMA_CNTL_0 + ch_num);
+
+	/* clear the dma control register fields before writing zero
+	 * in reset bit
+	 */
+	dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChRun_MASK;
+	dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChIOCEn_MASK;
+
+	acp_reg_write(dma_ctrl, acp_mmio, mmACP_DMA_CNTL_0 + ch_num);
+	dma_ch_sts = acp_reg_read(acp_mmio, mmACP_DMA_CH_STS);
+
+	if (dma_ch_sts & BIT(ch_num)) {
+		/* set the reset bit for this channel
+		 * to stop the dma transfer */
+		dma_ctrl |= ACP_DMA_CNTL_0__DMAChRst_MASK;
+		acp_reg_write(dma_ctrl, acp_mmio, mmACP_DMA_CNTL_0 + ch_num);
+	}
+
+	/* check the channel status bit for some time and return the status */
+	while (0 < delay_time) {
+		dma_ch_sts = acp_reg_read(acp_mmio, mmACP_DMA_CH_STS);
+		if (!(dma_ch_sts & BIT(ch_num))) {
+			/* clear the reset flag after successfully stopping
+			   the dma transfer and break from the loop */
+			dma_ctrl &= ~ACP_DMA_CNTL_0__DMAChRst_MASK;
+
+			acp_reg_write(dma_ctrl, acp_mmio, mmACP_DMA_CNTL_0
+								+ ch_num);
+			break;
+		}
+		delay_time--;
+	}
+}
+
+/* power off a tile/block within ACP */
+static void acp_suspend_tile(void __iomem *acp_mmio, int tile)
+{
+	u32 val = 0;
+	u32 timeout = 0;
+
+	if ((tile < ACP_TILE_P1) || (tile > ACP_TILE_DSP2))
+		return;
+
+	val = acp_reg_read(acp_mmio, mmACP_PGFSM_READ_REG_0 + tile);
+	val &= ACP_TILE_ON_MASK;
+
+	if (val == 0x0) {
+		val = acp_reg_read(acp_mmio, mmACP_PGFSM_RETAIN_REG);
+		val = val | (1 << tile);
+		acp_reg_write(val, acp_mmio, mmACP_PGFSM_RETAIN_REG);
+		acp_reg_write(0x500 + tile, acp_mmio, mmACP_PGFSM_CONFIG_REG);
+
+		timeout = ACP_SOFT_RESET_DONE_TIME_OUT_VALUE;
+		while (timeout--) {
+			val = acp_reg_read(acp_mmio, mmACP_PGFSM_READ_REG_0
+								+ tile);
+			val = val & ACP_TILE_ON_MASK;
+			if (val == ACP_TILE_OFF_MASK)
+				break;
+		}
+
+		val = acp_reg_read(acp_mmio, mmACP_PGFSM_RETAIN_REG);
+
+		val |= ACP_TILE_OFF_RETAIN_REG_MASK;
+		acp_reg_write(val, acp_mmio, mmACP_PGFSM_RETAIN_REG);
+	}
+}
+
+/* power on a tile/block within ACP */
+static void acp_resume_tile(void __iomem *acp_mmio, int tile)
+{
+	u32 val = 0;
+	u32 timeout = 0;
+
+	if ((tile < ACP_TILE_P1) || (tile > ACP_TILE_DSP2))
+		return;
+
+	val = acp_reg_read(acp_mmio, mmACP_PGFSM_READ_REG_0 + tile);
+	val = val & ACP_TILE_ON_MASK;
+
+	if (val != 0x0) {
+		acp_reg_write(0x600 + tile, acp_mmio, mmACP_PGFSM_CONFIG_REG);
+		timeout = ACP_SOFT_RESET_DONE_TIME_OUT_VALUE;
+		while (timeout--) {
+			val = acp_reg_read(acp_mmio, mmACP_PGFSM_READ_REG_0
+							+ tile);
+			val = val & ACP_TILE_ON_MASK;
+			if (val == 0x0)
+				break;
+		}
+		val = acp_reg_read(acp_mmio, mmACP_PGFSM_RETAIN_REG);
+		if (tile == ACP_TILE_P1)
+			val = val & (ACP_TILE_P1_MASK);
+		else if (tile == ACP_TILE_P2)
+			val = val & (ACP_TILE_P2_MASK);
+
+		acp_reg_write(val, acp_mmio, mmACP_PGFSM_RETAIN_REG);
+	}
+}
+
+/* Shutdown unused SRAM memory banks in ACP IP */
+static void acp_turnoff_sram_banks(void __iomem *acp_mmio)
+{
+	/* Bank 0 : used for DMA descriptors
+	 * Bank 1 to 4 : used for playback
+	 * Bank 5 to 8 : used for capture
+	 * Each bank is 8kB. The ALSA driver reserves at most
+	 * 16kB (max period size) * 2 (max periods) for each of
+	 * playback and capture.
+	 * Turn off all SRAM banks except above banks during playback/capture
+	 */
+	u32 val, bank;
+
+	for (bank = 9; bank < 32; bank++) {
+		val = acp_reg_read(acp_mmio, mmACP_MEM_SHUT_DOWN_REQ_LO);
+		if (!(val & (1 << bank))) {
+			val |= 1 << bank;
+			acp_reg_write(val, acp_mmio,
+					mmACP_MEM_SHUT_DOWN_REQ_LO);
+			/* If ACP_MEM_SHUT_DOWN_STS_LO is 0xFFFFFFFF, then
+			 * shutdown sequence is complete. */
+			do {
+				val = acp_reg_read(acp_mmio,
+						mmACP_MEM_SHUT_DOWN_STS_LO);
+			} while (val != 0xFFFFFFFF);
+		}
+	}
+
+	for (bank = 32; bank < 48; bank++) {
+		val = acp_reg_read(acp_mmio, mmACP_MEM_SHUT_DOWN_REQ_HI);
+		if (!(val & (1 << (bank - 32)))) {
+			val |= 1 << (bank - 32);
+			acp_reg_write(val, acp_mmio,
+					mmACP_MEM_SHUT_DOWN_REQ_HI);
+			/* If ACP_MEM_SHUT_DOWN_STS_HI is 0x0000FFFF, then
+			 * shutdown sequence is complete. */
+			do {
+				val = acp_reg_read(acp_mmio,
+						mmACP_MEM_SHUT_DOWN_STS_HI);
+			} while (val != 0x0000FFFF);
+		}
+	}
+}
+
+/* Initialize and bring ACP hardware to default state. */
+static void acp_init(void __iomem *acp_mmio)
+{
+	u32 val;
+	u32 timeout_value;
+
+	/* Assert Soft reset of ACP */
+	val = acp_reg_read(acp_mmio, mmACP_SOFT_RESET);
+
+	val |= ACP_SOFT_RESET__SoftResetAud_MASK;
+	acp_reg_write(val, acp_mmio, mmACP_SOFT_RESET);
+
+	timeout_value = ACP_SOFT_RESET_DONE_TIME_OUT_VALUE;
+	while (timeout_value--) {
+		val = acp_reg_read(acp_mmio, mmACP_SOFT_RESET);
+		if (ACP_SOFT_RESET__SoftResetAudDone_MASK ==
+		    (val & ACP_SOFT_RESET__SoftResetAudDone_MASK))
+			break;
+	}
+
+	/* Enable clock to ACP and wait until the clock is enabled */
+	val = acp_reg_read(acp_mmio, mmACP_CONTROL);
+	val = val | ACP_CONTROL__ClkEn_MASK;
+	acp_reg_write(val, acp_mmio, mmACP_CONTROL);
+
+	timeout_value = ACP_CLOCK_EN_TIME_OUT_VALUE;
+
+	while (timeout_value--) {
+		val = acp_reg_read(acp_mmio, mmACP_STATUS);
+		if (val & (u32) 0x1)
+			break;
+		udelay(100);
+	}
+
+	/* Deassert the SOFT RESET flags */
+	val = acp_reg_read(acp_mmio, mmACP_SOFT_RESET);
+	val &= ~ACP_SOFT_RESET__SoftResetAud_MASK;
+	acp_reg_write(val, acp_mmio, mmACP_SOFT_RESET);
+
+	/* initializing Onion Control DAGB register */
+	acp_reg_write(ONION_CNTL_DEFAULT, acp_mmio, mmACP_AXI2DAGB_ONION_CNTL);
+
+	/* initializing Garlic Control DAGB register */
+	acp_reg_write(GARLIC_CNTL_DEFAULT, acp_mmio,
+			mmACP_AXI2DAGB_GARLIC_CNTL);
+
+	acp_dma_descr_init(acp_mmio);
+
+	acp_reg_write(ACP_SRAM_BASE_ADDRESS, acp_mmio,
+			mmACP_DMA_DESC_BASE_ADDR);
+
+	/* Num of descriptors in SRAM: 0x4 means 256 descriptors (64 * 4) */
+	acp_reg_write(0x4, acp_mmio, mmACP_DMA_DESC_MAX_NUM_DSCR);
+	acp_reg_write(ACP_EXTERNAL_INTR_CNTL__DMAIOCMask_MASK,
+		acp_mmio, mmACP_EXTERNAL_INTR_CNTL);
+
+	acp_turnoff_sram_banks(acp_mmio);
+}
+
+void acp_hw_init(void __iomem *acp_mmio)
+{
+	acp_init(acp_mmio);
+
+	/* Disable DSPs which are not used */
+	acp_suspend_tile(acp_mmio, ACP_TILE_DSP0);
+	acp_suspend_tile(acp_mmio, ACP_TILE_DSP1);
+	acp_suspend_tile(acp_mmio, ACP_TILE_DSP2);
+}
+
+/* Deinitialize ACP */
+void acp_hw_deinit(void __iomem *acp_mmio)
+{
+	u32 val;
+	u32 timeout_value;
+
+	/* Assert Soft reset of ACP */
+	val = acp_reg_read(acp_mmio, mmACP_SOFT_RESET);
+
+	val |= ACP_SOFT_RESET__SoftResetAud_MASK;
+	acp_reg_write(val, acp_mmio, mmACP_SOFT_RESET);
+
+	timeout_value = ACP_SOFT_RESET_DONE_TIME_OUT_VALUE;
+	while (timeout_value--) {
+		val = acp_reg_read(acp_mmio, mmACP_SOFT_RESET);
+		if (ACP_SOFT_RESET__SoftResetAudDone_MASK ==
+		    (val & ACP_SOFT_RESET__SoftResetAudDone_MASK)) {
+			break;
+		}
+	}
+	/* Disable ACP clock */
+	val = acp_reg_read(acp_mmio, mmACP_CONTROL);
+	val &= ~ACP_CONTROL__ClkEn_MASK;
+	acp_reg_write(val, acp_mmio, mmACP_CONTROL);
+
+	timeout_value = ACP_CLOCK_EN_TIME_OUT_VALUE;
+
+	while (timeout_value--) {
+		val = acp_reg_read(acp_mmio, mmACP_STATUS);
+		if (!(val & (u32) 0x1))
+			break;
+		udelay(100);
+	}
+}
+
+/* Update DMA position in audio ring buffer at period level granularity.
+ * This will be used by ALSA PCM driver
+ */
+u32 acp_update_dma_pointer(void __iomem *acp_mmio, int direction,
+				  u32 period_size)
+{
+	u32 pos;
+	u16 dscr;
+	u32 mul;
+	u32 dma_config;
+
+	pos = 0;
+
+	if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		dscr = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_13);
+
+		mul = (dscr == PLAYBACK_START_DMA_DESCR_CH13) ? 0 : 1;
+		pos = (mul * period_size);
+
+	} else {
+		dma_config = acp_reg_read(acp_mmio, mmACP_DMA_CNTL_14);
+		if (dma_config != 0) {
+			dscr = acp_reg_read(acp_mmio, mmACP_DMA_CUR_DSCR_14);
+			mul = (dscr == CAPTURE_START_DMA_DESCR_CH14) ? 1 : 2;
+			pos = (mul * period_size);
+		}
+
+		if (pos >= (2 * period_size))
+			pos = 0;
+
+	}
+	return pos;
+}
+
+/* Wait for initial buffering to complete in HOST to SRAM DMA channel
+ * for playback use case
+ */
+void prebuffer_audio(void __iomem *acp_mmio)
+{
+	u32 dma_ch_sts;
+	u32 channel_mask = BIT(SYSRAM_TO_ACP_CH_NUM);
+
+	do {
+		/* Read the channel status to poll dma transfer completion
+		 * (System RAM to SRAM)
+		 * In this case, it will be runtime->start_threshold
+		 * (2 ALSA periods) of transfer. Rendering starts after this
+		 * threshold is met.
+		 */
+		dma_ch_sts = acp_reg_read(acp_mmio, mmACP_DMA_CH_STS);
+		udelay(20);
+	} while (dma_ch_sts & channel_mask);
+}
+
+void acp_suspend(void __iomem *acp_mmio)
+{
+	acp_suspend_tile(acp_mmio, ACP_TILE_P2);
+	acp_suspend_tile(acp_mmio, ACP_TILE_P1);
+}
+
+void acp_resume(void __iomem *acp_mmio)
+{
+	acp_resume_tile(acp_mmio, ACP_TILE_P1);
+	acp_resume_tile(acp_mmio, ACP_TILE_P2);
+
+	acp_init(acp_mmio);
+
+	/* Disable DSPs which are not going to be used */
+	acp_suspend_tile(acp_mmio, ACP_TILE_DSP0);
+	acp_suspend_tile(acp_mmio, ACP_TILE_DSP1);
+	acp_suspend_tile(acp_mmio, ACP_TILE_DSP2);
+}
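The descriptor setup functions in acp.c build xfer_val by OR-ing a flag at bit 22 (what the struct comment in acp.h suggests is the IOC enable), a transfer attribute at bit 16, and the byte count. A minimal sketch of that packing as one helper follows. The name acp_pack_xfer_val and the field widths (1-bit flag, 4-bit attribute, 16-bit count) are assumptions made for illustration; the patch itself ORs the values without masking.

```c
#include <stdint.h>

/* Bit positions follow the shifts used in the patch. */
#define XFER_IOC_SHIFT	22
#define XFER_ATTR_SHIFT	16

/* Assumed field widths, not taken from hardware documentation. */
#define XFER_IOC_MASK	0x1u
#define XFER_ATTR_MASK	0xFu
#define XFER_COUNT_MASK	0xFFFFu

/* Hypothetical helper: pack IOC flag, attribute and byte count. */
static inline uint32_t acp_pack_xfer_val(uint32_t ioc_enable,
					 uint32_t attr,
					 uint32_t nbytes)
{
	return ((ioc_enable & XFER_IOC_MASK) << XFER_IOC_SHIFT) |
	       ((attr & XFER_ATTR_MASK) << XFER_ATTR_SHIFT) |
	       (nbytes & XFER_COUNT_MASK);
}
```

With such a helper, each of the eight xfer_val assignments would reduce to a single call, for example acp_pack_xfer_val(ENABLE, TO_ACP_I2S_1, size / 2).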
diff --git a/sound/soc/amd/acp.h b/sound/soc/amd/acp.h
new file mode 100644
index 0000000..4e4417f
--- /dev/null
+++ b/sound/soc/amd/acp.h
@@ -0,0 +1,147 @@ 
+#ifndef __ACP_HW_H
+#define __ACP_HW_H
+
+#define ACP_MODE_I2S				0
+#define ACP_MODE_AZ				1
+
+#define DISABLE					0
+#define ENABLE					1
+
+#define PAGE_SIZE_4K				4096
+#define PAGE_SIZE_4K_ENABLE			0x02
+
+#define PLAYBACK_PTE_OFFSET			10
+#define CAPTURE_PTE_OFFSET			0
+
+#define GARLIC_CNTL_DEFAULT			0x00000FB4
+#define ONION_CNTL_DEFAULT			0x00000FB4
+
+#define ACP_PHYSICAL_BASE			0x14000
+
+/* Playback SRAM address (as a destination in dma descriptor) */
+#define ACP_SHARED_RAM_BANK_1_ADDRESS		0x4002000
+
+/* Capture SRAM address (as a source in dma descriptor) */
+#define ACP_SHARED_RAM_BANK_5_ADDRESS		0x400A000
+
+#define ACP_DMA_RESET_TIME			10000
+#define ACP_CLOCK_EN_TIME_OUT_VALUE		0x000000FF
+#define ACP_SOFT_RESET_DONE_TIME_OUT_VALUE	0x000000FF
+#define ACP_DMA_COMPLETE_TIME_OUT_VALUE		0x000000FF
+
+#define ACP_SRAM_BASE_ADDRESS			0x4000000
+#define ACP_DAGB_GRP_SRAM_BASE_ADDRESS		0x4001000
+#define ACP_DAGB_GRP_SRBM_SRAM_BASE_OFFSET	0x1000
+#define ACP_INTERNAL_APERTURE_WINDOW_0_ADDRESS	0x00000000
+#define ACP_INTERNAL_APERTURE_WINDOW_4_ADDRESS	0x01800000
+
+#define TO_ACP_I2S_1   0x2
+#define TO_ACP_I2S_2   0x4
+#define FROM_ACP_I2S_1 0xa
+#define FROM_ACP_I2S_2 0xb
+
+#define ACP_TILE_ON_MASK                0x03
+#define ACP_TILE_OFF_MASK               0x02
+#define ACP_TILE_ON_RETAIN_REG_MASK     0x1f
+#define ACP_TILE_OFF_RETAIN_REG_MASK    0x20
+
+#define ACP_TILE_P1_MASK                0x3e
+#define ACP_TILE_P2_MASK                0x3d
+#define ACP_TILE_DSP0_MASK              0x3b
+#define ACP_TILE_DSP1_MASK              0x37
+#define ACP_TILE_DSP2_MASK              0x2f
+
+/* Playback DMA channels */
+#define SYSRAM_TO_ACP_CH_NUM 12
+#define ACP_TO_I2S_DMA_CH_NUM 13
+
+/* Capture DMA channels */
+#define ACP_TO_SYSRAM_CH_NUM 14
+#define I2S_TO_ACP_DMA_CH_NUM 15
+
+#define PLAYBACK_START_DMA_DESCR_CH12 0
+#define PLAYBACK_END_DMA_DESCR_CH12 1
+
+#define PLAYBACK_START_DMA_DESCR_CH13 2
+#define PLAYBACK_END_DMA_DESCR_CH13 3
+
+
+#define CAPTURE_START_DMA_DESCR_CH14 4
+#define CAPTURE_END_DMA_DESCR_CH14 5
+
+#define CAPTURE_START_DMA_DESCR_CH15 6
+#define CAPTURE_END_DMA_DESCR_CH15 7
+
+#define STATUS_SUCCESS 0
+#define STATUS_UNSUCCESSFUL -1
+
+enum acp_dma_priority_level {
+	/* 0x0 Specifies the DMA channel is given normal priority */
+	ACP_DMA_PRIORITY_LEVEL_NORMAL = 0x0,
+	/* 0x1 Specifies the DMA channel is given high priority */
+	ACP_DMA_PRIORITY_LEVEL_HIGH = 0x1,
+	ACP_DMA_PRIORITY_LEVEL_FORCESIZE = 0xFF
+};
+
+struct audio_substream_data {
+	struct page *pg;
+	unsigned int order;
+	u16 num_of_pages;
+	u16 direction;
+	uint64_t size;
+	void __iomem *acp_mmio;
+};
+
+enum {
+	ACP_TILE_P1 = 0,
+	ACP_TILE_P2,
+	ACP_TILE_DSP0,
+	ACP_TILE_DSP1,
+	ACP_TILE_DSP2,
+};
+
+enum {
+	ACP_DMA_ATTRIBUTES_SHAREDMEM_TO_DAGB_ONION = 0x0,
+	ACP_DMA_ATTRIBUTES_SHARED_MEM_TO_DAGB_GARLIC = 0x1,
+	ACP_DMA_ATTRIBUTES_DAGB_ONION_TO_SHAREDMEM = 0x8,
+	ACP_DMA_ATTRIBUTES_DAGB_GARLIC_TO_SHAREDMEM = 0x9,
+	ACP_DMA_ATTRIBUTES_FORCE_SIZE = 0xF
+};
+
+typedef struct acp_dma_dscr_transfer {
+	/* Specifies the source memory location for the DMA data transfer. */
+	u32 src;
+	/* Specifies the destination memory location to where the data will
+	   be transferred.
+	 */
+	u32 dest;
+	/* Specifies the number of bytes to be transferred from source to
+	 * destination memory, plus the transfer direction and IOC enable.
+	 */
+	u32 xfer_val;
+	/* Reserved for future use */
+	u32 reserved;
+} acp_dma_dscr_transfer_t;
+
+extern void acp_hw_init(void __iomem *acp_mmio);
+extern void acp_hw_deinit(void __iomem *acp_mmio);
+extern void config_acp_dma_channel(void __iomem *acp_mmio, u8 ch_num,
+				   u16 dscr_strt_idx, u16 num_dscrs,
+				   enum acp_dma_priority_level priority_level);
+extern void config_acp_dma(void __iomem *acp_mmio,
+			   struct audio_substream_data *audio_config);
+extern void acp_dma_start(void __iomem *acp_mmio,
+			 u16 ch_num, bool is_circular);
+extern void acp_dma_stop(void __iomem *acp_mmio, u8 ch_num);
+extern u32 acp_update_dma_pointer(void __iomem *acp_mmio, int direction,
+				  u32 period_size);
+extern void prebuffer_audio(void __iomem *acp_mmio);
+extern void acp_suspend(void __iomem *acp_mmio);
+extern void acp_resume(void __iomem *acp_mmio);
+extern void acp_enable_external_interrupts(void __iomem *acp_mmio,
+					   int enable);
+extern u32 acp_get_intr_flag(void __iomem *acp_mmio);
+extern u16 get_dscr_idx(void __iomem *acp_mmio, int direction);
+extern void acp_ext_stat_clear_dmaioc(void __iomem *acp_mmio, u8 ch_num);
+
+#endif /*__ACP_HW_H */
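Two SRBM offsets in acp.c follow directly from definitions in acp.h: a descriptor's offset is its index times the 16-byte descriptor size, and a page table entry sits at ACP_DAGB_GRP_SRBM_SRAM_BASE_OFFSET plus 8 bytes per entry. A minimal sketch of that arithmetic follows. The helper names dscr_sram_offset and pte_srbm_offset and the local struct name are hypothetical; the constant is copied from acp.h.

```c
#include <stdint.h>

/* Copied from acp.h in this patch. */
#define ACP_DAGB_GRP_SRBM_SRAM_BASE_OFFSET	0x1000u

/* Same layout as acp_dma_dscr_transfer_t: four 32-bit words. */
struct dscr_layout {
	uint32_t src;
	uint32_t dest;
	uint32_t xfer_val;
	uint32_t reserved;
};

/* Hypothetical helper: byte offset of descriptor 'idx' in SRAM. */
static inline uint32_t dscr_sram_offset(uint16_t idx)
{
	return (uint32_t)idx * (uint32_t)sizeof(struct dscr_layout);
}

/* Hypothetical helper: SRBM offset of the low word of a page table
 * entry; the high word is written at this offset plus 4.
 */
static inline uint32_t pte_srbm_offset(uint32_t pte_offset,
				       uint16_t page_idx)
{
	return ACP_DAGB_GRP_SRBM_SRAM_BASE_OFFSET +
	       (pte_offset + page_idx) * 8u;
}
```

For example, with PLAYBACK_PTE_OFFSET = 10 the first playback entry lands at 0x1050, and descriptor index 7 (CAPTURE_END_DMA_DESCR_CH15) starts at byte 112.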