diff mbox

mmc: mmci: Support non-power-of-two block sizes for ux500v2 variant

Message ID 1350050522-8852-1-git-send-email-ulf.hansson@stericsson.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ulf Hansson Oct. 12, 2012, 2:02 p.m. UTC
From: Ulf Hansson <ulf.hansson@linaro.org>

For the ux500v2 variant non power of two block sizes are supported.
This will make it possible to decrease data overhead for SDIO
transfers. Although we need to put some constraints to the alignment
of the buffers when enabling this feature.

Buffers must be 4 bytes aligned due to restrictions that the PL18x
FIFO accesses must be done in a 4 byte aligned manner. Moreover we
need to enable DMA_REQCTL for SDIO to support write of non 32 bytes
aligned sg element lengths. In PIO mode any buffer length can be
handled as long as the buffer address is 4 byte aligned.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Per Forlin <per.forlin@stericsson.com>
---
 drivers/mmc/host/mmci.c |   56 +++++++++++++++++++++++++++++++++++++++++------
 drivers/mmc/host/mmci.h |    7 ++++++
 2 files changed, 56 insertions(+), 7 deletions(-)

Comments

Linus Walleij Oct. 12, 2012, 9:22 p.m. UTC | #1
On Fri, Oct 12, 2012 at 4:02 PM, Ulf Hansson <ulf.hansson@stericsson.com> wrote:

> From: Ulf Hansson <ulf.hansson@linaro.org>
>
> For the ux500v2 variant non power of two block sizes are supported.
> This will make it possible to decrease data overhead for SDIO
> transfers. Although we need to put some constraints to the alignment
> of the buffers when enabling this feature.
>
> Buffers must be 4 bytes aligned due to restrictions that the PL18x
> FIFO accesses must be done in a 4 byte aligned manner. Moreover we
> need to enable DMA_REQCTL for SDIO to support write of non 32 bytes
> aligned sg element lengths. In PIO mode any buffer length can be
> handled as long as the buffer address is 4 byte aligned.
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Per Forlin <per.forlin@stericsson.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

I finally understand how this works now.

Bonus for comments like this:

> +/*
> + * DMA request control is required for write
> + * if transfer size is not 32 byte aligned.
> + * DMA request control is also needed if the total
> + * transfer size is 32 byte aligned but any of the
> + * sg element lengths are not aligned with 32 byte.
> + */
>  #define MCI_ST_DPSM_DMAREQCTL  (1 << 12)

Which make you understand what is actually happening.

Yours,
Linus Walleij
Johan Rudholm Oct. 15, 2012, 10:24 a.m. UTC | #2
2012/10/12 Ulf Hansson <ulf.hansson@stericsson.com>:
> From: Ulf Hansson <ulf.hansson@linaro.org>
>
> For the ux500v2 variant non power of two block sizes are supported.
> This will make it possible to decrease data overhead for SDIO
> transfers. Although we need to put some constraints to the alignment
> of the buffers when enabling this feature.
>
> Buffers must be 4 bytes aligned due to restrictions that the PL18x
> FIFO accesses must be done in a 4 byte aligned manner. Moreover we
> need to enable DMA_REQCTL for SDIO to support write of non 32 bytes
> aligned sg element lengths. In PIO mode any buffer length can be
> handled as long as the buffer address is 4 byte aligned.
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Per Forlin <per.forlin@stericsson.com>

Acked-By: Johan Rudholm <johan.rudholm@stericsson.com>
Russell King - ARM Linux Nov. 21, 2012, 3:38 p.m. UTC | #3
On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
>  /*
> + * Validate mmc prerequisites
> + */
> +static int mmci_validate_data(struct mmci_host *host,
> +			      struct mmc_data *data)
> +{
> +	if (!data)
> +		return 0;
> +
> +	if (!host->variant->non_power_of_2_blksize &&
> +	    !is_power_of_2(data->blksz)) {
> +		dev_err(mmc_dev(host->mmc),
> +			"unsupported block size (%d bytes)\n", data->blksz);
> +		return -EINVAL;
> +	}
> +
> +	if (data->sg->offset & 3) {
> +		dev_err(mmc_dev(host->mmc),
> +			"unsupported alginment (0x%x)\n", data->sg->offset);
> +		return -EINVAL;
> +	}

Why?  What's the reasoning behind this suddenly introduced restriction?
readsl()/writesl() copes just fine with non-aligned pointers.  It may be
that your DMA engine can not, but that's no business interfering with
non-DMA transfers, and no reason to fail such transfers.

If your DMA engine can't do that then its your DMA engine code which
should refuse to prepare the transfer.

Yes, that means problems with the way things are ordered - or it needs a
proper API where DMA engine can export these kinds of properties.
Per Forlin Nov. 21, 2012, 4:13 p.m. UTC | #4
On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
>>  /*
>> + * Validate mmc prerequisites
>> + */
>> +static int mmci_validate_data(struct mmci_host *host,
>> +                           struct mmc_data *data)
>> +{
>> +     if (!data)
>> +             return 0;
>> +
>> +     if (!host->variant->non_power_of_2_blksize &&
>> +         !is_power_of_2(data->blksz)) {
>> +             dev_err(mmc_dev(host->mmc),
>> +                     "unsupported block size (%d bytes)\n", data->blksz);
>> +             return -EINVAL;
>> +     }
>> +
>> +     if (data->sg->offset & 3) {
>> +             dev_err(mmc_dev(host->mmc),
>> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
>> +             return -EINVAL;
>> +     }
>
> Why?  What's the reasoning behind this suddenly introduced restriction?
> readsl()/writesl() copes just fine with non-aligned pointers.  It may be
> that your DMA engine can not, but that's no business interfering with
> non-DMA transfers, and no reason to fail such transfers.
>
> If your DMA engine can't do that then its your DMA engine code which
> should refuse to prepare the transfer.
>
> Yes, that means problems with the way things are ordered - or it needs a
> proper API where DMA engine can export these kinds of properties.
The alignment constraint is related to PIO, sg_miter and that FIFO
access must be done with 4 bytes.

For a 8k buffer sg miter may return 3 buffer
1. 7 bytes
2. 4096
3. 4089

DMA can handle this because it will treat this a one buffer being 8 k.
PIO will do three transfer due to sg_miter (7, 4096, 4089).
One could change the driver to not use sg_miter and just access the 8k
buffer directly to avoid the issue.

BR
Per



> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Russell King - ARM Linux Nov. 21, 2012, 4:50 p.m. UTC | #5
On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
> >>  /*
> >> + * Validate mmc prerequisites
> >> + */
> >> +static int mmci_validate_data(struct mmci_host *host,
> >> +                           struct mmc_data *data)
> >> +{
> >> +     if (!data)
> >> +             return 0;
> >> +
> >> +     if (!host->variant->non_power_of_2_blksize &&
> >> +         !is_power_of_2(data->blksz)) {
> >> +             dev_err(mmc_dev(host->mmc),
> >> +                     "unsupported block size (%d bytes)\n", data->blksz);
> >> +             return -EINVAL;
> >> +     }
> >> +
> >> +     if (data->sg->offset & 3) {
> >> +             dev_err(mmc_dev(host->mmc),
> >> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
> >> +             return -EINVAL;
> >> +     }
> >
> > Why?  What's the reasoning behind this suddenly introduced restriction?
> > readsl()/writesl() copes just fine with non-aligned pointers.  It may be
> > that your DMA engine can not, but that's no business interfering with
> > non-DMA transfers, and no reason to fail such transfers.
> >
> > If your DMA engine can't do that then its your DMA engine code which
> > should refuse to prepare the transfer.
> >
> > Yes, that means problems with the way things are ordered - or it needs a
> > proper API where DMA engine can export these kinds of properties.
> The alignment constraint is related to PIO, sg_miter and that FIFO
> access must be done with 4 bytes.

Total claptrap.  No it isn't.

- sg_miter just deals with bytes, and number of bytes transferred; there
  is no word assumptions in that code.  Indeed many ATA disks transfer
  by half-word accesses so such a restriction would be insane.

- the FIFO access itself needs to be 32-bit words, so readsl or writesl
  (or their io* equivalents must be used).

- but - and this is the killer item to your argument as I said above -
  readsl and writesl _can_ take misaligned pointers and cope with them
  fine.

The actual alignment of the buffer address is totally irrelevant here.

What isn't irrelevant is the _number_ of bytes to be transferred, but
that's something totally different and completely unrelated from
data->sg->offset.
Ulf Hansson Nov. 22, 2012, 1:43 p.m. UTC | #6
On 21 November 2012 17:50, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
>> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
>> >>  /*
>> >> + * Validate mmc prerequisites
>> >> + */
>> >> +static int mmci_validate_data(struct mmci_host *host,
>> >> +                           struct mmc_data *data)
>> >> +{
>> >> +     if (!data)
>> >> +             return 0;
>> >> +
>> >> +     if (!host->variant->non_power_of_2_blksize &&
>> >> +         !is_power_of_2(data->blksz)) {
>> >> +             dev_err(mmc_dev(host->mmc),
>> >> +                     "unsupported block size (%d bytes)\n", data->blksz);
>> >> +             return -EINVAL;
>> >> +     }
>> >> +
>> >> +     if (data->sg->offset & 3) {
>> >> +             dev_err(mmc_dev(host->mmc),
>> >> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
>> >> +             return -EINVAL;
>> >> +     }
>> >
>> > Why?  What's the reasoning behind this suddenly introduced restriction?
>> > readsl()/writesl() copes just fine with non-aligned pointers.  It may be
>> > that your DMA engine can not, but that's no business interfering with
>> > non-DMA transfers, and no reason to fail such transfers.
>> >
>> > If your DMA engine can't do that then its your DMA engine code which
>> > should refuse to prepare the transfer.
>> >
>> > Yes, that means problems with the way things are ordered - or it needs a
>> > proper API where DMA engine can export these kinds of properties.
>> The alignment constraint is related to PIO, sg_miter and that FIFO
>> access must be done with 4 bytes.
>
> Total claptrap.  No it isn't.
>
> - sg_miter just deals with bytes, and number of bytes transferred; there
>   is no word assumptions in that code.  Indeed many ATA disks transfer
>   by half-word accesses so such a restriction would be insane.
>
> - the FIFO access itself needs to be 32-bit words, so readsl or writesl
>   (or their io* equivalents must be used).
>
> - but - and this is the killer item to your argument as I said above -
>   readsl and writesl _can_ take misaligned pointers and cope with them
>   fine.
>
> The actual alignment of the buffer address is totally irrelevant here.
>
> What isn't irrelevant is the _number_ of bytes to be transferred, but
> that's something totally different and completely unrelated from
> data->sg->offset.

We totally agree with the above, but we still think it is worth to
elaborate a bit on why we think the constraints on the buffer address
could be a way forward.
We have also been considering to verify the length of the buffers but
felt in the end if was getting somewhat too complicated.
Let's look at a concrete example.

The mmci_pio_read|write functions are being provided with sg element
buffers, which may have any length and any address alignment, as you
stated.

This example is a read:
-sg-list has 2 elements.
-sg-element [0], is 3 bytes long, and the address is not aligned to 4 bytes.
-sg-element [1], is 1 bytes long and the address is aligned to 4 bytes.

pio_read will start by reading one word (4 bytes) from the FIFO and
fill the sg-element [0] with 3 of those 4 bytes. Then it will return
that 3 bytes has been read.
The upper pio_irq loop still think there is 1 byte more to read but
will never be able to read it. Instead the DATAEND irq will be
triggered and the mmc request will be "successfully" ended.

The above problem, can be fixed by reading data to a local allocated
buffer instead of directly to the sg-element buffer. Thus an extra
memcpy will be needed. Our concern is that it will be messy when
solving the corner cases and thus affecting the "hot path" too much
for pio_read.

Instead we decided to figure out a way to prevent us from needing to
take care of such pio_read situations completely. Since sg-element
buffers can be considered to be consecutive in memory and by adding
the 4 bytes alignment constraint of the buffer address, we believe
this should do it. The idea is that it will then only be the _last_
read to the FIFO which might be done as "unaligned" due to that the
_length_ of data does not have to be 4 bytes aligned. Such read is
already handled properly by pio_read.

Do note, that the problem is present for pio_write as well. Here we
could end up in writing "corrupt" data to the card or even worse,
accessing data outside a mapped page casing page faults.
This is again possible to be solved inside the pio_write function, but
we fare that it will be messy and break the hot path.

Sorry if this was too fussy, somewhat hard to explain without a
"whiteboard". :-)

Kind regards
Ulf Hansson
Russell King - ARM Linux Nov. 22, 2012, 2:50 p.m. UTC | #7
On Thu, Nov 22, 2012 at 02:43:10PM +0100, Ulf Hansson wrote:
> The mmci_pio_read|write functions are being provided with sg element
> buffers, which may have any length and any address alignment, as you
> stated.
> 
> This example is a read:
> -sg-list has 2 elements.
> -sg-element [0], is 3 bytes long, and the address is not aligned to 4 bytes.
> -sg-element [1], is 1 bytes long and the address is aligned to 4 bytes.

Notice here you're talking about the _length_ of the buffer, not its
alignment.  So you're talking about something entirely different from
what I'm saying.

The alignment of the buffer has precisely NOTHING to do with what you're
talking about above.

If the first sg-element is three bytes long and the second is one byte
long, it makes no difference what so ever whether it's aligned to 4 bytes
or not.  Think about it.  You're going to have to read three bytes from
the first sg element whatever.  You can't read four bytes from it and
hope that the additional byte in the second sg-element was following it.

So actually, if you have sg-element[0].length=3 and sg-lement[1].length=1
you can't deal with that case if you're going to be loading 4 bytes
_irrespective_ of the alignment of sg-element[0]'s buffer.

See?  The memory buffer location has absolutely nothing to do with this
issue.

> pio_read will start by reading one word (4 bytes) from the FIFO and
> fill the sg-element [0] with 3 of those 4 bytes. Then it will return
> that 3 bytes has been read.
> The upper pio_irq loop still think there is 1 byte more to read but
> will never be able to read it. Instead the DATAEND irq will be
> triggered and the mmc request will be "successfully" ended.
> 
> The above problem, can be fixed by reading data to a local allocated
> buffer instead of directly to the sg-element buffer. Thus an extra
> memcpy will be needed. Our concern is that it will be messy when
> solving the corner cases and thus affecting the "hot path" too much
> for pio_read.
> 
> Instead we decided to figure out a way to prevent us from needing to
> take care of such pio_read situations completely. Since sg-element
> buffers can be considered to be consecutive in memory and by adding
> the 4 bytes alignment constraint of the buffer address, we believe
> this should do it. The idea is that it will then only be the _last_
> read to the FIFO which might be done as "unaligned" due to that the
> _length_ of data does not have to be 4 bytes aligned. Such read is
> already handled properly by pio_read.

... and thereby penalise all cases where (eg) you're transferring 16
bytes but are misaligned?

No, I think you've got this all wrong.
Per Forlin Nov. 22, 2012, 5:28 p.m. UTC | #8
On Wed, Nov 21, 2012 at 5:50 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
>> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
>> >>  /*
>> >> + * Validate mmc prerequisites
>> >> + */
>> >> +static int mmci_validate_data(struct mmci_host *host,
>> >> +                           struct mmc_data *data)
>> >> +{
>> >> +     if (!data)
>> >> +             return 0;
>> >> +
>> >> +     if (!host->variant->non_power_of_2_blksize &&
>> >> +         !is_power_of_2(data->blksz)) {
>> >> +             dev_err(mmc_dev(host->mmc),
>> >> +                     "unsupported block size (%d bytes)\n", data->blksz);
>> >> +             return -EINVAL;
>> >> +     }
>> >> +
>> >> +     if (data->sg->offset & 3) {
>> >> +             dev_err(mmc_dev(host->mmc),
>> >> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
>> >> +             return -EINVAL;
>> >> +     }
>> >
>> > Why?  What's the reasoning behind this suddenly introduced restriction?
>> > readsl()/writesl() copes just fine with non-aligned pointers.  It may be
>> > that your DMA engine can not, but that's no business interfering with
>> > non-DMA transfers, and no reason to fail such transfers.
>> >
>> > If your DMA engine can't do that then its your DMA engine code which
>> > should refuse to prepare the transfer.
>> >
>> > Yes, that means problems with the way things are ordered - or it needs a
>> > proper API where DMA engine can export these kinds of properties.
>> The alignment constraint is related to PIO, sg_miter and that FIFO
>> access must be done with 4 bytes.
>
> Total claptrap.  No it isn't.
>
> - sg_miter just deals with bytes, and number of bytes transferred; there
>   is no word assumptions in that code.  Indeed many ATA disks transfer
>   by half-word accesses so such a restriction would be insane.
>
> - the FIFO access itself needs to be 32-bit words, so readsl or writesl
>   (or their io* equivalents must be used).
>
> - but - and this is the killer item to your argument as I said above -
>   readsl and writesl _can_ take misaligned pointers and cope with them
>   fine.
>
> The actual alignment of the buffer address is totally irrelevant here.
>
> What isn't irrelevant is the _number_ of bytes to be transferred, but
> that's something totally different and completely unrelated from
> data->sg->offset.
Let's try again :)

Keep in mind that the mmc -block layer is aligned so from that aspect
everything is fine.
SDIO may have any length or alignment but sg-len is always 1.

There is just one sg-element and one continues buffer.

sg-miter splits the continues buffer in chunks that may not be aligned
with 4 byte size. It depends on the start address alignment of the
buffer.

Is it more clear now?
BR
Per
Russell King - ARM Linux Nov. 22, 2012, 5:37 p.m. UTC | #9
On Thu, Nov 22, 2012 at 06:28:30PM +0100, Per Forlin wrote:
> On Wed, Nov 21, 2012 at 5:50 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
> >> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >> > On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
> >> >>  /*
> >> >> + * Validate mmc prerequisites
> >> >> + */
> >> >> +static int mmci_validate_data(struct mmci_host *host,
> >> >> +                           struct mmc_data *data)
> >> >> +{
> >> >> +     if (!data)
> >> >> +             return 0;
> >> >> +
> >> >> +     if (!host->variant->non_power_of_2_blksize &&
> >> >> +         !is_power_of_2(data->blksz)) {
> >> >> +             dev_err(mmc_dev(host->mmc),
> >> >> +                     "unsupported block size (%d bytes)\n", data->blksz);
> >> >> +             return -EINVAL;
> >> >> +     }
> >> >> +
> >> >> +     if (data->sg->offset & 3) {
> >> >> +             dev_err(mmc_dev(host->mmc),
> >> >> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
> >> >> +             return -EINVAL;
> >> >> +     }
> >> >
> >> > Why?  What's the reasoning behind this suddenly introduced restriction?
> >> > readsl()/writesl() copes just fine with non-aligned pointers.  It may be
> >> > that your DMA engine can not, but that's no business interfering with
> >> > non-DMA transfers, and no reason to fail such transfers.
> >> >
> >> > If your DMA engine can't do that then its your DMA engine code which
> >> > should refuse to prepare the transfer.
> >> >
> >> > Yes, that means problems with the way things are ordered - or it needs a
> >> > proper API where DMA engine can export these kinds of properties.
> >> The alignment constraint is related to PIO, sg_miter and that FIFO
> >> access must be done with 4 bytes.
> >
> > Total claptrap.  No it isn't.
> >
> > - sg_miter just deals with bytes, and number of bytes transferred; there
> >   is no word assumptions in that code.  Indeed many ATA disks transfer
> >   by half-word accesses so such a restriction would be insane.
> >
> > - the FIFO access itself needs to be 32-bit words, so readsl or writesl
> >   (or their io* equivalents must be used).
> >
> > - but - and this is the killer item to your argument as I said above -
> >   readsl and writesl _can_ take misaligned pointers and cope with them
> >   fine.
> >
> > The actual alignment of the buffer address is totally irrelevant here.
> >
> > What isn't irrelevant is the _number_ of bytes to be transferred, but
> > that's something totally different and completely unrelated from
> > data->sg->offset.
> Let's try again :)
> 
> Keep in mind that the mmc -block layer is aligned so from that aspect
> everything is fine.
> SDIO may have any length or alignment but sg-len is always 1.
> 
> There is just one sg-element and one continues buffer.
> 
> sg-miter splits the continues buffer in chunks that may not be aligned
> with 4 byte size. It depends on the start address alignment of the
> buffer.
> 
> Is it more clear now?

Is this more clear: you may be passed a single buffer which is misaligned.
We cope with that just fine.  There is *no* reason to reject that transfer
because readsl/writesl cope just fine with it.
Per Forlin Nov. 26, 2012, 10:20 a.m. UTC | #10
On 11/22/2012 06:37 PM, Russell King - ARM Linux wrote:
> On Thu, Nov 22, 2012 at 06:28:30PM +0100, Per Forlin wrote:
>> On Wed, Nov 21, 2012 at 5:50 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
>>>> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
>>>> <linux@arm.linux.org.uk> wrote:
>>>>> On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
>>>>>>  /*
>>>>>> + * Validate mmc prerequisites
>>>>>> + */
>>>>>> +static int mmci_validate_data(struct mmci_host *host,
>>>>>> +                           struct mmc_data *data)
>>>>>> +{
>>>>>> +     if (!data)
>>>>>> +             return 0;
>>>>>> +
>>>>>> +     if (!host->variant->non_power_of_2_blksize &&
>>>>>> +         !is_power_of_2(data->blksz)) {
>>>>>> +             dev_err(mmc_dev(host->mmc),
>>>>>> +                     "unsupported block size (%d bytes)\n", data->blksz);
>>>>>> +             return -EINVAL;
>>>>>> +     }
>>>>>> +
>>>>>> +     if (data->sg->offset & 3) {
>>>>>> +             dev_err(mmc_dev(host->mmc),
>>>>>> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
>>>>>> +             return -EINVAL;
>>>>>> +     }
>>>>>
>>>>> Why?  What's the reasoning behind this suddenly introduced restriction?
>>>>> readsl()/writesl() copes just fine with non-aligned pointers.  It may be
>>>>> that your DMA engine can not, but that's no business interfering with
>>>>> non-DMA transfers, and no reason to fail such transfers.
>>>>>
>>>>> If your DMA engine can't do that then its your DMA engine code which
>>>>> should refuse to prepare the transfer.
>>>>>
>>>>> Yes, that means problems with the way things are ordered - or it needs a
>>>>> proper API where DMA engine can export these kinds of properties.
>>>> The alignment constraint is related to PIO, sg_miter and that FIFO
>>>> access must be done with 4 bytes.
>>>
>>> Total claptrap.  No it isn't.
>>>
>>> - sg_miter just deals with bytes, and number of bytes transferred; there
>>>   is no word assumptions in that code.  Indeed many ATA disks transfer
>>>   by half-word accesses so such a restriction would be insane.
>>>
>>> - the FIFO access itself needs to be 32-bit words, so readsl or writesl
>>>   (or their io* equivalents must be used).
>>>
>>> - but - and this is the killer item to your argument as I said above -
>>>   readsl and writesl _can_ take misaligned pointers and cope with them
>>>   fine.
>>>
>>> The actual alignment of the buffer address is totally irrelevant here.
>>>
>>> What isn't irrelevant is the _number_ of bytes to be transferred, but
>>> that's something totally different and completely unrelated from
>>> data->sg->offset.
>> Let's try again :)
>>
>> Keep in mind that the mmc -block layer is aligned so from that aspect
>> everything is fine.
>> SDIO may have any length or alignment but sg-len is always 1.
>>
>> There is just one sg-element and one continues buffer.
>>
>> sg-miter splits the continues buffer in chunks that may not be aligned
>> with 4 byte size. It depends on the start address alignment of the
>> buffer.
>>
>> Is it more clear now?
> 
> Is this more clear: you may be passed a single buffer which is misaligned.
> We cope with that just fine.  There is *no* reason to reject that transfer
> because readsl/writesl cope just fine with it.
> 
The MMCI driver doesn't support alignment smaller than 4 bytes (it may result in data corruption).
There are two options
1. Either one should fix the driver to support it. Currently the driver only supports miss-alignment of the last sg-miter buffer.
2. Or be kind to inform the user that the alignment is not supported.

BR
Per
Russell King - ARM Linux Nov. 26, 2012, 10:27 a.m. UTC | #11
On Mon, Nov 26, 2012 at 11:20:32AM +0100, Per Förlin wrote:
> On 11/22/2012 06:37 PM, Russell King - ARM Linux wrote:
> > On Thu, Nov 22, 2012 at 06:28:30PM +0100, Per Forlin wrote:
> >> On Wed, Nov 21, 2012 at 5:50 PM, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >>> On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
> >>>> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
> >>>> <linux@arm.linux.org.uk> wrote:
> >>>>> On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
> >>>>>>  /*
> >>>>>> + * Validate mmc prerequisites
> >>>>>> + */
> >>>>>> +static int mmci_validate_data(struct mmci_host *host,
> >>>>>> +                           struct mmc_data *data)
> >>>>>> +{
> >>>>>> +     if (!data)
> >>>>>> +             return 0;
> >>>>>> +
> >>>>>> +     if (!host->variant->non_power_of_2_blksize &&
> >>>>>> +         !is_power_of_2(data->blksz)) {
> >>>>>> +             dev_err(mmc_dev(host->mmc),
> >>>>>> +                     "unsupported block size (%d bytes)\n", data->blksz);
> >>>>>> +             return -EINVAL;
> >>>>>> +     }
> >>>>>> +
> >>>>>> +     if (data->sg->offset & 3) {
> >>>>>> +             dev_err(mmc_dev(host->mmc),
> >>>>>> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
> >>>>>> +             return -EINVAL;
> >>>>>> +     }
> >>>>>
> >>>>> Why?  What's the reasoning behind this suddenly introduced restriction?
> >>>>> readsl()/writesl() copes just fine with non-aligned pointers.  It may be
> >>>>> that your DMA engine can not, but that's no business interfering with
> >>>>> non-DMA transfers, and no reason to fail such transfers.
> >>>>>
> >>>>> If your DMA engine can't do that then its your DMA engine code which
> >>>>> should refuse to prepare the transfer.
> >>>>>
> >>>>> Yes, that means problems with the way things are ordered - or it needs a
> >>>>> proper API where DMA engine can export these kinds of properties.
> >>>> The alignment constraint is related to PIO, sg_miter and that FIFO
> >>>> access must be done with 4 bytes.
> >>>
> >>> Total claptrap.  No it isn't.
> >>>
> >>> - sg_miter just deals with bytes, and number of bytes transferred; there
> >>>   is no word assumptions in that code.  Indeed many ATA disks transfer
> >>>   by half-word accesses so such a restriction would be insane.
> >>>
> >>> - the FIFO access itself needs to be 32-bit words, so readsl or writesl
> >>>   (or their io* equivalents must be used).
> >>>
> >>> - but - and this is the killer item to your argument as I said above -
> >>>   readsl and writesl _can_ take misaligned pointers and cope with them
> >>>   fine.
> >>>
> >>> The actual alignment of the buffer address is totally irrelevant here.
> >>>
> >>> What isn't irrelevant is the _number_ of bytes to be transferred, but
> >>> that's something totally different and completely unrelated from
> >>> data->sg->offset.
> >> Let's try again :)
> >>
> >> Keep in mind that the mmc -block layer is aligned so from that aspect
> >> everything is fine.
> >> SDIO may have any length or alignment but sg-len is always 1.
> >>
> >> There is just one sg-element and one continues buffer.
> >>
> >> sg-miter splits the continues buffer in chunks that may not be aligned
> >> with 4 byte size. It depends on the start address alignment of the
> >> buffer.
> >>
> >> Is it more clear now?
> > 
> > Is this more clear: you may be passed a single buffer which is misaligned.
> > We cope with that just fine.  There is *no* reason to reject that transfer
> > because readsl/writesl cope just fine with it.
> > 
> The MMCI driver doesn't support alignment smaller than 4 bytes (it may
> result in data corruption).

Explain yourself.  That's what's lacking here.  I'm explaining why I
think you're wrong, but you're just asserting all the time that I'm
wrong without giving any real details.

> There are two options
> 1. Either one should fix the driver to support it. Currently the driver
> only supports miss-alignment of the last sg-miter buffer.
> 2. Or be kind to inform the user that the alignment is not supported.

Look, it's very very simple.

If you have a multi-sg transfer, and the pointer starts off being
misaligned, the first transfer to the end of the page _MAY_ be a
non-word aligned number of bytes.  _THAT_ is what you should be checking.
_THAT_ is what the limitation is in the driver.  _NOT_ that the pointer
is misaligned.
Per Forlin Nov. 26, 2012, 10:52 a.m. UTC | #12
On 11/26/2012 11:27 AM, Russell King - ARM Linux wrote:
> On Mon, Nov 26, 2012 at 11:20:32AM +0100, Per Förlin wrote:
>> On 11/22/2012 06:37 PM, Russell King - ARM Linux wrote:
>>> On Thu, Nov 22, 2012 at 06:28:30PM +0100, Per Forlin wrote:
>>>> On Wed, Nov 21, 2012 at 5:50 PM, Russell King - ARM Linux
>>>> <linux@arm.linux.org.uk> wrote:
>>>>> On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
>>>>>> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
>>>>>> <linux@arm.linux.org.uk> wrote:
>>>>>>> On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
>>>>>>>>  /*
>>>>>>>> + * Validate mmc prerequisites
>>>>>>>> + */
>>>>>>>> +static int mmci_validate_data(struct mmci_host *host,
>>>>>>>> +                           struct mmc_data *data)
>>>>>>>> +{
>>>>>>>> +     if (!data)
>>>>>>>> +             return 0;
>>>>>>>> +
>>>>>>>> +     if (!host->variant->non_power_of_2_blksize &&
>>>>>>>> +         !is_power_of_2(data->blksz)) {
>>>>>>>> +             dev_err(mmc_dev(host->mmc),
>>>>>>>> +                     "unsupported block size (%d bytes)\n", data->blksz);
>>>>>>>> +             return -EINVAL;
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>> +     if (data->sg->offset & 3) {
>>>>>>>> +             dev_err(mmc_dev(host->mmc),
>>>>>>>> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
>>>>>>>> +             return -EINVAL;
>>>>>>>> +     }
>>>>>>>
>>>>>>> Why?  What's the reasoning behind this suddenly introduced restriction?
>>>>>>> readsl()/writesl() copes just fine with non-aligned pointers.  It may be
>>>>>>> that your DMA engine can not, but that's no business interfering with
>>>>>>> non-DMA transfers, and no reason to fail such transfers.
>>>>>>>
>>>>>>> If your DMA engine can't do that then its your DMA engine code which
>>>>>>> should refuse to prepare the transfer.
>>>>>>>
>>>>>>> Yes, that means problems with the way things are ordered - or it needs a
>>>>>>> proper API where DMA engine can export these kinds of properties.
>>>>>> The alignment constraint is related to PIO, sg_miter and that FIFO
>>>>>> access must be done with 4 bytes.
>>>>>
>>>>> Total claptrap.  No it isn't.
>>>>>
>>>>> - sg_miter just deals with bytes, and number of bytes transferred; there
>>>>>   is no word assumptions in that code.  Indeed many ATA disks transfer
>>>>>   by half-word accesses so such a restriction would be insane.
>>>>>
>>>>> - the FIFO access itself needs to be 32-bit words, so readsl or writesl
>>>>>   (or their io* equivalents must be used).
>>>>>
>>>>> - but - and this is the killer item to your argument as I said above -
>>>>>   readsl and writesl _can_ take misaligned pointers and cope with them
>>>>>   fine.
>>>>>
>>>>> The actual alignment of the buffer address is totally irrelevant here.
>>>>>
>>>>> What isn't irrelevant is the _number_ of bytes to be transferred, but
>>>>> that's something totally different and completely unrelated from
>>>>> data->sg->offset.
>>>> Let's try again :)
>>>>
>>>> Keep in mind that the mmc -block layer is aligned so from that aspect
>>>> everything is fine.
>>>> SDIO may have any length or alignment but sg-len is always 1.
>>>>
>>>> There is just one sg-element and one continues buffer.
>>>>
>>>> sg-miter splits the continues buffer in chunks that may not be aligned
>>>> with 4 byte size. It depends on the start address alignment of the
>>>> buffer.
>>>>
>>>> Is it more clear now?
>>>
>>> Is this more clear: you may be passed a single buffer which is misaligned.
>>> We cope with that just fine.  There is *no* reason to reject that transfer
>>> because readsl/writesl cope just fine with it.
>>>
>> The MMCI driver doesn't support alignment smaller than 4 bytes (it may
>> result in data corruption).
> 
> Explain yourself.  That's what's lacking here.  I'm explaining why I
> think you're wrong, but you're just asserting all the time that I'm
> wrong without giving any real details.
> 
>> There are two options
>> 1. Either one should fix the driver to support it. Currently the driver
>> only supports miss-alignment of the last sg-miter buffer.
>> 2. Or be kind to inform the user that the alignment is not supported.
> 
> Look, it's very very simple.
> 
> If you have a multi-sg transfer, and the pointer starts off being
> misaligned, the first transfer to the end of the page _MAY_ be a
> non-word aligned number of bytes.  _THAT_ is what you should be checking.
> _THAT_ is what the limitation is in the driver.  _NOT_ that the pointer
> is misaligned.
>
There will be no multi-sg transfer that is misaligned because SDIO-fwk set the sg-length to 1. Still the transfer may be multi-PAGE with sg-length 1 which is something that mmci driver cannot handle.

The intention of "data->sg->offset & 3" is to check for misaligned data. What would you replace this check with?

Thanks
Per
Per Forlin Nov. 28, 2012, 4:55 p.m. UTC | #13
On Mon, Nov 26, 2012 at 11:52 AM, Per Förlin <per.forlin@stericsson.com> wrote:
> On 11/26/2012 11:27 AM, Russell King - ARM Linux wrote:
>> On Mon, Nov 26, 2012 at 11:20:32AM +0100, Per Förlin wrote:
>>> On 11/22/2012 06:37 PM, Russell King - ARM Linux wrote:
>>>> On Thu, Nov 22, 2012 at 06:28:30PM +0100, Per Forlin wrote:
>>>>> On Wed, Nov 21, 2012 at 5:50 PM, Russell King - ARM Linux
>>>>> <linux@arm.linux.org.uk> wrote:
>>>>>> On Wed, Nov 21, 2012 at 05:13:55PM +0100, Per Forlin wrote:
>>>>>>> On Wed, Nov 21, 2012 at 4:38 PM, Russell King - ARM Linux
>>>>>>> <linux@arm.linux.org.uk> wrote:
>>>>>>>> On Fri, Oct 12, 2012 at 04:02:02PM +0200, Ulf Hansson wrote:
>>>>>>>>>  /*
>>>>>>>>> + * Validate mmc prerequisites
>>>>>>>>> + */
>>>>>>>>> +static int mmci_validate_data(struct mmci_host *host,
>>>>>>>>> +                           struct mmc_data *data)
>>>>>>>>> +{
>>>>>>>>> +     if (!data)
>>>>>>>>> +             return 0;
>>>>>>>>> +
>>>>>>>>> +     if (!host->variant->non_power_of_2_blksize &&
>>>>>>>>> +         !is_power_of_2(data->blksz)) {
>>>>>>>>> +             dev_err(mmc_dev(host->mmc),
>>>>>>>>> +                     "unsupported block size (%d bytes)\n", data->blksz);
>>>>>>>>> +             return -EINVAL;
>>>>>>>>> +     }
>>>>>>>>> +
>>>>>>>>> +     if (data->sg->offset & 3) {
>>>>>>>>> +             dev_err(mmc_dev(host->mmc),
>>>>>>>>> +                     "unsupported alginment (0x%x)\n", data->sg->offset);
>>>>>>>>> +             return -EINVAL;
>>>>>>>>> +     }
>>>>>>>>
>>>>>>>> Why?  What's the reasoning behind this suddenly introduced restriction?
>>>>>>>> readsl()/writesl() copes just fine with non-aligned pointers.  It may be
>>>>>>>> that your DMA engine can not, but that's no business interfering with
>>>>>>>> non-DMA transfers, and no reason to fail such transfers.
>>>>>>>>
>>>>>>>> If your DMA engine can't do that then its your DMA engine code which
>>>>>>>> should refuse to prepare the transfer.
>>>>>>>>
>>>>>>>> Yes, that means problems with the way things are ordered - or it needs a
>>>>>>>> proper API where DMA engine can export these kinds of properties.
>>>>>>> The alignment constraint is related to PIO, sg_miter and that FIFO
>>>>>>> access must be done with 4 bytes.
>>>>>>
>>>>>> Total claptrap.  No it isn't.
>>>>>>
>>>>>> - sg_miter just deals with bytes, and number of bytes transferred; there
>>>>>>   is no word assumptions in that code.  Indeed many ATA disks transfer
>>>>>>   by half-word accesses so such a restriction would be insane.
>>>>>>
>>>>>> - the FIFO access itself needs to be 32-bit words, so readsl or writesl
>>>>>>   (or their io* equivalents must be used).
>>>>>>
>>>>>> - but - and this is the killer item to your argument as I said above -
>>>>>>   readsl and writesl _can_ take misaligned pointers and cope with them
>>>>>>   fine.
>>>>>>
>>>>>> The actual alignment of the buffer address is totally irrelevant here.
>>>>>>
>>>>>> What isn't irrelevant is the _number_ of bytes to be transferred, but
>>>>>> that's something totally different and completely unrelated from
>>>>>> data->sg->offset.
>>>>> Let's try again :)
>>>>>
>>>>> Keep in mind that the mmc -block layer is aligned so from that aspect
>>>>> everything is fine.
>>>>> SDIO may have any length or alignment but sg-len is always 1.
>>>>>
>>>>> There is just one sg-element and one continues buffer.
>>>>>
>>>>> sg-miter splits the continues buffer in chunks that may not be aligned
>>>>> with 4 byte size. It depends on the start address alignment of the
>>>>> buffer.
>>>>>
>>>>> Is it more clear now?
>>>>
>>>> Is this more clear: you may be passed a single buffer which is misaligned.
>>>> We cope with that just fine.  There is *no* reason to reject that transfer
>>>> because readsl/writesl cope just fine with it.
>>>>
>>> The MMCI driver doesn't support alignment smaller than 4 bytes (it may
>>> result in data corruption).
>>
>> Explain yourself.  That's what's lacking here.  I'm explaining why I
>> think you're wrong, but you're just asserting all the time that I'm
>> wrong without giving any real details.
>>
>>> There are two options
>>> 1. Either one should fix the driver to support it. Currently the driver
>>> only supports miss-alignment of the last sg-miter buffer.
>>> 2. Or be kind to inform the user that the alignment is not supported.
>>
>> Look, it's very very simple.
>>
>> If you have a multi-sg transfer, and the pointer starts off being
>> misaligned, the first transfer to the end of the page _MAY_ be a
>> non-word aligned number of bytes.  _THAT_ is what you should be checking.
>> _THAT_ is what the limitation is in the driver.  _NOT_ that the pointer
>> is misaligned.
>>
> There will be no multi-sg transfer that is misaligned because SDIO-fwk set the sg-length to 1. Still the transfer may be multi-PAGE with sg-length 1 which is something that mmci driver cannot handle.
>
> The intention of "data->sg->offset & 3" is to check for misaligned data. What would you replace this check with?
>
> Thanks
> Per
>
I have tried to work out an if-statement to check this properly. Here
is my conclusion,
This only works if sg len is 1 (in the SDIO case)

if (WRITE)
  align = sg->offset & 3
else if (READ)
  align = 0

if (sg->offset & 3 && (PAGE_SIZE - (sg->offset + align) < host->size))
   error; /* cannot be handled by mmci driver */

Is this if-statement align with your view of the alignment issue ?

BR
Per
Russell King - ARM Linux Nov. 28, 2012, 5:12 p.m. UTC | #14
On Wed, Nov 28, 2012 at 05:55:13PM +0100, Per Forlin wrote:
> I have tried to work out an if-statement to check this properly. Here
> is my conclusion,
> This only works if sg len is 1 (in the SDIO case)
> 
> if (WRITE)
>   align = sg->offset & 3
> else if (READ)
>   align = 0
> 
> if (sg->offset & 3 && (PAGE_SIZE - (sg->offset + align) < host->size))
>    error; /* cannot be handled by mmci driver */

This looks wrong.  For starters, why in the write case do you _add_ the
byte offset in the word to sg->offset ?  So, if we're offset by one byte,
it becomes two bytes.  Makes no sense.

Secondly, (PAGE_SIZE - whatever) < host->size will be mostly always true.
What that means is the check is effectively just "sg->offset & 3".

Ah, maybe that's what you're trying to do - obfuscate the code in the hope
that I won't analyse it so you can get your check past me and then remove
all the obfuscation later... but maybe I'm just paranoid. :)

> Is this if-statement align with your view of the alignment issue ?

I've no idea why you're making it soo complicated.

Let's review.  It copes fine with aligned buffers.  It copes fine with
misaligned buffers that don't cross a page boundary.  It also copes
fine with transfers that end with a non-word sized number of bytes
(which we don't need a check for).

So:

	/* We can't cope with a misaligned transfer which crosses a page */
	if (sg->offset & 3 && (sg->offset + sg->len > PAGE_SIZE ||
			       data->sg_len > 1))
		bad;

See what I've done there?  I've taken the English and converted it directly
to code.  And the comment also matches the code.

And this way we continue to allow buffers that the string-IO operations
can cope with just fine, but avoid the issue of a word crossing a page
boundary.
Ulf Hansson Nov. 29, 2012, 11:38 a.m. UTC | #15
On 28 November 2012 18:12, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Nov 28, 2012 at 05:55:13PM +0100, Per Forlin wrote:
>> I have tried to work out an if-statement to check this properly. Here
>> is my conclusion,
>> This only works if sg len is 1 (in the SDIO case)
>>
>> if (WRITE)
>>   align = sg->offset & 3
>> else if (READ)
>>   align = 0
>>
>> if (sg->offset & 3 && (PAGE_SIZE - (sg->offset + align) < host->size))
>>    error; /* cannot be handled by mmci driver */
>
> This looks wrong.  For starters, why in the write case do you _add_ the
> byte offset in the word to sg->offset ?  So, if we're offset by one byte,
> it becomes two bytes.  Makes no sense.
>
> Secondly, (PAGE_SIZE - whatever) < host->size will be mostly always true.
> What that means is the check is effectively just "sg->offset & 3".
>
> Ah, maybe that's what you're trying to do - obfuscate the code in the hope
> that I won't analyse it so you can get your check past me and then remove
> all the obfuscation later... but maybe I'm just paranoid. :)
>
>> Is this if-statement align with your view of the alignment issue ?
>
> I've no idea why you're making it soo complicated.
>
> Let's review.  It copes fine with aligned buffers.  It copes fine with
> misaligned buffers that don't cross a page boundary.  It also copes
> fine with transfers that end with a non-word sized number of bytes
> (which we don't need a check for).
>
> So:
>
>         /* We can't cope with a misaligned transfer which crosses a page */
>         if (sg->offset & 3 && (sg->offset + sg->len > PAGE_SIZE ||
>                                data->sg_len > 1))
>                 bad;

This is ok for read but not for write. If your data also resides in
any of the last 3 bytes of the PAGE, you might get page faults since
at the final writesl you will access data outside the PAGE. That is
what pseudo code was trying to accomplish as well.

You might want to solve this similarly as we did for pio_read with a
local cache buffer in pio_write. That should not be that hard to fix.
Do you think this is what we want?

>
> See what I've done there?  I've taken the English and converted it directly
> to code.  And the comment also matches the code.
>
> And this way we continue to allow buffers that the string-IO operations
> can cope with just fine, but avoid the issue of a word crossing a page
> boundary.

Some overall ideas we might want to consider, maybe as a the second step.

From mmci perspective it is good that we can cope with any constraints
at all from alignment perspective. Although, that will require a quite
extensive hacks in both pio_read and pio_write. By looking into how
other host drives has solved this, for example "dw_mci_push*"
functions gives you an idea. In the end this will mean complicated
code which will be affecting "hot path" and maybe performance. Do we
think this is ok? Is it worth a try?

Per and myself was kind of going in the opposite direction, thus keep
code simple and make sure it is optimized. On the other hand it means
putting constraints on buffer alignment. Right now there is no SDIO
API to tell SDIO clients about alignments constraints. Additionally
drivers is normally interested in using buffers that can be handled in
an optimized manner.

Kind regards
Ulf Hansson

> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Dec. 21, 2012, 10:36 a.m. UTC | #16
On 29 November 2012 12:38, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 28 November 2012 18:12, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Wed, Nov 28, 2012 at 05:55:13PM +0100, Per Forlin wrote:
>>> I have tried to work out an if-statement to check this properly. Here
>>> is my conclusion,
>>> This only works if sg len is 1 (in the SDIO case)
>>>
>>> if (WRITE)
>>>   align = sg->offset & 3
>>> else if (READ)
>>>   align = 0
>>>
>>> if (sg->offset & 3 && (PAGE_SIZE - (sg->offset + align) < host->size))
>>>    error; /* cannot be handled by mmci driver */
>>
>> This looks wrong.  For starters, why in the write case do you _add_ the
>> byte offset in the word to sg->offset ?  So, if we're offset by one byte,
>> it becomes two bytes.  Makes no sense.
>>
>> Secondly, (PAGE_SIZE - whatever) < host->size will be mostly always true.
>> What that means is the check is effectively just "sg->offset & 3".
>>
>> Ah, maybe that's what you're trying to do - obfuscate the code in the hope
>> that I won't analyse it so you can get your check past me and then remove
>> all the obfuscation later... but maybe I'm just paranoid. :)
>>
>>> Is this if-statement align with your view of the alignment issue ?
>>
>> I've no idea why you're making it soo complicated.
>>
>> Let's review.  It copes fine with aligned buffers.  It copes fine with
>> misaligned buffers that don't cross a page boundary.  It also copes
>> fine with transfers that end with a non-word sized number of bytes
>> (which we don't need a check for).
>>
>> So:
>>
>>         /* We can't cope with a misaligned transfer which crosses a page */
>>         if (sg->offset & 3 && (sg->offset + sg->len > PAGE_SIZE ||
>>                                data->sg_len > 1))
>>                 bad;
>
> This is ok for read but not for write. If your data also resides in
> any of the last 3 bytes of the PAGE, you might get page faults since
> at the final writesl you will access data outside the PAGE. That is
> what pseudo code was trying to accomplish as well.
>
> You might want to solve this similarly as we did for pio_read with a
> local cache buffer in pio_write. That should not be that hard to fix.
> Do you think this is what we want?

Just wanted to conclude on the way forward. Should we fixup pio_write
according to how pio_read has been fixed, or should we adapt the check
for misaligned buffers to what Per proposed?
What do you prefer Russell?

>
>>
>> See what I've done there?  I've taken the English and converted it directly
>> to code.  And the comment also matches the code.
>>
>> And this way we continue to allow buffers that the string-IO operations
>> can cope with just fine, but avoid the issue of a word crossing a page
>> boundary.
>
> Some overall ideas we might want to consider, maybe as a the second step.
>
> From mmci perspective it is good that we can cope with any constraints
> at all from alignment perspective. Although, that will require a quite
> extensive hacks in both pio_read and pio_write. By looking into how
> other host drives has solved this, for example "dw_mci_push*"
> functions gives you an idea. In the end this will mean complicated
> code which will be affecting "hot path" and maybe performance. Do we
> think this is ok? Is it worth a try?
>
> Per and myself was kind of going in the opposite direction, thus keep
> code simple and make sure it is optimized. On the other hand it means
> putting constraints on buffer alignment. Right now there is no SDIO
> API to tell SDIO clients about alignments constraints. Additionally
> drivers is normally interested in using buffers that can be handled in
> an optimized manner.
>
> Kind regards
> Ulf Hansson
>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Kind regards
Ulf Hansson
Russell King - ARM Linux Dec. 21, 2012, 10:39 a.m. UTC | #17
On Fri, Dec 21, 2012 at 11:36:32AM +0100, Ulf Hansson wrote:
> Just wanted to conclude on the way forward. Should we fixup pio_write
> according to how pio_read has been fixed, or should we adapt the check
> for misaligned buffers to what Per proposed?
> What do you prefer Russell?

It's really silly to have pio_read doing something that pio_write doesn't
do - they should have the same behaviour except for the differing data
flow direction.
Ulf Hansson Dec. 21, 2012, 10:43 a.m. UTC | #18
On 21 December 2012 11:39, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Dec 21, 2012 at 11:36:32AM +0100, Ulf Hansson wrote:
>> Just wanted to conclude on the way forward. Should we fixup pio_write
>> according to how pio_read has been fixed, or should we adapt the check
>> for misaligned buffers to what Per proposed?
>> What do you prefer Russell?
>
> It's really silly to have pio_read doing something that pio_write doesn't
> do - they should have the same behaviour except for the differing data
> flow direction.

I agree. Thanks for your input.

Kind regards
Ulf Hansson
diff mbox

Patch

diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index edc3e9b..ca6d128 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -48,6 +48,7 @@  static unsigned int fmax = 515633;
  * struct variant_data - MMCI variant-specific quirks
  * @clkreg: default value for MCICLOCK register
  * @clkreg_enable: enable value for MMCICLOCK register
+ * @dma_sdio_req_ctrl: enable value for DMAREQCTL register for SDIO write
  * @datalength_bits: number of bits in the MMCIDATALENGTH register
  * @fifosize: number of bytes that can be written when MMCI_TXFIFOEMPTY
  *	      is asserted (likewise for RX)
@@ -58,10 +59,12 @@  static unsigned int fmax = 515633;
  * @blksz_datactrl16: true if Block size is at b16..b30 position in datactrl register
  * @pwrreg_powerup: power up value for MMCIPOWER register
  * @signal_direction: input/out direction of bus signals can be indicated
+ * @non_power_of_2_blksize: true if block sizes can be other than power of two
  */
 struct variant_data {
 	unsigned int		clkreg;
 	unsigned int		clkreg_enable;
+	unsigned int		dma_sdio_req_ctrl;
 	unsigned int		datalength_bits;
 	unsigned int		fifosize;
 	unsigned int		fifohalfsize;
@@ -70,6 +73,7 @@  struct variant_data {
 	bool			blksz_datactrl16;
 	u32			pwrreg_powerup;
 	bool			signal_direction;
+	bool			non_power_of_2_blksize;
 };
 
 static struct variant_data variant_arm = {
@@ -112,6 +116,7 @@  static struct variant_data variant_ux500 = {
 	.fifohalfsize		= 8 * 4,
 	.clkreg			= MCI_CLK_ENABLE,
 	.clkreg_enable		= MCI_ST_UX500_HWFCEN,
+	.dma_sdio_req_ctrl	= MCI_ST_DPSM_DMAREQCTL,
 	.datalength_bits	= 24,
 	.sdio			= true,
 	.st_clkdiv		= true,
@@ -124,15 +129,42 @@  static struct variant_data variant_ux500v2 = {
 	.fifohalfsize		= 8 * 4,
 	.clkreg			= MCI_CLK_ENABLE,
 	.clkreg_enable		= MCI_ST_UX500_HWFCEN,
+	.dma_sdio_req_ctrl	= MCI_ST_DPSM_DMAREQCTL,
 	.datalength_bits	= 24,
 	.sdio			= true,
 	.st_clkdiv		= true,
 	.blksz_datactrl16	= true,
 	.pwrreg_powerup		= MCI_PWR_ON,
 	.signal_direction	= true,
+	.non_power_of_2_blksize	= true,
 };
 
 /*
+ * Validate mmc prerequisites
+ */
+static int mmci_validate_data(struct mmci_host *host,
+			      struct mmc_data *data)
+{
+	if (!data)
+		return 0;
+
+	if (!host->variant->non_power_of_2_blksize &&
+	    !is_power_of_2(data->blksz)) {
+		dev_err(mmc_dev(host->mmc),
+			"unsupported block size (%d bytes)\n", data->blksz);
+		return -EINVAL;
+	}
+
+	if (data->sg->offset & 3) {
+		dev_err(mmc_dev(host->mmc),
+			"unsupported alginment (0x%x)\n", data->sg->offset);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/*
  * This must be called with host->lock held
  */
 static void mmci_write_clkreg(struct mmci_host *host, u32 clk)
@@ -446,8 +478,12 @@  static int mmci_dma_prep_data(struct mmci_host *host, struct mmc_data *data,
 	if (!chan)
 		return -EINVAL;
 
-	/* If less than or equal to the fifo size, don't bother with DMA */
-	if (data->blksz * data->blocks <= variant->fifosize)
+	/*
+	 * If less than or equal to the fifo size, don't bother with DMA
+	 * SDIO transfers may not be 4 bytes aligned, fall back to PIO
+	 */
+	if (data->blksz * data->blocks <= variant->fifosize ||
+	    (data->blksz * data->blocks) & 3)
 		return -EINVAL;
 
 	device = chan->device;
@@ -482,6 +518,7 @@  static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 {
 	int ret;
 	struct mmc_data *data = host->data;
+	struct variant_data *variant = host->variant;
 
 	ret = mmci_dma_prep_data(host, host->data, NULL);
 	if (ret)
@@ -496,6 +533,11 @@  static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 
 	datactrl |= MCI_DPSM_DMAENABLE;
 
+	/* Some hardware versions need special flags for SDIO DMA write */
+	if (variant->sdio && host->mmc->card && mmc_card_sdio(host->mmc->card)
+	    && (data->flags & MMC_DATA_WRITE))
+		datactrl |= variant->dma_sdio_req_ctrl;
+
 	/* Trigger the DMA transfer */
 	writel(datactrl, host->base + MMCIDATACTRL);
 
@@ -540,6 +582,9 @@  static void mmci_pre_request(struct mmc_host *mmc, struct mmc_request *mrq,
 	if (!data)
 		return;
 
+	if (mmci_validate_data(host, mrq->data))
+		return;
+
 	if (data->host_cookie) {
 		data->host_cookie = 0;
 		return;
@@ -642,7 +687,6 @@  static void mmci_start_data(struct mmci_host *host, struct mmc_data *data)
 	writel(host->size, base + MMCIDATALENGTH);
 
 	blksz_bits = ffs(data->blksz) - 1;
-	BUG_ON(1 << blksz_bits != data->blksz);
 
 	if (variant->blksz_datactrl16)
 		datactrl = MCI_DPSM_ENABLE | (data->blksz << 16);
@@ -1048,10 +1092,8 @@  static void mmci_request(struct mmc_host *mmc, struct mmc_request *mrq)
 
 	WARN_ON(host->mrq != NULL);
 
-	if (mrq->data && !is_power_of_2(mrq->data->blksz)) {
-		dev_err(mmc_dev(mmc), "unsupported block size (%d bytes)\n",
-			mrq->data->blksz);
-		mrq->cmd->error = -EINVAL;
+	mrq->cmd->error = mmci_validate_data(host, mrq->data);
+	if (mrq->cmd->error) {
 		mmc_request_done(mmc, mrq);
 		return;
 	}
diff --git a/drivers/mmc/host/mmci.h b/drivers/mmc/host/mmci.h
index d437ccf..c2b3332 100644
--- a/drivers/mmc/host/mmci.h
+++ b/drivers/mmc/host/mmci.h
@@ -60,6 +60,13 @@ 
 #define MCI_ST_DPSM_RWMOD	(1 << 10)
 #define MCI_ST_DPSM_SDIOEN	(1 << 11)
 /* Control register extensions in the ST Micro Ux500 versions */
+/*
+ * DMA request control is required for write
+ * if transfer size is not 32 byte aligned.
+ * DMA request control is also needed if the total
+ * transfer size is 32 byte aligned but any of the
+ * sg element lengths are not aligned with 32 byte.
+ */
 #define MCI_ST_DPSM_DMAREQCTL	(1 << 12)
 #define MCI_ST_DPSM_DBOOTMODEEN	(1 << 13)
 #define MCI_ST_DPSM_BUSYMODE	(1 << 14)