diff mbox series

[v7,3/5] usb: host: ohci-sm501: init genalloc for local memory

Message ID 20190529102843.13174-4-laurentiu.tudor@nxp.com (mailing list archive)
State Mainlined
Commit 7d9e6f5aebe8c03f1a5199ca5c30f0c53042af23
Headers show
Series prerequisites for device reserved local mem rework | expand

Commit Message

Laurentiu Tudor May 29, 2019, 10:28 a.m. UTC
From: Laurentiu Tudor <laurentiu.tudor@nxp.com>

In preparation for dropping the existing "coherent" dma mem declaration
APIs, replace the current dma_declare_coherent_memory() based mechanism
with the creation of a genalloc pool that will be used in the OHCI
subsystem as replacement for the DMA APIs.

For context, see thread here: https://lkml.org/lkml/2019/4/22/357

Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
---
 drivers/usb/host/ohci-sm501.c | 47 +++++++++++++++--------------------
 1 file changed, 20 insertions(+), 27 deletions(-)

Comments

Guenter Roeck June 5, 2019, 9:46 p.m. UTC | #1
On Wed, May 29, 2019 at 01:28:41PM +0300, laurentiu.tudor@nxp.com wrote:
> From: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> 
> In preparation for dropping the existing "coherent" dma mem declaration
> APIs, replace the current dma_declare_coherent_memory() based mechanism
> with the creation of a genalloc pool that will be used in the OHCI
> subsystem as replacement for the DMA APIs.
> 
> For context, see thread here: https://lkml.org/lkml/2019/4/22/357
> 
> Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>

This patch results in usb access failures when trying to boot from the
sm501-usb controller on sh4 with qemu.

usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 08 7c 00 00 f0 00
print_req_error: I/O error, dev sda, sector 2172 flags 80700
usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 da 00 00 f0 00
print_req_error: I/O error, dev sda, sector 474 flags 84700
usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 02 da 00 00 f0 00
print_req_error: I/O error, dev sda, sector 730 flags 84700
usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 0b 50 00 00 f0 00
print_req_error: I/O error, dev sda, sector 2896 flags 84700

Qemu command line is:

The qemu command line is:

qemu-system-sh4 -M r2d \
        -kernel ./arch/sh/boot/zImage \
	-snapshot \
	-usb -device usb-storage,drive=d0 \
	-drive file=rootfs.ext2,if=none,id=d0,format=raw \
	-append 'panic=-1 slub_debug=FZPUA root=/dev/sda rootwait console=ttySC1,115200 earlycon=scif,mmio16,0xffe80000 noiotrap' \
	-serial null -serial stdio \
	-net nic,model=rtl8139 -net user -nographic -monitor null

Reverting this patch as well as "USB: drop HCD_LOCAL_MEM flag" fixes the
problem. Reverting "USB: drop HCD_LOCAL_MEM flag" alone does not help.

Guenter
Guenter Roeck June 11, 2019, 1:32 p.m. UTC | #2
On Wed, Jun 05, 2019 at 02:46:22PM -0700, Guenter Roeck wrote:
> On Wed, May 29, 2019 at 01:28:41PM +0300, laurentiu.tudor@nxp.com wrote:
> > From: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> > 
> > In preparation for dropping the existing "coherent" dma mem declaration
> > APIs, replace the current dma_declare_coherent_memory() based mechanism
> > with the creation of a genalloc pool that will be used in the OHCI
> > subsystem as replacement for the DMA APIs.
> > 
> > For context, see thread here: https://lkml.org/lkml/2019/4/22/357
> > 
> > Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> 
> This patch results in usb access failures when trying to boot from the
> sm501-usb controller on sh4 with qemu.
> 
> usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 08 7c 00 00 f0 00
> print_req_error: I/O error, dev sda, sector 2172 flags 80700
> usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 da 00 00 f0 00
> print_req_error: I/O error, dev sda, sector 474 flags 84700
> usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 02 da 00 00 f0 00
> print_req_error: I/O error, dev sda, sector 730 flags 84700
> usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 0b 50 00 00 f0 00
> print_req_error: I/O error, dev sda, sector 2896 flags 84700
> 
> Qemu command line is:
> 
> The qemu command line is:
> 
> qemu-system-sh4 -M r2d \
>         -kernel ./arch/sh/boot/zImage \
> 	-snapshot \
> 	-usb -device usb-storage,drive=d0 \
> 	-drive file=rootfs.ext2,if=none,id=d0,format=raw \
> 	-append 'panic=-1 slub_debug=FZPUA root=/dev/sda rootwait console=ttySC1,115200 earlycon=scif,mmio16,0xffe80000 noiotrap' \
> 	-serial null -serial stdio \
> 	-net nic,model=rtl8139 -net user -nographic -monitor null
> 
> Reverting this patch as well as "USB: drop HCD_LOCAL_MEM flag" fixes the
> problem. Reverting "USB: drop HCD_LOCAL_MEM flag" alone does not help.
> 

This problem is still seen in next-20190611.
Has anyone actually tested this code ?

Guenter
Fredrik Noring June 11, 2019, 5:26 p.m. UTC | #3
Hi Guenter,

> > This patch results in usb access failures when trying to boot from the
> > sm501-usb controller on sh4 with qemu.
> > 
> > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 08 7c 00 00 f0 00
> > print_req_error: I/O error, dev sda, sector 2172 flags 80700
> > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 da 00 00 f0 00
> > print_req_error: I/O error, dev sda, sector 474 flags 84700
> > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 02 da 00 00 f0 00
> > print_req_error: I/O error, dev sda, sector 730 flags 84700
> > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 0b 50 00 00 f0 00
> > print_req_error: I/O error, dev sda, sector 2896 flags 84700
> > 
> > Qemu command line is:
> > 
> > The qemu command line is:
> > 
> > qemu-system-sh4 -M r2d \
> >         -kernel ./arch/sh/boot/zImage \
> > 	-snapshot \
> > 	-usb -device usb-storage,drive=d0 \
> > 	-drive file=rootfs.ext2,if=none,id=d0,format=raw \
> > 	-append 'panic=-1 slub_debug=FZPUA root=/dev/sda rootwait console=ttySC1,115200 earlycon=scif,mmio16,0xffe80000 noiotrap' \
> > 	-serial null -serial stdio \
> > 	-net nic,model=rtl8139 -net user -nographic -monitor null
> > 
> > Reverting this patch as well as "USB: drop HCD_LOCAL_MEM flag" fixes the
> > problem. Reverting "USB: drop HCD_LOCAL_MEM flag" alone does not help.
> > 
> 
> This problem is still seen in next-20190611.
> Has anyone actually tested this code ?

I tested patches 1, 2 and 5 with v5.0.19. Perhaps yet another part of the
OHCI subsystem allocates memory from the wrong pool? With some luck it is
relatively easy to trace backwards from the error messages to the point
where the memory is being allocated. One way to establish this is to
sprinkle printk around if-statements. There may be 10-20 levels of calls
including one or two indirect calls via pointers. Would you be able to do
that?

Fredrik
Guenter Roeck June 11, 2019, 7:03 p.m. UTC | #4
On Tue, Jun 11, 2019 at 07:26:54PM +0200, Fredrik Noring wrote:
> Hi Guenter,
> 
> > > This patch results in usb access failures when trying to boot from the
> > > sm501-usb controller on sh4 with qemu.
> > > 
> > > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 08 7c 00 00 f0 00
> > > print_req_error: I/O error, dev sda, sector 2172 flags 80700
> > > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 da 00 00 f0 00
> > > print_req_error: I/O error, dev sda, sector 474 flags 84700
> > > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 02 da 00 00 f0 00
> > > print_req_error: I/O error, dev sda, sector 730 flags 84700
> > > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb
> > > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00
> > > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 0b 50 00 00 f0 00
> > > print_req_error: I/O error, dev sda, sector 2896 flags 84700
> > > 
> > > Qemu command line is:
> > > 
> > > The qemu command line is:
> > > 
> > > qemu-system-sh4 -M r2d \
> > >         -kernel ./arch/sh/boot/zImage \
> > > 	-snapshot \
> > > 	-usb -device usb-storage,drive=d0 \
> > > 	-drive file=rootfs.ext2,if=none,id=d0,format=raw \
> > > 	-append 'panic=-1 slub_debug=FZPUA root=/dev/sda rootwait console=ttySC1,115200 earlycon=scif,mmio16,0xffe80000 noiotrap' \
> > > 	-serial null -serial stdio \
> > > 	-net nic,model=rtl8139 -net user -nographic -monitor null
> > > 
> > > Reverting this patch as well as "USB: drop HCD_LOCAL_MEM flag" fixes the
> > > problem. Reverting "USB: drop HCD_LOCAL_MEM flag" alone does not help.
> > > 
> > 
> > This problem is still seen in next-20190611.
> > Has anyone actually tested this code ?
> 
> I tested patches 1, 2 and 5 with v5.0.19. Perhaps yet another part of the
> OHCI subsystem allocates memory from the wrong pool? With some luck it is
> relatively easy to trace backwards from the error messages to the point
> where the memory is being allocated. One way to establish this is to
> sprinkle printk around if-statements. There may be 10-20 levels of calls
> including one or two indirect calls via pointers. Would you be able to do
> that?
> 

I don't think I'll have time to do that anytime soon. Not that I know what
exactly to look for in the first place.

Can you do that debugging yourself ? All you would need is a cross-compiler
(eg from kernel.org), qemu, and a working configuration (the root file
system doesn't really matter since the code doesn't get to the point of
loading it, but you can use [1]). For the configuration file, you can use
rts7751r2dplus_defconfig with CONFIG_CMDLINE and CONFIG_CMDLINE_OVERWRITE
removed.

Thanks,
Guenter

---
[1] https://github.com/groeck/linux-build-test/blob/master/rootfs/sh/rootfs.ext2.gz
Fredrik Noring June 13, 2019, 1:40 p.m. UTC | #5
Hi Guenter,

> I don't think I'll have time to do that anytime soon. Not that I know what
> exactly to look for in the first place.

I can confirm that there is a problem with mass storage devices and these
local memory patches.

Fredrik
Guenter Roeck June 13, 2019, 1:54 p.m. UTC | #6
Hi Fredrik,

On 6/13/19 6:40 AM, Fredrik Noring wrote:
> Hi Guenter,
> 
>> I don't think I'll have time to do that anytime soon. Not that I know what
>> exactly to look for in the first place.
> 
> I can confirm that there is a problem with mass storage devices and these
> local memory patches.
> 

Thanks for the confirmation. Do you see the problem only with the
ohci-sm501 driver or also with others ?

Thanks,
Guenter
Fredrik Noring June 13, 2019, 3:34 p.m. UTC | #7
Hi Guenter,

> Thanks for the confirmation. Do you see the problem only with the
> ohci-sm501 driver or also with others ?

All are likely affected, but it depends, because I believe the problem is
that the USB subsystem runs out of memory. Please try the attached patch!

The pool assumed 4096 byte page alignment for every allocation, which is
excessive given that many requests are for 16 and 32 bytes. In the patch
below, I have turned down the order to 5, which is good enough for the ED
and TD structures of the OHCI, but not enough for the HCCA that needs 256
byte alignment. With some luck, the WARN_ON_ONCE will not trigger in your
test, though. If it does, you may try to increase the order from 5 to 8.

I have observed strange things happen when the USB subsystem runs out of
memory. The mass storage drivers often seem to busy-wait on -ENOMEM,
consuming a lot of processor resources. It would be much more efficient
to sleep waiting for memory to become available.

In your case I suspect that allocation failures are not correctly
attributed. Certain kinds of temporary freezes may also occur, as the
various devices are reset due to host memory allocation errors.

Fredrik

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -3011,7 +3011,7 @@ int usb_hcd_setup_local_mem(struct usb_hcd *hcd, phys_addr_t phys_addr,
 	int err;
 	void __iomem *local_mem;
 
-	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev, PAGE_SHIFT,
+	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev, 5,
 						  dev_to_node(hcd->self.sysdev),
 						  dev_name(hcd->self.sysdev));
 	if (IS_ERR(hcd->localmem_pool))
diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
--- a/drivers/usb/host/ohci-hcd.c
+++ b/drivers/usb/host/ohci-hcd.c
@@ -517,6 +517,7 @@ static int ohci_init (struct ohci_hcd *ohci)
 						GFP_KERNEL);
 	if (!ohci->hcca)
 		return -ENOMEM;
+	WARN_ON_ONCE(ohci->hcca_dma & 0xff);
 
 	if ((ret = ohci_mem_init (ohci)) < 0)
 		ohci_stop (hcd);
Guenter Roeck June 13, 2019, 6:05 p.m. UTC | #8
On 6/13/19 8:34 AM, Fredrik Noring wrote:
> Hi Guenter,
> 
>> Thanks for the confirmation. Do you see the problem only with the
>> ohci-sm501 driver or also with others ?
> 
> All are likely affected, but it depends, because I believe the problem is
> that the USB subsystem runs out of memory. Please try the attached patch!
> 
> The pool assumed 4096 byte page alignment for every allocation, which is
> excessive given that many requests are for 16 and 32 bytes. In the patch
> below, I have turned down the order to 5, which is good enough for the ED
> and TD structures of the OHCI, but not enough for the HCCA that needs 256
> byte alignment. With some luck, the WARN_ON_ONCE will not trigger in your
> test, though. If it does, you may try to increase the order from 5 to 8.
> 

You are right, the patch below fixes the problem. I did not get the warning
with order==5. Nevertheless, I also tested with order==8; that works as well.

Thanks a lot for tracking this down!
Guenter

> I have observed strange things happen when the USB subsystem runs out of
> memory. The mass storage drivers often seem to busy-wait on -ENOMEM,
> consuming a lot of processor resources. It would be much more efficient
> to sleep waiting for memory to become available.
> 
> In your case I suspect that allocation failures are not correctly
> attributed. Certain kinds of temporary freezes may also occur, as the
> various devices are reset due to host memory allocation errors.
>  > Fredrik
> 
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -3011,7 +3011,7 @@ int usb_hcd_setup_local_mem(struct usb_hcd *hcd, phys_addr_t phys_addr,
>   	int err;
>   	void __iomem *local_mem;
>   
> -	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev, PAGE_SHIFT,
> +	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev, 5,
>   						  dev_to_node(hcd->self.sysdev),
>   						  dev_name(hcd->self.sysdev));
>   	if (IS_ERR(hcd->localmem_pool))
> diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
> --- a/drivers/usb/host/ohci-hcd.c
> +++ b/drivers/usb/host/ohci-hcd.c
> @@ -517,6 +517,7 @@ static int ohci_init (struct ohci_hcd *ohci)
>   						GFP_KERNEL);
>   	if (!ohci->hcca)
>   		return -ENOMEM;
> +	WARN_ON_ONCE(ohci->hcca_dma & 0xff);
>   
>   	if ((ret = ohci_mem_init (ohci)) < 0)
>   		ohci_stop (hcd);
>
Fredrik Noring June 14, 2019, 2:28 p.m. UTC | #9
Hi Guenter,

> You are right, the patch below fixes the problem. I did not get the warning
> with order==5. Nevertheless, I also tested with order==8; that works as well.
> 
> Thanks a lot for tracking this down!

You are welcome, and thanks for your report!

This patch series needs some redesign, I think, because the problem you
reported would come back if one attaches two or more devices to the
system. Local memory devices are typically memory constrained and so it
has to be used efficiently. I believe there are four kinds of alignments
to consider when memory is allocated in the pool:

	- 256 bytes for the host controller communication area (HCCA);
	- 32 bytes for the general transfer descriptors (TDs);
	- 16 bytes for the endpoint descriptors (EDs);
	- buffer alignment for data.

Using the greatest common alignment for all is clearly an undesirable
regression. The TDs and EDs could have their own subpools, perhaps, as
they are abundant. There is only one instance of the HCCA.

As mentioned, the USB subsystem could be improved to properly report
allocation failures, and the logic to retry allocations could be more
efficient by avoiding polling loops.

Fredrik
Laurentiu Tudor June 18, 2019, 10:48 a.m. UTC | #10
Hello,

> -----Original Message-----
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
> Sent: Thursday, June 13, 2019 9:06 PM
> 
> On 6/13/19 8:34 AM, Fredrik Noring wrote:
> > Hi Guenter,
> >
> >> Thanks for the confirmation. Do you see the problem only with the
> >> ohci-sm501 driver or also with others ?
> >
> > All are likely affected, but it depends, because I believe the problem
> is
> > that the USB subsystem runs out of memory. Please try the attached patch!
> >
> > The pool assumed 4096 byte page alignment for every allocation, which is
> > excessive given that many requests are for 16 and 32 bytes. In the patch
> > below, I have turned down the order to 5, which is good enough for the
> ED
> > and TD structures of the OHCI, but not enough for the HCCA that needs
> 256
> > byte alignment. With some luck, the WARN_ON_ONCE will not trigger in
> your
> > test, though. If it does, you may try to increase the order from 5 to 8.
> >
> 
> You are right, the patch below fixes the problem. I did not get the
> warning
> with order==5. Nevertheless, I also tested with order==8; that works as
> well.

Sorry for the late reply, I was OOO for past week+ and many thanks for taking a look at this.
So if my understanding is correct, an order of PAGE_SHIFT is too large and leads to waste of memory and in the end to an out of memory condition. This leaves me wondering what a safe order value would be.
I'll try to look into the legacy dma coherent allocation code maybe I can gather some info on the subject.

---
Best Regards, Laurentiu

> 
> > I have observed strange things happen when the USB subsystem runs out of
> > memory. The mass storage drivers often seem to busy-wait on -ENOMEM,
> > consuming a lot of processor resources. It would be much more efficient
> > to sleep waiting for memory to become available.
> >
> > In your case I suspect that allocation failures are not correctly
> > attributed. Certain kinds of temporary freezes may also occur, as the
> > various devices are reset due to host memory allocation errors.
> >  > Fredrik
> >
> > diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> > --- a/drivers/usb/core/hcd.c
> > +++ b/drivers/usb/core/hcd.c
> > @@ -3011,7 +3011,7 @@ int usb_hcd_setup_local_mem(struct usb_hcd *hcd,
> phys_addr_t phys_addr,
> >   	int err;
> >   	void __iomem *local_mem;
> >
> > -	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev,
> PAGE_SHIFT,
> > +	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev, 5,
> >   						  dev_to_node(hcd->self.sysdev),
> >   						  dev_name(hcd->self.sysdev));
> >   	if (IS_ERR(hcd->localmem_pool))
> > diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
> > --- a/drivers/usb/host/ohci-hcd.c
> > +++ b/drivers/usb/host/ohci-hcd.c
> > @@ -517,6 +517,7 @@ static int ohci_init (struct ohci_hcd *ohci)
> >   						GFP_KERNEL);
> >   	if (!ohci->hcca)
> >   		return -ENOMEM;
> > +	WARN_ON_ONCE(ohci->hcca_dma & 0xff);
> >
> >   	if ((ret = ohci_mem_init (ohci)) < 0)
> >   		ohci_stop (hcd);
> >
Christoph Hellwig June 24, 2019, 6:35 a.m. UTC | #11
Can you send me the patch formally so that I can queue it up for the
dma-mapping tree?
Fredrik Noring June 24, 2019, 12:59 p.m. UTC | #12
Hi Christoph,

> Can you send me the patch formally so that I can queue it up for the
> dma-mapping tree?

That patch would be detrimental to local memory devices, as previously
discussed, so I would like to suggest a much better approach, as shown below,
where allocations are aligned as required but not necessarily much more than
that.

Fredrik

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -3014,7 +3014,7 @@ int usb_hcd_setup_local_mem(struct usb_hcd *hcd, phys_addr_t phys_addr,
 	int err;
 	void __iomem *local_mem;
 
-	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev, PAGE_SHIFT,
+	hcd->localmem_pool = devm_gen_pool_create(hcd->self.sysdev, 4,
 						  dev_to_node(hcd->self.sysdev),
 						  dev_name(hcd->self.sysdev));
 	if (IS_ERR(hcd->localmem_pool))
diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
--- a/drivers/usb/host/ohci-hcd.c
+++ b/drivers/usb/host/ohci-hcd.c
@@ -507,9 +507,9 @@ static int ohci_init (struct ohci_hcd *ohci)
 	ohci->prev_frame_no = IO_WATCHDOG_OFF;
 
 	if (hcd->localmem_pool)
-		ohci->hcca = gen_pool_dma_alloc(hcd->localmem_pool,
+		ohci->hcca = gen_pool_dma_alloc_align(hcd->localmem_pool,
 						sizeof(*ohci->hcca),
-						&ohci->hcca_dma);
+						&ohci->hcca_dma, 256);
 	else
 		ohci->hcca = dma_alloc_coherent(hcd->self.controller,
 						sizeof(*ohci->hcca),
diff --git a/drivers/usb/host/ohci-mem.c b/drivers/usb/host/ohci-mem.c
--- a/drivers/usb/host/ohci-mem.c
+++ b/drivers/usb/host/ohci-mem.c
@@ -94,7 +94,8 @@ td_alloc (struct ohci_hcd *hc, gfp_t mem_flags)
 	struct usb_hcd	*hcd = ohci_to_hcd(hc);
 
 	if (hcd->localmem_pool)
-		td = gen_pool_dma_zalloc(hcd->localmem_pool, sizeof(*td), &dma);
+		td = gen_pool_dma_zalloc_align(hcd->localmem_pool,
+				sizeof(*td), &dma, 32);
 	else
 		td = dma_pool_zalloc(hc->td_cache, mem_flags, &dma);
 	if (td) {
@@ -137,7 +138,8 @@ ed_alloc (struct ohci_hcd *hc, gfp_t mem_flags)
 	struct usb_hcd	*hcd = ohci_to_hcd(hc);
 
 	if (hcd->localmem_pool)
-		ed = gen_pool_dma_zalloc(hcd->localmem_pool, sizeof(*ed), &dma);
+		ed = gen_pool_dma_zalloc_align(hcd->localmem_pool,
+				sizeof(*ed), &dma, 16);
 	else
 		ed = dma_pool_zalloc(hc->ed_cache, mem_flags, &dma);
 	if (ed) {
diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
--- a/include/linux/genalloc.h
+++ b/include/linux/genalloc.h
@@ -121,7 +121,15 @@ extern unsigned long gen_pool_alloc_algo(struct gen_pool *, size_t,
 		genpool_algo_t algo, void *data);
 extern void *gen_pool_dma_alloc(struct gen_pool *pool, size_t size,
 		dma_addr_t *dma);
-void *gen_pool_dma_zalloc(struct gen_pool *pool, size_t size, dma_addr_t *dma);
+extern void *gen_pool_dma_alloc_algo(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, genpool_algo_t algo, void *data);
+extern void *gen_pool_dma_alloc_align(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, int align);
+extern void *gen_pool_dma_zalloc(struct gen_pool *pool, size_t size, dma_addr_t *dma);
+extern void *gen_pool_dma_zalloc_algo(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, genpool_algo_t algo, void *data);
+extern void *gen_pool_dma_zalloc_align(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, int align);
 extern void gen_pool_free(struct gen_pool *, unsigned long, size_t);
 extern void gen_pool_for_each_chunk(struct gen_pool *,
 	void (*)(struct gen_pool *, struct gen_pool_chunk *, void *), void *);
diff --git a/lib/genalloc.c b/lib/genalloc.c
--- a/lib/genalloc.c
+++ b/lib/genalloc.c
@@ -347,13 +347,33 @@ EXPORT_SYMBOL(gen_pool_alloc_algo);
  * Return: virtual address of the allocated memory, or %NULL on failure
  */
 void *gen_pool_dma_alloc(struct gen_pool *pool, size_t size, dma_addr_t *dma)
+{
+	return gen_pool_dma_alloc_algo(pool, size, dma, pool->algo, pool->data);
+}
+EXPORT_SYMBOL(gen_pool_dma_alloc);
+
+/**
+ * gen_pool_dma_alloc_algo - allocate special memory from the pool for DMA
+ * usage with the given pool algorithm
+ * @pool: pool to allocate from
+ * @size: number of bytes to allocate from the pool
+ * @dma: dma-view physical address return value.  Use NULL if unneeded.
+ * @algo: algorithm passed from caller
+ * @data: data passed to algorithm
+ *
+ * Allocate the requested number of bytes from the specified pool. Uses the
+ * given pool allocation function. Can not be used in NMI handler on
+ * architectures without NMI-safe cmpxchg implementation.
+ */
+void *gen_pool_dma_alloc_algo(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, genpool_algo_t algo, void *data)
 {
 	unsigned long vaddr;
 
 	if (!pool)
 		return NULL;
 
-	vaddr = gen_pool_alloc(pool, size);
+	vaddr = gen_pool_alloc_algo(pool, size, algo, data);
 	if (!vaddr)
 		return NULL;
 
@@ -362,7 +382,31 @@ void *gen_pool_dma_alloc(struct gen_pool *pool, size_t size, dma_addr_t *dma)
 
 	return (void *)vaddr;
 }
-EXPORT_SYMBOL(gen_pool_dma_alloc);
+EXPORT_SYMBOL(gen_pool_dma_alloc_algo);
+
+/**
+ * gen_pool_dma_zalloc_align - allocate special from the pool for DMA usage
+ * with the given alignment
+ * @pool: pool to allocate from
+ * @size: number of bytes to allocate from the pool
+ * @dma: dma-view physical address return value.  Use %NULL if unneeded.
+ * @align: alignment in bytes for starting address
+ *
+ * Allocate the requested number bytes from the specified pool, with the given
+ * alignment restriction. Can not be used in NMI handler on architectures
+ * without NMI-safe cmpxchg implementation.
+ *
+ * Return: virtual address of the allocated memory, or %NULL on failure
+ */
+void *gen_pool_dma_alloc_align(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, int align)
+{
+	struct genpool_data_align data = { .align = align };
+
+	return gen_pool_dma_alloc_algo(pool, size, dma,
+			gen_pool_first_fit_align, &data);
+}
+EXPORT_SYMBOL(gen_pool_dma_alloc_align);
 
 /**
  * gen_pool_dma_zalloc - allocate special zeroed memory from the pool for
@@ -380,14 +424,60 @@ EXPORT_SYMBOL(gen_pool_dma_alloc);
  */
 void *gen_pool_dma_zalloc(struct gen_pool *pool, size_t size, dma_addr_t *dma)
 {
-	void *vaddr = gen_pool_dma_alloc(pool, size, dma);
+	return gen_pool_dma_zalloc_algo(pool, size, dma, pool->algo, pool->data);
+}
+EXPORT_SYMBOL(gen_pool_dma_zalloc);
+
+/**
+ * gen_pool_dma_zalloc_algo - allocate special zeroed memory from the pool for
+ * DMA usage with the given pool algorithm
+ * @pool: pool to allocate from
+ * @size: number of bytes to allocate from the pool
+ * @dma: dma-view physical address return value.  Use %NULL if unneeded.
+ * @algo: algorithm passed from caller
+ * @data: data passed to algorithm
+ *
+ * Allocate the requested number of zeroed bytes from the specified pool. Uses
+ * the pool allocation function. Can not be used in NMI handler on
+ * architectures without NMI-safe cmpxchg implementation.
+ *
+ * Return: virtual address of the allocated zeroed memory, or %NULL on failure
+ */
+void *gen_pool_dma_zalloc_algo(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, genpool_algo_t algo, void *data)
+{
+	void *vaddr = gen_pool_dma_alloc_algo(pool, size, dma, algo, data);
 
 	if (vaddr)
 		memset(vaddr, 0, size);
 
 	return vaddr;
 }
-EXPORT_SYMBOL(gen_pool_dma_zalloc);
+EXPORT_SYMBOL(gen_pool_dma_zalloc_algo);
+
+/**
+ * gen_pool_dma_zalloc_align - allocate special zeroed memory from the pool for
+ * DMA usage with the given alignment
+ * @pool: pool to allocate from
+ * @size: number of bytes to allocate from the pool
+ * @dma: dma-view physical address return value.  Use %NULL if unneeded.
+ * @align: alignment in bytes for starting address
+ *
+ * Allocate the requested number of zeroed bytes from the specified pool,
+ * with the given alignment restriction. Can not be used in NMI handler on
+ * architectures without NMI-safe cmpxchg implementation.
+ *
+ * Return: virtual address of the allocated zeroed memory, or %NULL on failure
+ */
+void *gen_pool_dma_zalloc_align(struct gen_pool *pool, size_t size,
+		dma_addr_t *dma, int align)
+{
+	struct genpool_data_align data = { .align = align };
+
+	return gen_pool_dma_zalloc_algo(pool, size, dma,
+			gen_pool_first_fit_align, &data);
+}
+EXPORT_SYMBOL(gen_pool_dma_zalloc_align);
 
 /**
  * gen_pool_free - free allocated special memory back to the pool
Christoph Hellwig June 25, 2019, 6 a.m. UTC | #13
On Mon, Jun 24, 2019 at 02:59:16PM +0200, Fredrik Noring wrote:
> Hi Christoph,
> 
> > Can you send me the patch formally so that I can queue it up for the
> > dma-mapping tree?
> 
> That patch would be detrimental to local memory devices, as previously
> discussed, so I would like to suggest a much better approach, as shown below,
> where allocations are aligned as required but not necessarily much more than
> that.

This looks sensible to me.  Can you submit it with a proper patch
description and split into a separate patch for genalloc vs the user of
the new interface?
diff mbox series

Patch

diff --git a/drivers/usb/host/ohci-sm501.c b/drivers/usb/host/ohci-sm501.c
index c26228c25f99..b710e100aec9 100644
--- a/drivers/usb/host/ohci-sm501.c
+++ b/drivers/usb/host/ohci-sm501.c
@@ -110,40 +110,18 @@  static int ohci_hcd_sm501_drv_probe(struct platform_device *pdev)
 		goto err0;
 	}
 
-	/* The sm501 chip is equipped with local memory that may be used
-	 * by on-chip devices such as the video controller and the usb host.
-	 * This driver uses dma_declare_coherent_memory() to make sure
-	 * usb allocations with dma_alloc_coherent() allocate from
-	 * this local memory. The dma_handle returned by dma_alloc_coherent()
-	 * will be an offset starting from 0 for the first local memory byte.
-	 *
-	 * So as long as data is allocated using dma_alloc_coherent() all is
-	 * fine. This is however not always the case - buffers may be allocated
-	 * using kmalloc() - so the usb core needs to be told that it must copy
-	 * data into our local memory if the buffers happen to be placed in
-	 * regular memory. The HCD_LOCAL_MEM flag does just that.
-	 */
-
-	retval = dma_declare_coherent_memory(dev, mem->start,
-					 mem->start - mem->parent->start,
-					 resource_size(mem));
-	if (retval) {
-		dev_err(dev, "cannot declare coherent memory\n");
-		goto err1;
-	}
-
 	/* allocate, reserve and remap resources for registers */
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	if (res == NULL) {
 		dev_err(dev, "no resource definition for registers\n");
 		retval = -ENOENT;
-		goto err2;
+		goto err1;
 	}
 
 	hcd = usb_create_hcd(driver, &pdev->dev, dev_name(&pdev->dev));
 	if (!hcd) {
 		retval = -ENOMEM;
-		goto err2;
+		goto err1;
 	}
 
 	hcd->rsrc_start = res->start;
@@ -164,6 +142,24 @@  static int ohci_hcd_sm501_drv_probe(struct platform_device *pdev)
 
 	ohci_hcd_init(hcd_to_ohci(hcd));
 
+	/* The sm501 chip is equipped with local memory that may be used
+	 * by on-chip devices such as the video controller and the usb host.
+	 * This driver uses genalloc so that usb allocations with
+	 * gen_pool_dma_alloc() allocate from this local memory. The dma_handle
+	 * returned by gen_pool_dma_alloc() will be an offset starting from 0
+	 * for the first local memory byte.
+	 *
+	 * So as long as data is allocated using gen_pool_dma_alloc() all is
+	 * fine. This is however not always the case - buffers may be allocated
+	 * using kmalloc() - so the usb core needs to be told that it must copy
+	 * data into our local memory if the buffers happen to be placed in
+	 * regular memory. The HCD_LOCAL_MEM flag does just that.
+	 */
+
+	if (usb_hcd_setup_local_mem(hcd, mem->start,
+				    mem->start - mem->parent->start,
+				    resource_size(mem)) < 0)
+		goto err5;
 	retval = usb_add_hcd(hcd, irq, IRQF_SHARED);
 	if (retval)
 		goto err5;
@@ -181,8 +177,6 @@  static int ohci_hcd_sm501_drv_probe(struct platform_device *pdev)
 	release_mem_region(hcd->rsrc_start, hcd->rsrc_len);
 err3:
 	usb_put_hcd(hcd);
-err2:
-	dma_release_declared_memory(dev);
 err1:
 	release_mem_region(mem->start, resource_size(mem));
 err0:
@@ -197,7 +191,6 @@  static int ohci_hcd_sm501_drv_remove(struct platform_device *pdev)
 	usb_remove_hcd(hcd);
 	release_mem_region(hcd->rsrc_start, hcd->rsrc_len);
 	usb_put_hcd(hcd);
-	dma_release_declared_memory(&pdev->dev);
 	mem = platform_get_resource(pdev, IORESOURCE_MEM, 1);
 	if (mem)
 		release_mem_region(mem->start, resource_size(mem));