Message ID | 1386667657-26355-1-git-send-email-ynvich@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 2013-12-10 at 13:27 +0400, Sergei Ianovich wrote: > The device works in general, but it is slower than with the old DMA > and it reports sporadic failures like that This took 40 minutes on my ARM device: --->8--- # fsck.ext3 -cvf /dev/mmcblk0p1 e2fsck 1.42.5 (29-Jul-2012) Checking for bad blocks (read-only test): done /dev/mmcblk0p1: Updating bad block inode. Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information /dev/mmcblk0p1: ***** FILE SYSTEM WAS MODIFIED ***** /dev/mmcblk0p1: ***** REBOOT LINUX ***** 8649 inodes used (3.56%, out of 242880) 7 non-contiguous files (0.1%) 7 non-contiguous directories (0.1%) # of inodes with ind/dind/tind blocks: 540/3/0 91255 blocks used (9.40%, out of 970752) 0 bad blocks 1 large file 6502 regular files 860 directories 0 character device files 0 block device files 1 fifo 0 links 1277 symbolic links (1277 fast symbolic links) 0 sockets ------------ 8640 files --->8--- The same card immediately after on my x86_64 PC, in 2 minutes: --->8--- $ sudo fsck -cfv /dev/mmcblk0p1 fsck from util-linux 2.20.1 e2fsck 1.42.8 (20-Jun-2013) Checking for bad blocks (read-only test): done /dev/mmcblk0p1: Updating bad block inode. Pass 1: Checking inodes, blocks, and sizes Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes Inode 24388 was part of the orphaned inode list. FIXED. Inode 24511 was part of the orphaned inode list. FIXED. Inode 24727 was part of the orphaned inode list. FIXED. Inode 32410 was part of the orphaned inode list. FIXED. Inode 32413 was part of the orphaned inode list. FIXED. Inode 57554 was part of the orphaned inode list. FIXED. Inode 57680 was part of the orphaned inode list. FIXED. Inode 57681 was part of the orphaned inode list. FIXED. Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information /dev/mmcblk0p1: ***** FILE SYSTEM WAS MODIFIED ***** 8649 inodes used (3.56%, out of 242880) 7 non-contiguous files (0.1%) 7 non-contiguous directories (0.1%) # of inodes with ind/dind/tind blocks: 540/3/0 91255 blocks used (9.40%, out of 970752) 0 bad blocks 1 large file 6502 regular files 860 directories 0 character device files 0 block device files 1 fifo 0 links 1277 symbolic links (1277 fast symbolic links) 0 sockets ------------ 8640 files --->8--- The same command with old DMA driver took 23 minutes. There was no DMA errors. When retested the card on PC, there was: Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes Inode 57554 was part of the orphaned inode list. FIXED. The card was mounted as readonly rootfs while fsck run on ARM, so there may be some background noise. Conclusions: 1. The card itself is fine. 2. There are issues with new DMA driver. I am ready to test patches, without questions. 3. There are issues with the MMC stack in my ARM device. Please drop a pointer where to start digging.
On 12/10/2013 11:25 AM, Sergei Ianovich wrote: > On Tue, 2013-12-10 at 13:27 +0400, Sergei Ianovich wrote: >> The device works in general, but it is slower than with the old DMA >> and it reports sporadic failures like that > > This took 40 minutes on my ARM device: [...] > The same card immediately after on my x86_64 PC, in 2 minutes: [...] > The same command with old DMA driver took 23 minutes. There was no DMA > errors. I'd say something like bonnie++ is a more deterministic test for such things, considering that the first fsck might have fixed things that the 2nd didn't have to touch. Apart from that, your findings are interesting. I only tested pxamci with a Wifi card connected via SDIO, and didn't see such a big difference in performance. > Conclusions: > 1. The card itself is fine. > > 2. There are issues with new DMA driver. I am ready to test patches, > without questions. > > 3. There are issues with the MMC stack in my ARM device. Please drop a > pointer where to start digging. Alright, thanks a lot for offering help! You could start having a look at the DMA descriptors and see whether they differ in size, compared to the old implementation. Taking time stamps to measure the turnaround cycle of the transfers would be another thing that you can try. I'm not aware of any code in the mmp_pdma driver that would burn lots of CPU cycles, but that doesn't mean there aren't any. Daniel
On Tue, 2013-12-10 at 11:48 +0100, Daniel Mack wrote: > I'd say something like bonnie++ is a more deterministic test for such > things, considering that the first fsck might have fixed things that the > 2nd didn't have to touch. The old driver: Version 1.96 ------Sequential Output------ --Sequential Input- --Random- Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP pac26 300M 41 81 280 1 246 2 278 96 3056 10 99.2 23 Latency 1424ms 1054ms 510ms 95978us 86277us 3510ms Version 1.96 ------Sequential Create------ --------Random Create-------- pac26 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 354 42 16254 97 325 10 490 30 +++++ +++ 1320 37 Latency 2354ms 48094us 6165ms 2979ms 586us 1116us 1.96,1.96,pac26,1,1386675098,300M,,41,81,280,1,246,2,278,96,3056,10,99.2,23,16,,,,,354,42,16254,97,325,10,490,30,+++++,+++,1320,37,1424ms,1054ms,510ms,95978us,86277us,3510ms,2354ms,48094us,6165ms,2979ms,586us,1116us The new driver: Version 1.96 ------Sequential Output------ --Sequential Input- --Random- Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP pac26 300M 38 77 0 0 218 2 267 93 1762 6 84.0 19 Latency 2522ms 138668225 792ms 67960us 141ms 16313ms Version 1.96 ------Sequential Create------ --------Random Create-------- pac26 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 304 38 14930 95 383 13 525 31 +++++ +++ 1042 28 Latency 13481ms 49916us 7144ms 4075ms 624us 993us 1.96,1.96,pac26,1,1639,300M,,38,77,0,0,218,2,267,93,1762,6,84.0,19,16,,,,,304,38,14930,95,383,13,525,31,+++++,+++,1042,28,2522ms,138668225,792ms,67960us,141ms,16313ms,13481ms,49916us,7144ms,4075ms,624us,993us I forgot to run 'time bonnie++', so there is no hard numbers about total time. My impression is that the old and new driver tests have run for 30 and 60 minutes respectively. > Apart from that, your findings are interesting. I only tested pxamci > with a Wifi card connected via SDIO, and didn't see such a big > difference in performance. > > > Conclusions: > > 1. The card itself is fine. > > > > 2. There are issues with new DMA driver. I am ready to test patches, > > without questions. > > > > 3. There are issues with the MMC stack in my ARM device. Please drop a > > pointer where to start digging. > > Alright, thanks a lot for offering help! You could start having a look > at the DMA descriptors and see whether they differ in size, compared to > the old implementation. Taking time stamps to measure the turnaround > cycle of the transfers would be another thing that you can try. > > I'm not aware of any code in the mmp_pdma driver that would burn lots of > CPU cycles, but that doesn't mean there aren't any. I can test patches, but I am not sure I can provide any meaningful help with performance. Sorry. In point 3 I was talking about non-DMA related performance. The device has been busy running bonnie++, so I haven't had a change to verify. It looks like my device is working with 1-bit-wide bus instead of 4-bit-wide (Bad block checking ran at 19500000 bits per second which matches bus clock speed).
diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c index 7aa97eb..e5bafd2 100644 --- a/drivers/mmc/host/pxamci.c +++ b/drivers/mmc/host/pxamci.c @@ -85,7 +85,7 @@ struct pxamci_host { static inline void pxamci_init_ocr(struct pxamci_host *host) { #ifdef CONFIG_REGULATOR - host->vcc = regulator_get(mmc_dev(host->mmc), "vmmc"); + host->vcc = regulator_get_optional(mmc_dev(host->mmc), "vmmc"); if (IS_ERR(host->vcc)) host->vcc = NULL; @@ -555,6 +555,12 @@ static void pxamci_dma_irq(void *param) struct dma_tx_state state; enum dma_status status; struct dma_chan *chan; + unsigned long flags; + + spin_lock_irqsave(&host->lock, flags); + + if (!host->data) + goto out_unlock; if (host->data->flags & MMC_DATA_READ) chan = host->dma_chan_rx; @@ -563,14 +569,17 @@ static void pxamci_dma_irq(void *param) status = dmaengine_tx_status(chan, host->dma_cookie, &state); - if (likely(status == DMA_SUCCESS)) { + if (likely(status == DMA_COMPLETE)) { writel(BUF_PART_FULL, host->base + MMC_PRTBUF); } else { - pr_err("%s: DMA error on %s channel\n", mmc_hostname(host->mmc), - host->data->flags & MMC_DATA_READ ? "rx" : "tx"); + pr_err("%s: DMA error on %s channel, status %i\n", mmc_hostname(host->mmc), + host->data->flags & MMC_DATA_READ ? "rx" : "tx", status); host->data->error = -EIO; pxamci_data_done(host, 0); } + +out_unlock: + spin_unlock_irqrestore(&host->lock, flags); } static irqreturn_t pxamci_detect_irq(int irq, void *devid) @@ -618,11 +627,46 @@ static int pxamci_of_init(struct platform_device *pdev) return 0; } + +static int pxamci_of_init_dma(struct platform_device *pdev, + struct pxamci_host *host) +{ + struct device_node *np = pdev->dev.of_node; + u32 tmp; + int i; + int ret; + + i = of_property_match_string(np, "dma-names", "rx"); + if (i < 0) + return i; + + ret = of_property_read_u32_index(np, "dmas", 2 * i + 1, &tmp); + if (ret < 0) + return ret; + host->dma_drcmrrx = tmp; + + i = of_property_match_string(np, "dma-names", "tx"); + if (i < 0) + return i; + + ret = of_property_read_u32_index(np, "dmas", 2 * i + 1, &tmp); + if (ret < 0) + return ret; + host->dma_drcmrtx = tmp; + + return 0; +} #else static int pxamci_of_init(struct platform_device *pdev) { return 0; } + +static int pxamci_of_init_dma(struct platform_device *pdev, + struct pxamci_host *host) +{ + return -ENODATA; +} #endif static int pxamci_probe(struct platform_device *pdev) @@ -733,7 +777,7 @@ static int pxamci_probe(struct platform_device *pdev) platform_set_drvdata(pdev, mmc); - if (!pdev->dev.of_node) { + if (pxamci_of_init_dma(pdev, host) < 0) { dmarx = platform_get_resource(pdev, IORESOURCE_DMA, 0); if (!dmarx) { ret = -ENXIO; @@ -896,35 +940,6 @@ static int pxamci_remove(struct platform_device *pdev) return 0; } -#ifdef CONFIG_PM -static int pxamci_suspend(struct device *dev) -{ - struct mmc_host *mmc = dev_get_drvdata(dev); - int ret = 0; - - if (mmc) - ret = mmc_suspend_host(mmc); - - return ret; -} - -static int pxamci_resume(struct device *dev) -{ - struct mmc_host *mmc = dev_get_drvdata(dev); - int ret = 0; - - if (mmc) - ret = mmc_resume_host(mmc); - - return ret; -} - -static const struct dev_pm_ops pxamci_pm_ops = { - .suspend = pxamci_suspend, - .resume = pxamci_resume, -}; -#endif - static struct platform_driver pxamci_driver = { .probe = pxamci_probe, .remove = pxamci_remove, @@ -932,9 +947,6 @@ static struct platform_driver pxamci_driver = { .name = DRIVER_NAME, .owner = THIS_MODULE, .of_match_table = of_match_ptr(pxa_mmc_dt_ids), -#ifdef CONFIG_PM - .pm = &pxamci_pm_ops, -#endif }, };