Message ID | b931985f-16b2-0a90-e73b-b6c349ad7d6c@gmail.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Wed, 2017-10-11 at 22:46 +0200, Heiner Kallweit wrote: > Hi Jerome, > > we have the known issue that the latest next kernel still fails on > Odroid C2 with a 128GB eMMC card (w/o adjusting the initial tx phase). > > I found the time to dig a little deeper into it and reason is that > there are certain rx/tx phase combinations which are perfectly fine > when tuning but fail in real life. You did not mention it in this mail, but I going to guess this is done, again, in hs400 ... > Don't ask me how this can happen, I just see it happen. Oh I have pretty good idea of what is happening here, and to be absolutely it is not a regression. I think I already explained this but, before the MMC series that went in this cycle, the frequency reported by the mmc driver was twice what it really was for all DDR mode, including HS400. So, up to v4.13, when you thought you were doing hs400 @ 200MHz, the frequency was really 100Mhz With all the mmc patches applied (including the recent tuning fix) the frequency in DDR mode is set as requested by the framework. So on your target, it has increased from 100MHz to 166Mhz. This is a huge difference. You did not change the DT but the frequency did change. The symptoms you are describing, I have seen them while trying SDR104 on boards where the layout proved to be unable to cope with the clock rate involved. Some cards worked well, some did not at all. When looking at the signal amplitude, the problem was clear, the signal quality was just not good enough. The communication might be OK for a short while (while doing the tuning) but may fail while doing "real life" transfers. So why does the tuning succeed and you get errors later on ? Simply because the tuning is not the problem. The HW (and the frequency used) is. As far as I can tell, your eMMC card + your odroidc2 is simply not able to cope (reliably) with 166Mhz. The fact that you continue to have CRC errors with your "hand picked phases" is an evidence of this fact. This is not the only platform out there which can't cope with hs400 @ 166Mhz, the p200 eMMC is hs400 capable, but the platform can't cope with it. That's how it is. Lower your frequency or change the mode to hs200 (which is the setting in the upstream DT BTW) > > To deal with such cases I added some code to avoid known invalid phase > values when retuning. In addition I added some code to deal with the > following (more or less corner) case: > Let's say we have rx = 0° and tx = 0° and working is only combination > rx = 180° and tx = 180°. > Then just tuning rx only or tx only will never result in a working > combination. > > Following patch makes my system work. I just see one CRC error when > the initally tuned rx/tx phase combination fails and then the > retuning results in a stable system w/o further errors. > > I'd appreciate if you could check that this patch doesn't break any > of your systems. > > Rgds, Heiner > > --- > drivers/mmc/host/meson-gx-mmc.c | 29 ++++++++++++++++++++++------- > 1 file changed, 22 insertions(+), 7 deletions(-) > > diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c > index 85745ef1..95cb439d 100644 > --- a/drivers/mmc/host/meson-gx-mmc.c > +++ b/drivers/mmc/host/meson-gx-mmc.c > @@ -120,6 +120,7 @@ > #define SD_EMMC_DESC_CHAIN_MODE BIT(1) > > #define MUX_CLK_NUM_PARENTS 2 > +#define MAX_TUNING_ATTEMPTS 10 > > struct sd_emmc_desc { > u32 cmd_cfg; > @@ -687,15 +688,23 @@ static int meson_mmc_find_tuning_point(unsigned long > *test) > static int meson_mmc_clk_phase_tuning(struct mmc_host *mmc, u32 opcode, > struct clk *clk) > { > - int point, ret; > + int point, ret, old_phase; > DECLARE_BITMAP(test, CLK_PHASE_POINT_NUM); > > + old_phase = clk_get_phase(clk); > + if (old_phase < 0) > + return old_phase; > + > dev_dbg(mmc_dev(mmc), "%s phase/delay tunning...\n", > __clk_get_name(clk)); > bitmap_zero(test, CLK_PHASE_POINT_NUM); > > /* Explore tuning points */ > for (point = 0; point < CLK_PHASE_POINT_NUM; point++) { > + /* when retuning avoid the surrounding of where we failed */ > + if (mmc->doing_retune) > + if (abs(point * CLK_PHASE_STEP - old_phase) <= 45) > + continue; This is just a fake alteration of the working window. The point of this whole tuning is to find the middle of the window to get the most stable setting If you still get CRC error with the center of the window, the signal quality is just too low to cope with the timing set. This is patch is just a (convoluted) hack that will force the tuning algorithm to come up with new values each times, hoping to pick setting where the problem is bit masked. > clk_set_phase(clk, point * CLK_PHASE_STEP); > ret = mmc_send_tuning(mmc, opcode, NULL); > if (!ret) > @@ -704,8 +713,11 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host > *mmc, u32 opcode, > > /* Find the optimal tuning point and apply it */ > point = meson_mmc_find_tuning_point(test); > - if (point < 0) > + if (point < 0) { > + /* prevent from getting stuck if we failed */ > + clk_set_phase(clk, (old_phase + 90) % 360); > return point; /* tuning failed */ > + } > > clk_set_phase(clk, point * CLK_PHASE_STEP); > dev_dbg(mmc_dev(mmc), "success with phase: %d\n", > @@ -716,7 +728,7 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host > *mmc, u32 opcode, > static int meson_mmc_execute_tuning(struct mmc_host *mmc, u32 opcode) > { > struct meson_host *host = mmc_priv(mmc); > - int ret; > + int i, ret; > > /* > * If this is the initial tuning, try to get a sane Rx starting > @@ -729,11 +741,14 @@ static int meson_mmc_execute_tuning(struct mmc_host > *mmc, u32 opcode) > return ret; > } > > - ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk); > - if (ret) > - return ret; > + for (i = 0; i < MAX_TUNING_ATTEMPTS; i++) { > + meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk); > + ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk); > + if (!ret) > + return 0; With the hack that is going to skip 25% on the phase, that is up to: 12 phase * 2 (TX/RX) * 10 * 0.75 = 180 MMC tunes !! that's way too much and unnecessary. > + } > > - return meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk); > + return ret; > } > > static void meson_mmc_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
Am 12.10.2017 um 17:22 schrieb Jerome Brunet: > On Wed, 2017-10-11 at 22:46 +0200, Heiner Kallweit wrote: >> Hi Jerome, >> >> we have the known issue that the latest next kernel still fails on >> Odroid C2 with a 128GB eMMC card (w/o adjusting the initial tx phase). >> >> I found the time to dig a little deeper into it and reason is that >> there are certain rx/tx phase combinations which are perfectly fine >> when tuning but fail in real life. > > You did not mention it in this mail, but I going to guess this is done, again, > in hs400 ... > Sorry. I use the default, hs200 with 200MHz. >> Don't ask me how this can happen, I just see it happen. > > Oh I have pretty good idea of what is happening here, and to be absolutely it is > not a regression. > > I think I already explained this but, before the MMC series that went in this > cycle, the frequency reported by the mmc driver was twice what it really was > for all DDR mode, including HS400. > > So, up to v4.13, when you thought you were doing hs400 @ 200MHz, the frequency > was really 100Mhz > > With all the mmc patches applied (including the recent tuning fix) the frequency > in DDR mode is set as requested by the framework. So on your target, it has > increased from 100MHz to 166Mhz. This is a huge difference. You did not change > the DT but the frequency did change. > > The symptoms you are describing, I have seen them while trying SDR104 on boards > where the layout proved to be unable to cope with the clock rate involved. Some > cards worked well, some did not at all. When looking at the signal amplitude, > the problem was clear, the signal quality was just not good enough. The > communication might be OK for a short while (while doing the tuning) but may > fail while doing "real life" transfers. > > So why does the tuning succeed and you get errors later on ? Simply because the > tuning is not the problem. The HW (and the frequency used) is. > > As far as I can tell, your eMMC card + your odroidc2 is simply not able to cope > (reliably) with 166Mhz. The fact that you continue to have CRC errors with your > "hand picked phases" is an evidence of this fact. > When setting the default tx phase to 0 I have a rock-stable system w/o any CRC error (hs200 with DT 200MHz). When leaving the default tx phase at 270, after tuning I end up with rx phase 90 and tx phase 300. This combination works perfect when tuning but fails in real life. I saw in the chip spec that there are few emmc-related calibration values in SD_EMMC_ADJUST and SD_EMMC_CALOUT. However there's no documentation how to use them. Looking at the vendor driver might help, though. Did you have a closer look at these values ? > This is not the only platform out there which can't cope with hs400 @ 166Mhz, > the p200 eMMC is hs400 capable, but the platform can't cope with it. That's how > it is. > > Lower your frequency or change the mode to hs200 (which is the setting in the > upstream DT BTW) > >> >> To deal with such cases I added some code to avoid known invalid phase >> values when retuning. In addition I added some code to deal with the >> following (more or less corner) case: >> Let's say we have rx = 0° and tx = 0° and working is only combination >> rx = 180° and tx = 180°. >> Then just tuning rx only or tx only will never result in a working >> combination. >> >> Following patch makes my system work. I just see one CRC error when >> the initally tuned rx/tx phase combination fails and then the >> retuning results in a stable system w/o further errors. >> >> I'd appreciate if you could check that this patch doesn't break any >> of your systems. >> >> Rgds, Heiner >> >> --- >> drivers/mmc/host/meson-gx-mmc.c | 29 ++++++++++++++++++++++------- >> 1 file changed, 22 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c >> index 85745ef1..95cb439d 100644 >> --- a/drivers/mmc/host/meson-gx-mmc.c >> +++ b/drivers/mmc/host/meson-gx-mmc.c >> @@ -120,6 +120,7 @@ >> #define SD_EMMC_DESC_CHAIN_MODE BIT(1) >> >> #define MUX_CLK_NUM_PARENTS 2 >> +#define MAX_TUNING_ATTEMPTS 10 >> >> struct sd_emmc_desc { >> u32 cmd_cfg; >> @@ -687,15 +688,23 @@ static int meson_mmc_find_tuning_point(unsigned long >> *test) >> static int meson_mmc_clk_phase_tuning(struct mmc_host *mmc, u32 opcode, >> struct clk *clk) >> { >> - int point, ret; >> + int point, ret, old_phase; >> DECLARE_BITMAP(test, CLK_PHASE_POINT_NUM); >> >> + old_phase = clk_get_phase(clk); >> + if (old_phase < 0) >> + return old_phase; >> + >> dev_dbg(mmc_dev(mmc), "%s phase/delay tunning...\n", >> __clk_get_name(clk)); >> bitmap_zero(test, CLK_PHASE_POINT_NUM); >> >> /* Explore tuning points */ >> for (point = 0; point < CLK_PHASE_POINT_NUM; point++) { >> + /* when retuning avoid the surrounding of where we failed */ >> + if (mmc->doing_retune) >> + if (abs(point * CLK_PHASE_STEP - old_phase) <= 45) >> + continue; > > This is just a fake alteration of the working window. > The point of this whole tuning is to find the middle of the window to get the > most stable setting > > If you still get CRC error with the center of the window, the signal quality is > just too low to cope with the timing set. > > This is patch is just a (convoluted) hack that will force the tuning algorithm > to come up with new values each times, hoping to pick setting where the problem > is bit masked. > > >> clk_set_phase(clk, point * CLK_PHASE_STEP); >> ret = mmc_send_tuning(mmc, opcode, NULL); >> if (!ret) >> @@ -704,8 +713,11 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host >> *mmc, u32 opcode, >> >> /* Find the optimal tuning point and apply it */ >> point = meson_mmc_find_tuning_point(test); >> - if (point < 0) >> + if (point < 0) { >> + /* prevent from getting stuck if we failed */ >> + clk_set_phase(clk, (old_phase + 90) % 360); >> return point; /* tuning failed */ >> + } >> >> clk_set_phase(clk, point * CLK_PHASE_STEP); >> dev_dbg(mmc_dev(mmc), "success with phase: %d\n", >> @@ -716,7 +728,7 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host >> *mmc, u32 opcode, >> static int meson_mmc_execute_tuning(struct mmc_host *mmc, u32 opcode) >> { >> struct meson_host *host = mmc_priv(mmc); >> - int ret; >> + int i, ret; >> >> /* >> * If this is the initial tuning, try to get a sane Rx starting >> @@ -729,11 +741,14 @@ static int meson_mmc_execute_tuning(struct mmc_host >> *mmc, u32 opcode) >> return ret; >> } >> >> - ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk); >> - if (ret) >> - return ret; >> + for (i = 0; i < MAX_TUNING_ATTEMPTS; i++) { >> + meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk); >> + ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk); >> + if (!ret) >> + return 0; > > With the hack that is going to skip 25% on the phase, that is up to: > 12 phase * 2 (TX/RX) * 10 * 0.75 = 180 MMC tunes !! that's way too much and > unnecessary. > >> + } >> >> - return meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk); >> + return ret; >> } >> >> static void meson_mmc_set_ios(struct mmc_host *mmc, struct mmc_ios *ios) > >
On Thu, 2017-10-12 at 20:17 +0200, Heiner Kallweit wrote: > > You did not mention it in this mail, but I going to guess this is done, > > again, > > in hs400 ... > > > > Sorry. I use the default, hs200 with 200MHz. > Please state this clearly next time. This seems to change with each of your issue report, I'm done guessing. [...] > > As far as I can tell, your eMMC card + your odroidc2 is simply not able to > > cope > > (reliably) with 166Mhz. The fact that you continue to have CRC errors with > > your > > "hand picked phases" is an evidence of this fact. > > > > When setting the default tx phase to 0 I have a rock-stable system w/o any CRC > error (hs200 with DT 200MHz). Getting CRC Errors is not my definition of "rock stable" > > When leaving the default tx phase at 270, after tuning I end up with rx phase > 90 and tx phase 300. This combination works perfect when tuning but fails > in real life. > Ok, starting from Rx:0 Tx:270 then tuning gives you Tx:300 Rx:90 And what different did it gives you starting Rx:0/Tx:0 ? The result of the tuning does not depends on starting point, so I don't really understand how it would significantly change things. There is nothing vastly innovative in this driver. There is at least 2 other drivers upstream which use the same kind of algorithm to perform the tuning. I've tested it on the vast majority of amlogic supported platforms, including yours. Yes, I don't have your particular eMMC chip but I gave you my understanding of the situation based on my experience : If the tuning succeed but you then get CRC, it means that signal (quality) is not good enough. If you think it is something else, I'm happy to review your changes but you need to provide a better explanation than "Don't ask me how this can happen, I just see it happen." ... because I will ask ! > I saw in the chip spec that there are few emmc-related calibration values in > SD_EMMC_ADJUST Adjust is just another fancy way to change clock phase based on the input clock resampling. When the rate increase the precision of this method decrease (with the divisor value). I tried it, it is not useful. > and SD_EMMC_CALOUT. However there's no documentation how to > use them. Looking at the vendor driver might help, though. The vendor kernel around this calibration is very complex. It is used to set per line delays to adjust for track length differences. Given that I tested the odroidc2 with both hs200 and hs400 w/o this, I doubt it will change anything for you > Did you have a closer look at these values ?
Am 12.10.2017 um 21:34 schrieb Jerome Brunet: > On Thu, 2017-10-12 at 20:17 +0200, Heiner Kallweit wrote: >>> You did not mention it in this mail, but I going to guess this is done, >>> again, >>> in hs400 ... >>> >> >> Sorry. I use the default, hs200 with 200MHz. >> > > Please state this clearly next time. This seems to change with each of your > issue report, I'm done guessing. > > [...] > >>> As far as I can tell, your eMMC card + your odroidc2 is simply not able to >>> cope >>> (reliably) with 166Mhz. The fact that you continue to have CRC errors with >>> your >>> "hand picked phases" is an evidence of this fact. >>> >> >> When setting the default tx phase to 0 I have a rock-stable system w/o any CRC >> error (hs200 with DT 200MHz). > > Getting CRC Errors is not my definition of "rock stable" > When changing the initial tx phase to 0 I have no CRC errors at all with hs200@200. hs200@200 fails with initial tx phase = 270, just with my patch I get one CRC error and after retuning system works w/oo further CRC errors. With initial tx phase = 0 hs400@142 fails whilst hs400@125 works w/o CRC errors and gives me 177 MB/s result when doing dd if=/dev/mmcblk0 of=/dev/null iflag=direct bs=1G count=1 >> >> When leaving the default tx phase at 270, after tuning I end up with rx phase >> 90 and tx phase 300. This combination works perfect when tuning but fails >> in real life. >> > > Ok, starting from Rx:0 Tx:270 then tuning gives you Tx:300 Rx:90 > And what different did it gives you starting Rx:0/Tx:0 ? > > The result of the tuning does not depends on starting point, so I don't really > understand how it would significantly change things. > I think it depends on the tx phase starting point. Tuning does: rx phase tuning tx phase tuning rx phase tuning Result of each step depends on result of previous step. So also the initial rx phase tuning result depends on the starting point of tx phase. > There is nothing vastly innovative in this driver. There is at least 2 other > drivers upstream which use the same kind of algorithm to perform the tuning. > I've tested it on the vast majority of amlogic supported platforms, including > yours. > > Yes, I don't have your particular eMMC chip but I gave you my understanding of > the situation based on my experience : If the tuning succeed but you then get > CRC, it means that signal (quality) is not good enough. > > If you think it is something else, I'm happy to review your changes but you need > to provide a better explanation than "Don't ask me how this can happen, I just > see it happen." ... because I will ask ! > >> I saw in the chip spec that there are few emmc-related calibration values in >> SD_EMMC_ADJUST > > Adjust is just another fancy way to change clock phase based on the input clock > resampling. When the rate increase the precision of this method decrease (with > the divisor value). I tried it, it is not useful. > >> and SD_EMMC_CALOUT. However there's no documentation how to >> use them. Looking at the vendor driver might help, though. > > The vendor kernel around this calibration is very complex. It is used to set per > line delays to adjust for track length differences. Given that I tested the > odroidc2 with both hs200 and hs400 w/o this, I doubt it will change anything for > you > >> Did you have a closer look at these values ? > >
On Thu, 2017-10-12 at 21:49 +0200, Heiner Kallweit wrote: > > The result of the tuning does not depends on starting point, so I don't > > really > > understand how it would significantly change things. > > > > I think it depends on the tx phase starting point. > If the result of the tuning is not independent of the starting, instead of just telling me, it is fairly easy for you to give actual result, like I asked you: > > Ok, starting from Rx:0 Tx:270 then tuning gives you Tx:300 Rx:90 > > And what different did it gives you starting Rx:0/Tx:0 ? And if the result are indeed vastly different, let's debug and get a real explanation. > Tuning does: > rx phase tuning > tx phase tuning > rx phase tuning > > Result of each step depends on result of previous step. Again, we don't agree. > So also the initial > rx phase tuning result depends on the starting point of tx phase. The first Rx tuning is there only to get a sane starting point for the tx tuning ,as explained in the code. This is the reason why is not done when re-tuning.
Am 12.10.2017 um 22:05 schrieb Jerome Brunet: > On Thu, 2017-10-12 at 21:49 +0200, Heiner Kallweit wrote: >>> The result of the tuning does not depends on starting point, so I don't >>> really >>> understand how it would significantly change things. >>> >> >> I think it depends on the tx phase starting point. >> > > If the result of the tuning is not independent of the starting, instead of just > telling me, it is fairly easy for you to give actual result, like I asked you: > With hs200@200 and tx phase starting point 0 I get the following with no further CRC errors. [ 0.726572] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay tunning... [ 0.728664] hctosys: unable to open rtc device (rtc0) [ 0.728827] USB_OTG_PWR: disabling [ 0.728831] TFLASH_VDD: disabling [ 0.728833] TF_IO: disabling [ 0.753183] Waiting for root device /dev/mmcblk0p1... [ 0.754352] meson-gx-mmc d0074000.mmc: success with phase: 270 [ 0.758386] meson-gx-mmc d0074000.mmc: d0074000.mmc#tx phase/delay tunning... [ 0.768050] meson-gx-mmc d0074000.mmc: success with phase: 120 [ 0.771235] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay tunning... [ 0.780902] meson-gx-mmc d0074000.mmc: success with phase: 0 [ 0.784036] mmc0: new HS200 MMC card at address 0001 [ 0.789090] mmcblk0: mmc0:0001 DJNB4R 116 GiB [ 0.793328] mmcblk0boot0: mmc0:0001 DJNB4R partition 1 4.00 MiB [ 0.799198] mmcblk0boot1: mmc0:0001 DJNB4R partition 2 4.00 MiB [ 0.805048] mmcblk0rpmb: mmc0:0001 DJNB4R partition 3 4.00 MiB, chardev (249:0) [ 0.812799] mmcblk0: response CRC error sending r/w cmd command, card status 0x900 [ 0.819727] meson-gx-mmc d0074000.mmc: d0074000.mmc#tx phase/delay tunning... [ 0.827007] meson-gx-mmc d0074000.mmc: success with phase: 210 [ 0.832559] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay tunning... [ 0.839848] meson-gx-mmc d0074000.mmc: success with phase: 0 >>> Ok, starting from Rx:0 Tx:270 then tuning gives you Tx:300 Rx:90 >>> And what different did it gives you starting Rx:0/Tx:0 ? > > And if the result are indeed vastly different, let's debug and get a real > explanation. > >> Tuning does: >> rx phase tuning >> tx phase tuning >> rx phase tuning >> >> Result of each step depends on result of previous step. > > Again, we don't agree. > >> So also the initial >> rx phase tuning result depends on the starting point of tx phase. > > The first Rx tuning is there only to get a sane starting point for the tx tuning > ,as explained in the code. This is the reason why is not done when re-tuning. > > >
On Thu, 2017-10-12 at 22:29 +0200, Heiner Kallweit wrote: > Am 12.10.2017 um 22:05 schrieb Jerome Brunet: > > On Thu, 2017-10-12 at 21:49 +0200, Heiner Kallweit wrote: > > > > The result of the tuning does not depends on starting point, so I don't > > > > really > > > > understand how it would significantly change things. > > > > > > > > > > I think it depends on the tx phase starting point. > > > > > > > If the result of the tuning is not independent of the starting, instead of > > just > > telling me, it is fairly easy for you to give actual result, like I asked > > you: > > > > With hs200@200 and tx phase starting point 0 I get the following with no > further CRC errors. > > [ 0.726572] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay > tunning... > [ 0.728664] hctosys: unable to open rtc device (rtc0) > [ 0.728827] USB_OTG_PWR: disabling > [ 0.728831] TFLASH_VDD: disabling > [ 0.728833] TF_IO: disabling > [ 0.753183] Waiting for root device /dev/mmcblk0p1... > [ 0.754352] meson-gx-mmc d0074000.mmc: success with phase: 270 > [ 0.758386] meson-gx-mmc d0074000.mmc: d0074000.mmc#tx phase/delay > tunning... > [ 0.768050] meson-gx-mmc d0074000.mmc: success with phase: 120 > [ 0.771235] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay > tunning... > [ 0.780902] meson-gx-mmc d0074000.mmc: success with phase: 0 > [ 0.784036] mmc0: new HS200 MMC card at address 0001 > [ 0.789090] mmcblk0: mmc0:0001 DJNB4R 116 GiB > [ 0.793328] mmcblk0boot0: mmc0:0001 DJNB4R partition 1 4.00 MiB > [ 0.799198] mmcblk0boot1: mmc0:0001 DJNB4R partition 2 4.00 MiB > [ 0.805048] mmcblk0rpmb: mmc0:0001 DJNB4R partition 3 4.00 MiB, chardev > (249:0) > [ 0.812799] mmcblk0: response CRC error sending r/w cmd command, card > status 0x900 > [ 0.819727] meson-gx-mmc d0074000.mmc: d0074000.mmc#tx phase/delay > tunning... > [ 0.827007] meson-gx-mmc d0074000.mmc: success with phase: 210 > [ 0.832559] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay > tunning... > [ 0.839848] meson-gx-mmc d0074000.mmc: success with phase: 0 > > > > > > Ok, starting from Rx:0 Tx:270 then tuning gives you Tx:300 Rx:90 > > > > And what different did it gives you starting Rx:0/Tx:0 ? > > > > And if the result are indeed vastly different, let's debug and get a real > > explanation. > > > > > Tuning does: > > > rx phase tuning > > > tx phase tuning > > > rx phase tuning > > > > > > Result of each step depends on result of previous step. > > > > Again, we don't agree. > > > > > So also the initial > > > rx phase tuning result depends on the starting point of tx phase. > > > > The first Rx tuning is there only to get a sane starting point for the tx > > tuning > > ,as explained in the code. This is the reason why is not done when re- > > tuning. > > Heiner, This is the same story again and again ! Getting clear status is just really painful. It's like half the message is ignored each time. I can clearly see that, this result comes from your hack, and this hack makes no sense. I don't care about this result. This is not what I asked. What I asked is: 1) From linux-next what is tuning result in hs200@200Mhz with the starting : * Rx:0/Tx:270/Core:180 (current setting) * Rx:0/Tx:0/Core:180 (the setting you keep on mentioning) Please don't send your logs: the phase set are displayed in debugfs/clk?clk_summary 2) Try to come up with a real explanation: "It just happens ..." is not an explanation. > > > > > >
Am 12.10.2017 um 22:59 schrieb Jerome Brunet: > On Thu, 2017-10-12 at 22:29 +0200, Heiner Kallweit wrote: >> Am 12.10.2017 um 22:05 schrieb Jerome Brunet: >>> On Thu, 2017-10-12 at 21:49 +0200, Heiner Kallweit wrote: >>>>> The result of the tuning does not depends on starting point, so I don't >>>>> really >>>>> understand how it would significantly change things. >>>>> >>>> >>>> I think it depends on the tx phase starting point. >>>> >>> >>> If the result of the tuning is not independent of the starting, instead of >>> just >>> telling me, it is fairly easy for you to give actual result, like I asked >>> you: >>> >> >> With hs200@200 and tx phase starting point 0 I get the following with no >> further CRC errors. >> >> [ 0.726572] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay >> tunning... >> [ 0.728664] hctosys: unable to open rtc device (rtc0) >> [ 0.728827] USB_OTG_PWR: disabling >> [ 0.728831] TFLASH_VDD: disabling >> [ 0.728833] TF_IO: disabling >> [ 0.753183] Waiting for root device /dev/mmcblk0p1... >> [ 0.754352] meson-gx-mmc d0074000.mmc: success with phase: 270 >> [ 0.758386] meson-gx-mmc d0074000.mmc: d0074000.mmc#tx phase/delay >> tunning... >> [ 0.768050] meson-gx-mmc d0074000.mmc: success with phase: 120 >> [ 0.771235] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay >> tunning... >> [ 0.780902] meson-gx-mmc d0074000.mmc: success with phase: 0 >> [ 0.784036] mmc0: new HS200 MMC card at address 0001 >> [ 0.789090] mmcblk0: mmc0:0001 DJNB4R 116 GiB >> [ 0.793328] mmcblk0boot0: mmc0:0001 DJNB4R partition 1 4.00 MiB >> [ 0.799198] mmcblk0boot1: mmc0:0001 DJNB4R partition 2 4.00 MiB >> [ 0.805048] mmcblk0rpmb: mmc0:0001 DJNB4R partition 3 4.00 MiB, chardev >> (249:0) >> [ 0.812799] mmcblk0: response CRC error sending r/w cmd command, card >> status 0x900 >> [ 0.819727] meson-gx-mmc d0074000.mmc: d0074000.mmc#tx phase/delay >> tunning... >> [ 0.827007] meson-gx-mmc d0074000.mmc: success with phase: 210 >> [ 0.832559] meson-gx-mmc d0074000.mmc: d0074000.mmc#rx phase/delay >> tunning... >> [ 0.839848] meson-gx-mmc d0074000.mmc: success with phase: 0 >> >> >>>>> Ok, starting from Rx:0 Tx:270 then tuning gives you Tx:300 Rx:90 >>>>> And what different did it gives you starting Rx:0/Tx:0 ? >>> >>> And if the result are indeed vastly different, let's debug and get a real >>> explanation. >>> >>>> Tuning does: >>>> rx phase tuning >>>> tx phase tuning >>>> rx phase tuning >>>> >>>> Result of each step depends on result of previous step. >>> >>> Again, we don't agree. >>> >>>> So also the initial >>>> rx phase tuning result depends on the starting point of tx phase. >>> >>> The first Rx tuning is there only to get a sane starting point for the tx >>> tuning >>> ,as explained in the code. This is the reason why is not done when re- >>> tuning. >>> > > Heiner, This is the same story again and again ! > Getting clear status is just really painful. It's like half the message is > ignored each time. > > I can clearly see that, this result comes from your hack, and this hack makes no > sense. I don't care about this result. This is not what I asked. > No, the log provided is w/o my hack. The debugfs values just represent the current values, interesting is how they change from tuning step to tuning step. > What I asked is: > 1) From linux-next what is tuning result in hs200@200Mhz with the starting : > * Rx:0/Tx:270/Core:180 (current setting) > * Rx:0/Tx:0/Core:180 (the setting you keep on mentioning) > > Please don't send your logs: the phase set are displayed in > debugfs/clk?clk_summary > > 2) Try to come up with a real explanation: "It just happens ..." is not an > explanation. > >>> >>> >> >> > >
diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c index 85745ef1..95cb439d 100644 --- a/drivers/mmc/host/meson-gx-mmc.c +++ b/drivers/mmc/host/meson-gx-mmc.c @@ -120,6 +120,7 @@ #define SD_EMMC_DESC_CHAIN_MODE BIT(1) #define MUX_CLK_NUM_PARENTS 2 +#define MAX_TUNING_ATTEMPTS 10 struct sd_emmc_desc { u32 cmd_cfg; @@ -687,15 +688,23 @@ static int meson_mmc_find_tuning_point(unsigned long *test) static int meson_mmc_clk_phase_tuning(struct mmc_host *mmc, u32 opcode, struct clk *clk) { - int point, ret; + int point, ret, old_phase; DECLARE_BITMAP(test, CLK_PHASE_POINT_NUM); + old_phase = clk_get_phase(clk); + if (old_phase < 0) + return old_phase; + dev_dbg(mmc_dev(mmc), "%s phase/delay tunning...\n", __clk_get_name(clk)); bitmap_zero(test, CLK_PHASE_POINT_NUM); /* Explore tuning points */ for (point = 0; point < CLK_PHASE_POINT_NUM; point++) { + /* when retuning avoid the surrounding of where we failed */ + if (mmc->doing_retune) + if (abs(point * CLK_PHASE_STEP - old_phase) <= 45) + continue; clk_set_phase(clk, point * CLK_PHASE_STEP); ret = mmc_send_tuning(mmc, opcode, NULL); if (!ret) @@ -704,8 +713,11 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host *mmc, u32 opcode, /* Find the optimal tuning point and apply it */ point = meson_mmc_find_tuning_point(test); - if (point < 0) + if (point < 0) { + /* prevent from getting stuck if we failed */ + clk_set_phase(clk, (old_phase + 90) % 360); return point; /* tuning failed */ + } clk_set_phase(clk, point * CLK_PHASE_STEP); dev_dbg(mmc_dev(mmc), "success with phase: %d\n", @@ -716,7 +728,7 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host *mmc, u32 opcode, static int meson_mmc_execute_tuning(struct mmc_host *mmc, u32 opcode) { struct meson_host *host = mmc_priv(mmc); - int ret; + int i, ret; /* * If this is the initial tuning, try to get a sane Rx starting @@ -729,11 +741,14 @@ static int meson_mmc_execute_tuning(struct mmc_host *mmc, u32 opcode) return ret; } - ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk); - if (ret) - return ret; + for (i = 0; i < MAX_TUNING_ATTEMPTS; i++) { + meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk); + ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk); + if (!ret) + return 0; + } - return meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk); + return ret; } static void meson_mmc_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)