[[v4] 2/5] MMC: Use CMD23 for multiblock transfers when we can.

Message ID 1303870235-29041-3-git-send-email-andreiw@motorola.com (mailing list archive)
State New, archived
Commit Message

Andrei Warkentin April 27, 2011, 2:10 a.m. UTC
CMD23-prefixed multiblock transfers have a performance advantage
over open-ended ones on some MMC cards.

Cc: arindam.nath@amd.com
Cc: cjb@laptop.org
Cc: arnd@arndb.de
Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
---
 drivers/mmc/card/block.c |  109 +++++++++++++++++++++++++++++++++-------------
 include/linux/mmc/card.h |    1 +
 include/linux/mmc/core.h |    1 +
 include/linux/mmc/host.h |    6 +++
 include/linux/mmc/mmc.h  |    6 +++
 5 files changed, 93 insertions(+), 30 deletions(-)
 mode change 100644 => 100755 drivers/mmc/card/block.c
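
The decision this patch adds to the block layer can be illustrated outside the kernel. What follows is a minimal standalone sketch, not the patch itself: `struct xfer_ctx`, `use_cmd23()` and `cmd23_arg()` are hypothetical names introduced here for illustration. Only the logic follows the diff: CMD23 is used when the host advertises support, the card is MMC, the opcode is a multiblock one, and either the request is a reliable write or the card is not flagged with the NO_CMD23 quirk. Bit 31 of the argument marks a reliable write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of the CMD23 argument requests a reliable write. */
#define CMD23_RELIABLE_WRITE (UINT32_C(1) << 31)

/* Hypothetical stand-in for the host/card state the patch consults. */
struct xfer_ctx {
	bool host_cmd23;     /* host advertises MMC_CAP_CMD23 */
	bool card_is_mmc;    /* card is MMC/eMMC rather than SD */
	bool quirk_no_cmd23; /* card carries MMC_QUIRK_BLK_NO_CMD23 */
};

/* Decide whether a transfer should be prefixed with SET_BLOCK_COUNT. */
static bool use_cmd23(const struct xfer_ctx *ctx, bool multi_opcode, bool rel_wr)
{
	if (!ctx->host_cmd23 || !ctx->card_is_mmc || !multi_opcode)
		return false;
	/* Reliable writes need CMD23 even on cards that are slower with it. */
	return rel_wr || !ctx->quirk_no_cmd23;
}

/* Build the CMD23 argument: block count plus the reliable-write bit. */
static uint32_t cmd23_arg(uint32_t blocks, bool rel_wr)
{
	assert(blocks > 0 && blocks < CMD23_RELIABLE_WRITE);
	return blocks | (rel_wr ? CMD23_RELIABLE_WRITE : 0);
}
```

With these helpers, an 8-block reliable write would carry the argument 0x80000008, while the same transfer on a quirked card without the reliable-write flag would fall back to an open-ended transfer.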

Comments

Jaehoon Chung May 19, 2011, 2:37 a.m. UTC | #1
Hi Andrei

Andrei Warkentin wrote:
> CMD23-prefixed instead of open-ended multiblock transfers
> have a performance advantage on some MMC cards.
> 
You mentioned "some MMC cards".
Conversely, does that mean some cards don't get a performance advantage?

Did you measure the performance advantage?
If you did and can tell me,
I'd like to know which MMC cards you have.

Regards,
Jaehoon Chung
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrei Warkentin May 19, 2011, 5:01 p.m. UTC | #2
On Wed, May 18, 2011 at 9:37 PM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
> Hi Andrei
>
> Andrei Warkentin wrote:
>> CMD23-prefixed instead of open-ended multiblock transfers
>> have a performance advantage on some MMC cards.
>>
> you mentioned about "some MMC cards".
> Conversely, that means the some card didn't have a performance advantage?
>
> Did you find the performance advantage?
> if you found the advantage and you can tell me,
> i want to know what do you have the some MMC cards..
>

I've tested this on a Sandisk eMMC where I saw up to
a 50% improvement on writes (30% in real-life use cases). This was a
SEM32G 4.3+ part.

A
--
Jaehoon Chung May 20, 2011, 4:38 a.m. UTC | #3
Hi Andrei,

Andrei Warkentin wrote:
> On Wed, May 18, 2011 at 9:37 PM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>> Hi Andrei
>>
>> Andrei Warkentin wrote:
>>> CMD23-prefixed instead of open-ended multiblock transfers
>>> have a performance advantage on some MMC cards.
>>>
>> you mentioned about "some MMC cards".
>> Conversely, that means the some card didn't have a performance advantage?
>>
>> Did you find the performance advantage?
>> if you found the advantage and you can tell me,
>> i want to know what do you have the some MMC cards..
>>
> 
> I've tested this on a Sandisk eMMC where I saw as good as
> a 50% improvement on writes (30% real-life use cases). This was a
> SEM32G 4.3+ part.

Can you tell me about your environment (bus width, AP, benchmark, etc.)?
And if you have performance data, could you share it?

Regards,
Jaehoon Chung
--
Andrei Warkentin May 20, 2011, 6:54 a.m. UTC | #4
On Fri, May 20, 2011 at 1:49 AM, Andrei Warkentin <andreiw@motorola.com> wrote:
> On Thu, May 19, 2011 at 11:38 PM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>> Hi Andrei,
>>
>> Andrei Warkentin wrote:
>>> On Wed, May 18, 2011 at 9:37 PM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>>>> Hi Andrei
>>>>
>>>> Andrei Warkentin wrote:
>>>>> CMD23-prefixed instead of open-ended multiblock transfers
>>>>> have a performance advantage on some MMC cards.
>>>>>
>>>> you mentioned about "some MMC cards".
>>>> Conversely, that means the some card didn't have a performance advantage?
>>>>
>>>> Did you find the performance advantage?
>>>> if you found the advantage and you can tell me,
>>>> i want to know what do you have the some MMC cards..
>>>>
>>>
>>> I've tested this on a Sandisk eMMC where I saw as good as
>>> a 50% improvement on writes (30% real-life use cases). This was a
>>> SEM32G 4.3+ part.
>>
>> Can you tell me your environment? buswidth, AP information, benchmark etc..
>> And if you have the performance result's data, can you share them?
>>
>
> This was on an SDHCI controller (hence the patch...) on a Tegra
> 2-based system. I was measuring
> throughput on reads and writes (obviously without block cache,
> filesystem, etc) to an eMMC card, 8 bits.
>
> Tested both with my tool (https://github.com/andreiw/superalign) and
> an sqlite-based test.
>
> I'm attaching the data I have.
>
> A
>

Additionally, CMD23 use is a requirement for SDXC cards (Arindam can
comment on that), as well as for MMC reliable writes and eMMC 4.5-spec
features (Yunpeng Gao can comment on that).

These patches allow CMD23 use. They do involve some changes to the host
controller driver because of the interaction with CMD12, as well as the
Auto-CMD12 and Auto-CMD23 features. I can definitely advise you if you need
help implementing CMD23 support for whatever controller you develop for.
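
To make the CMD12 interaction concrete, here is a minimal sketch of the command ordering a host driver might follow for one multiblock write. It is a standalone model, not SDHCI code: `issue_sequence()` and the enum names are hypothetical, `have_sbc` stands for the request carrying a SET_BLOCK_COUNT command, and only a data-phase error is modeled. The ordering reflects the rule stated in the patch comment, that CMD12 still has to be sent after an error even when CMD23 was used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
	CMD12_STOP = 12,
	CMD23_SET_BLOCK_COUNT = 23,
	CMD25_WRITE_MULTI = 25
};

/*
 * Record which commands a hypothetical host would issue for one multiblock
 * write. Returns the number of opcodes stored in seq (at most 3).
 */
static size_t issue_sequence(bool have_sbc, bool data_error, int seq[3])
{
	size_t n = 0;

	assert(seq != NULL);
	if (have_sbc)
		seq[n++] = CMD23_SET_BLOCK_COUNT; /* pre-define the block count */
	seq[n++] = CMD25_WRITE_MULTI;

	/*
	 * An open-ended transfer always ends with CMD12. A CMD23-bounded
	 * transfer ends on its own, but CMD12 is still needed after an error.
	 */
	if (!have_sbc || data_error)
		seq[n++] = CMD12_STOP;
	return n;
}
```

A controller with Auto-CMD23 would issue the first command in hardware, which is one reason the sequencing is left to the host rather than the block layer.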

A
--
Jaehoon Chung May 20, 2011, 9:05 a.m. UTC | #5
Andrei Warkentin wrote:
> On Fri, May 20, 2011 at 1:49 AM, Andrei Warkentin <andreiw@motorola.com> wrote:
>> On Thu, May 19, 2011 at 11:38 PM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>>> Hi Andrei,
>>>
>>> Andrei Warkentin wrote:
>>>> On Wed, May 18, 2011 at 9:37 PM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>>>>> Hi Andrei
>>>>>
>>>>> Andrei Warkentin wrote:
>>>>>> CMD23-prefixed instead of open-ended multiblock transfers
>>>>>> have a performance advantage on some MMC cards.
>>>>>>
>>>>> you mentioned about "some MMC cards".
>>>>> Conversely, that means the some card didn't have a performance advantage?
>>>>>
>>>>> Did you find the performance advantage?
>>>>> if you found the advantage and you can tell me,
>>>>> i want to know what do you have the some MMC cards..
>>>>>
>>>> I've tested this on a Sandisk eMMC where I saw as good as
>>>> a 50% improvement on writes (30% real-life use cases). This was a
>>>> SEM32G 4.3+ part.
>>> Can you tell me your environment? buswidth, AP information, benchmark etc..
>>> And if you have the performance result's data, can you share them?
>>>
>> This was on an SDHCI controller (hence the patch...) on a Tegra
>> 2-based system. I was measuring
>> throughput on reads and writes (obviously without block cache,
>> filesystem, etc) to an eMMC card, 8 bits.
>>
>> Tested both with my tool (https://github.com/andreiw/superalign) and
>> an sqlite-based test.
>>
>> I'm attaching the data I have.
>>
>> A
>>
> 
> Additionally, CMD23 use is a requirement for SDXC cards (Arindam can
> comment on that), as well as for MMC reliable writes and eMMC 4.5-spec
> features (Yunpeng Gao can comment on that).
> 
> These patches allow CMD23 use. They do involve some changes to host
> controller because of interaction with CMD12, as well as Auto-CMD12
> and Auto-CMD23 features. I can definitely consult you if you need help
> implementing CMD23 support for whatever controller you develop for.
> 

I use two host controllers (sdhci and dw_mmc); you implemented the changes in sdhci.
I applied your patch and then tested CMD23, but I didn't apply Auto-CMD23.
As far as I know, Auto-CMD23 is supported from SD 3.0 onward, right?
My controller supports SD 2.0.

If you can advise me, that would be very helpful.

Regards,
Jaehoon Chung


--
Jaehoon Chung May 23, 2011, 12:40 p.m. UTC | #6
Hi A..

I tested your patch (using CMD23).
My environment is below.
eMMC card : Sandisk SEM8G (eMMC 4.3+)
buswidth : 4bit (SDR)
AP : C110
benchmark : IOzone

What do you think about this result?
(I can't see your results.)

* open-ended
              KB  reclen   write rewrite    read    reread
           10240       4    9128    9859    18072    18131
           10240       8    9510   10025    18107    18031
           10240      16    9445   10104    18143    18084
           10240      32    9583   10076    17912    18097
           10240      64    2957    3025    12320    12317
           10240     128    9500    9850    18082    18085
           10240     256    9506    8712    17952    17905
           10240     512    9445    9905    17981    17851
           10240    1024    8894    9724    18079    18196
           10240    2048    9401    9810    18181    18040
           10240    4096    8657    9358    18172    17980
           10240    8192    8779    7730    18067    17943

* pre-defined (CMD23)

              KB  reclen   write rewrite    read    reread
           10240       4    8799    8061    18069    18212
           10240       8    8850    8465    17938    18084
           10240      16    8077    9171    18131    18249
           10240      32    8810    8209    18019    17999
           10240      64    8744    8214    18096    18204
           10240     128    7940   10041    18036    18043
           10240     256    8466    7557    18101    18186
           10240     512    7372    8486    18272    18010
           10240    1024    5565    9922    18155    18196
           10240    2048    9383    9686    18116    18049
           10240    4096    9282    8792    18073    18154
           10240    8192    3770   10136    18083    18155

Regards,
Jaehoon Chung


--
Andrei Warkentin May 23, 2011, 7:25 p.m. UTC | #7
On Mon, May 23, 2011 at 7:40 AM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
> Hi A..
>
> I tested your patch..(using CMD23)
> my environment is the below.
> eMMC card : Sandisk SEM8G (eMMC 4.3+)
> buswidth : 4bit (SDR)
> AP : C110
> benchmark : IOzone
>
> I want to know how do you think about this result?
> (i can't see your results)
>

I think that you should use my tool to measure I/O performance,
because I want to see the mins, maxes, average and
std dev. Iozone adds way too much noise to the data. You can run it 5
times in a row and get completely different numbers. Please use
https://github.com/andreiw/superalign.

For a more realistic test you can try performing 20000 sqlite inserts
or something of the sort and timing that.
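
As an illustration of the statistics asked for here, the following is a small sketch that summarizes repeated throughput measurements. It is not taken from superalign: `struct run_stats`, `summarize()` and `newton_sqrt()` are hypothetical names, the standard deviation is the population form (divide by n - 1 instead for a sample estimate), and the square root is computed by Newton's method only so the sketch builds without linking libm.

```c
#include <assert.h>
#include <stddef.h>

struct run_stats {
	double min, max, mean, stddev;
};

/* Square root by Newton's method, to keep the sketch free of libm. */
static double newton_sqrt(double x)
{
	double r = x > 1.0 ? x : 1.0; /* start at or above the true root */
	int i;

	if (x <= 0.0)
		return 0.0;
	for (i = 0; i < 64; i++)
		r = 0.5 * (r + x / r);
	return r;
}

/* Summarize n throughput samples (for example, KB/s per run). */
static struct run_stats summarize(const double *samples, size_t n)
{
	struct run_stats s;
	double sum = 0.0, sq = 0.0;
	size_t i;

	assert(samples != NULL && n > 0);
	s.min = s.max = samples[0];
	for (i = 0; i < n; i++) {
		if (samples[i] < s.min)
			s.min = samples[i];
		if (samples[i] > s.max)
			s.max = samples[i];
		sum += samples[i];
	}
	s.mean = sum / (double)n;

	/* Population variance: mean of squared deviations from the mean. */
	for (i = 0; i < n; i++)
		sq += (samples[i] - s.mean) * (samples[i] - s.mean);
	s.stddev = newton_sqrt(sq / (double)n);
	return s;
}
```

A large standard deviation relative to the mean across identical runs suggests the benchmark is too noisy to separate the open-ended and CMD23 cases.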

A
--
Andrei Warkentin May 23, 2011, 7:33 p.m. UTC | #8
On Mon, May 23, 2011 at 2:25 PM, Andrei Warkentin <andreiw@motorola.com> wrote:
> On Mon, May 23, 2011 at 7:40 AM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>> Hi A..
>>
>> I tested your patch..(using CMD23)
>> my environment is the below.
>> eMMC card : Sandisk SEM8G (eMMC 4.3+)
>> buswidth : 4bit (SDR)
>> AP : C110
>> benchmark : IOzone
>>
>> I want to know how do you think about this result?
>> (i can't see your results)
>>
>
> I think that you should use my tool to measure I/O performance.
> Because I want to see the mins, maxes, average and
> std dev. Iozone adds way too much noise to the data. You can run it 5
> times in a row and get completely different numbers. Please use
> https://github.com/andreiw/superalign.
>
> For a more realistic test you can try performing 20000 sqlite inserts
> or something of the sort and timing that.
>
> A
>

Additionally, be careful how you do your testing. I don't want to
sound obvious, but, to ensure you can actually compare the collected
data against each other -
1) Disable all power/frequency scaling/management/gating, suspend/resume, etc.
2) Make sure nothing else uses the eMMC. No root mounted fs, nothing.
3) Make sure you are avoiding block cache and file system. You want
direct block I/O.
4) For extra extra extra reliable results - Make sure you are not
rebooting across testing. You will need to add a flag so you can
disable CMD23 on the fly via debugfs.

For sqlite testing some other helpful hints -
1) Unmount partition containing files on which the SQLite test operates.
2) Perform BLKDISCARD over the partition.
3) Format with desired file system.
4) Mount.
5) sync && echo 3 > /proc/sys/vm/drop_caches
6) Perform test.
7) Umount.
8) Repeat from (1)

A
--
Andrei Warkentin May 23, 2011, 7:34 p.m. UTC | #9
On Mon, May 23, 2011 at 7:40 AM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
> Hi A..
>
> I tested your patch..(using CMD23)
> my environment is the below.
> eMMC card : Sandisk SEM8G (eMMC 4.3+)
> buswidth : 4bit (SDR)
> AP : C110
> benchmark : IOzone
>

Knowing what controller you are on is helpful too. Unless you are on
SDHCI, you will see no effect.

A
--
Andrei Warkentin May 23, 2011, 8:45 p.m. UTC | #10
On Mon, May 23, 2011 at 2:34 PM, Andrei Warkentin <andreiw@motorola.com> wrote:
> On Mon, May 23, 2011 at 7:40 AM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>> Hi A..
>>
>> I tested your patch..(using CMD23)
>> my environment is the below.
>> eMMC card : Sandisk SEM8G (eMMC 4.3+)
>> buswidth : 4bit (SDR)
>> AP : C110
>> benchmark : IOzone
>>
>
> Knowing what controller you are on is helpful too. Unless you are on
> SDHCI, you will see no effect.
>
> A
>

If after all that you still see no improvement, you should ask your
Sandisk representative whether your particular batch is actually new enough
to support CMD23 improvements. Sandisk ships different versions of
hardware at the same time, so it's pretty much impossible to tell them apart
other than by some vendor-specific EXT_CSD fields...

A
--
Jaehoon Chung May 24, 2011, 12:07 a.m. UTC | #11
Andrei Warkentin wrote:
> On Mon, May 23, 2011 at 7:40 AM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
>> Hi A..
>>
>> I tested your patch..(using CMD23)
>> my environment is the below.
>> eMMC card : Sandisk SEM8G (eMMC 4.3+)
>> buswidth : 4bit (SDR)
>> AP : C110
>> benchmark : IOzone
>>
> 
> Knowing what controller you are on is helpful too. Unless you are on
> SDHCI, you will see no effect.
> 
I'm using an SDHCI controller and tested with it.
I'll test again following your comments and share the results.

Regards,
Jaehoon Chung
--
Patch

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
old mode 100644
new mode 100755
index 92e4a00..b91bec2
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -55,10 +55,6 @@  MODULE_ALIAS("mmc:block");
 #define INAND_CMD38_ARG_SECTRIM1 0x81
 #define INAND_CMD38_ARG_SECTRIM2 0x88
 
-#define REL_WRITES_SUPPORTED(card) (mmc_card_mmc((card)) &&	\
-    (((card)->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN) ||	\
-     ((card)->ext_csd.rel_sectors)))
-
 static DEFINE_MUTEX(block_mutex);
 
 /*
@@ -86,6 +82,10 @@  struct mmc_blk_data {
 	struct mmc_queue queue;
 	struct list_head part;
 
+	unsigned int	flags;
+#define MMC_BLK_CMD23	(1 << 0)	/* Can do SET_BLOCK_COUNT for multiblock */
+#define MMC_BLK_REL_WR	(1 << 1)	/* MMC Reliable write support */
+
 	unsigned int	usage;
 	unsigned int	read_only;
 	unsigned int	part_type;
@@ -227,6 +227,7 @@  static const struct block_device_operations mmc_bdops = {
 
 struct mmc_blk_request {
 	struct mmc_request	mrq;
+	struct mmc_command	sbc;
 	struct mmc_command	cmd;
 	struct mmc_command	stop;
 	struct mmc_data		data;
@@ -457,13 +458,10 @@  static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
  * reliable write can handle, thus finish the request in
  * partial completions.
  */
-static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq,
-				   struct mmc_card *card,
-				   struct request *req)
+static inline void mmc_apply_rel_rw(struct mmc_blk_request *brq,
+				    struct mmc_card *card,
+				    struct request *req)
 {
-	int err;
-	struct mmc_command set_count;
-
 	if (!(card->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN)) {
 		/* Legacy mode imposes restrictions on transfers. */
 		if (!IS_ALIGNED(brq->cmd.arg, card->ext_csd.rel_sectors))
@@ -474,16 +472,6 @@  static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq,
 		else if (brq->data.blocks < card->ext_csd.rel_sectors)
 			brq->data.blocks = 1;
 	}
-
-	memset(&set_count, 0, sizeof(struct mmc_command));
-	set_count.opcode = MMC_SET_BLOCK_COUNT;
-	set_count.arg = brq->data.blocks | (1 << 31);
-	set_count.flags = MMC_RSP_R1 | MMC_CMD_AC;
-	err = mmc_wait_for_cmd(card->host, &set_count, 0);
-	if (err)
-		printk(KERN_ERR "%s: error %d SET_BLOCK_COUNT\n",
-		       req->rq_disk->disk_name, err);
-	return err;
 }
 
 static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
@@ -500,7 +488,7 @@  static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 	bool do_rel_wr = ((req->cmd_flags & REQ_FUA) ||
 			  (req->cmd_flags & REQ_META)) &&
 		(rq_data_dir(req) == WRITE) &&
-		REL_WRITES_SUPPORTED(card);
+		(md->flags & MMC_BLK_REL_WR);
 
 	do {
 		struct mmc_command cmd;
@@ -539,11 +527,9 @@  static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 
 		if (brq.data.blocks > 1 || do_rel_wr) {
 			/* SPI multiblock writes terminate using a special
-			 * token, not a STOP_TRANSMISSION request. Reliable
-			 * writes use SET_BLOCK_COUNT and do not use a
-			 * STOP_TRANSMISSION request either.
+			 * token, not a STOP_TRANSMISSION request.
 			 */
-			if ((!mmc_host_is_spi(card->host) && !do_rel_wr) ||
+			if (!mmc_host_is_spi(card->host) ||
 			    rq_data_dir(req) == READ)
 				brq.mrq.stop = &brq.stop;
 			readcmd = MMC_READ_MULTIPLE_BLOCK;
@@ -561,8 +547,37 @@  static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			brq.data.flags |= MMC_DATA_WRITE;
 		}
 
-		if (do_rel_wr && mmc_apply_rel_rw(&brq, card, req))
-			goto cmd_err;
+		if (do_rel_wr)
+			mmc_apply_rel_rw(&brq, card, req);
+
+		/*
+		 * Pre-defined multi-block transfers are preferable to
+		 * open-ended ones (and necessary for reliable writes).
+		 * However, it is not sufficient to just send CMD23,
+		 * and avoid the final CMD12, as on an error condition
+		 * CMD12 (stop) needs to be sent anyway. This, coupled
+		 * with Auto-CMD23 enhancements provided by some
+		 * hosts, means that the complexity of dealing
+		 * with this is best left to the host. If CMD23 is
+		 * supported by card and host, we'll fill sbc in and let
+		 * the host deal with handling it correctly. This means
+		 * that for hosts that don't expose MMC_CAP_CMD23, no
+		 * change of behavior will be observed.
+		 *
+		 * N.B: Some MMC cards experience perf degradation.
+		 * We'll avoid using CMD23-bounded multiblock writes for
+		 * these, while retaining features like reliable writes.
+		 */
+
+		if ((md->flags & MMC_BLK_CMD23) &&
+		    mmc_op_multi(brq.cmd.opcode) &&
+		    (do_rel_wr || !(card->quirks & MMC_QUIRK_BLK_NO_CMD23))) {
+			brq.sbc.opcode = MMC_SET_BLOCK_COUNT;
+			brq.sbc.arg = brq.data.blocks |
+				(do_rel_wr ? (1 << 31) : 0);
+			brq.sbc.flags = MMC_RSP_R1 | MMC_CMD_AC;
+			brq.mrq.sbc = &brq.sbc;
+		}
 
 		mmc_set_data_timeout(&brq.data, card);
 
@@ -599,7 +614,8 @@  static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		 * until later as we need to wait for the card to leave
 		 * programming mode even when things go wrong.
 		 */
-		if (brq.cmd.error || brq.data.error || brq.stop.error) {
+		if (brq.sbc.error || brq.cmd.error ||
+		    brq.data.error || brq.stop.error) {
 			if (brq.data.blocks > 1 && rq_data_dir(req) == READ) {
 				/* Redo read one sector at a time */
 				printk(KERN_WARNING "%s: retrying using single "
@@ -610,6 +626,13 @@  static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			status = get_card_status(card, req);
 		}
 
+		if (brq.sbc.error) {
+			printk(KERN_ERR "%s: error %d sending SET_BLOCK_COUNT "
+			       "command, response %#x, card status %#x\n",
+			       req->rq_disk->disk_name, brq.sbc.error,
+			       brq.sbc.resp[0], status);
+		}
+
 		if (brq.cmd.error) {
 			printk(KERN_ERR "%s: error %d sending read/write "
 			       "command, response %#x, card status %#x\n",
@@ -821,8 +844,6 @@  static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 	md->disk->queue = md->queue.queue;
 	md->disk->driverfs_dev = parent;
 	set_disk_ro(md->disk, md->read_only || default_ro);
-	if (REL_WRITES_SUPPORTED(card))
-		blk_queue_flush(md->queue.queue, REQ_FLUSH | REQ_FUA);
 
 	/*
 	 * As discussed on lkml, GENHD_FL_REMOVABLE should:
@@ -841,6 +862,19 @@  static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 
 	blk_queue_logical_block_size(md->queue.queue, 512);
 	set_capacity(md->disk, size);
+
+	if (mmc_host_cmd23(card->host) &&
+	    mmc_card_mmc(card))
+		md->flags |= MMC_BLK_CMD23;
+
+	if (mmc_card_mmc(card) &&
+	    md->flags & MMC_BLK_CMD23 &&
+	    ((card->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN) ||
+	     card->ext_csd.rel_sectors)) {
+		md->flags |= MMC_BLK_REL_WR;
+		blk_queue_flush(md->queue.queue, REQ_FLUSH | REQ_FUA);
+	}
+
 	return md;
 
  err_putdisk:
@@ -995,6 +1029,21 @@  static const struct mmc_fixup blk_fixups[] =
 	MMC_FIXUP("SEM08G", 0x2, 0x100, add_quirk, MMC_QUIRK_INAND_CMD38),
 	MMC_FIXUP("SEM16G", 0x2, 0x100, add_quirk, MMC_QUIRK_INAND_CMD38),
 	MMC_FIXUP("SEM32G", 0x2, 0x100, add_quirk, MMC_QUIRK_INAND_CMD38),
+
+	/*
+	 * Some MMC cards experience performance degradation with CMD23
+	 * instead of CMD12-bounded multiblock transfers. For now we'll
+	 * black list what's bad...
+	 * - Certain Toshiba cards.
+	 *
+	 * N.B. This doesn't affect SD cards.
+	 */
+	MMC_FIXUP("MMC08G", 0x11, CID_OEMID_ANY, add_quirk_mmc,
+		  MMC_QUIRK_BLK_NO_CMD23),
+	MMC_FIXUP("MMC16G", 0x11, CID_OEMID_ANY, add_quirk_mmc,
+		  MMC_QUIRK_BLK_NO_CMD23),
+	MMC_FIXUP("MMC32G", 0x11, CID_OEMID_ANY, add_quirk_mmc,
+		  MMC_QUIRK_BLK_NO_CMD23),
 	END_FIXUP
 };
 
diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
index 6a4ed2a..c758181 100644
--- a/include/linux/mmc/card.h
+++ b/include/linux/mmc/card.h
@@ -134,6 +134,7 @@  struct mmc_card {
 #define MMC_QUIRK_NONSTD_FUNC_IF (1<<4)		/* SDIO card has nonstd function interfaces */
 #define MMC_QUIRK_DISABLE_CD	(1<<5)		/* disconnect CD/DAT[3] resistor */
 #define MMC_QUIRK_INAND_CMD38	(1<<6)		/* iNAND devices have broken CMD38 */
+#define MMC_QUIRK_BLK_NO_CMD23	(1<<7)		/* Avoid CMD23 for regular multiblock */
 
 	unsigned int		erase_size;	/* erase size in sectors */
  	unsigned int		erase_shift;	/* if erase unit is power 2 */
diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
index f8e4bcb..55d7fde 100644
--- a/include/linux/mmc/core.h
+++ b/include/linux/mmc/core.h
@@ -120,6 +120,7 @@  struct mmc_data {
 };
 
 struct mmc_request {
+	struct mmc_command	*sbc;		/* SET_BLOCK_COUNT for multiblock */
 	struct mmc_command	*cmd;
 	struct mmc_data		*data;
 	struct mmc_command	*stop;
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 4f705eb..ba34fc5 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -173,6 +173,7 @@  struct mmc_host {
 						/* DDR mode at 1.2V */
 #define MMC_CAP_POWER_OFF_CARD	(1 << 13)	/* Can power off after boot */
 #define MMC_CAP_BUS_WIDTH_TEST	(1 << 14)	/* CMD14/CMD19 bus width ok */
+#define MMC_CAP_CMD23		(1 << 15)	/* CMD23 supported */
 
 	mmc_pm_flag_t		pm_caps;	/* supported pm features */
 
@@ -330,5 +331,10 @@  static inline int mmc_card_wake_sdio_irq(struct mmc_host *host)
 {
 	return host->pm_flags & MMC_PM_WAKE_SDIO_IRQ;
 }
+
+static inline int mmc_host_cmd23(struct mmc_host *host)
+{
+	return host->caps & MMC_CAP_CMD23;
+}
 #endif
 
diff --git a/include/linux/mmc/mmc.h b/include/linux/mmc/mmc.h
index 373b2bf..ea74168 100644
--- a/include/linux/mmc/mmc.h
+++ b/include/linux/mmc/mmc.h
@@ -82,6 +82,12 @@ 
 #define MMC_APP_CMD              55   /* ac   [31:16] RCA        R1  */
 #define MMC_GEN_CMD              56   /* adtc [0] RD/WR          R1  */
 
+static inline bool mmc_op_multi(u32 opcode)
+{
+	return opcode == MMC_WRITE_MULTIPLE_BLOCK ||
+		opcode == MMC_READ_MULTIPLE_BLOCK;
+}
+
 /*
  * MMC_SWITCH argument format:
  *