[v3] drm/dp: Use large transactions for I2C over AUX

Message ID	1422373429-10137-1-git-send-email-simon.farnsworth@onelan.co.uk (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> From: Simon Farnsworth <simon.farnsworth@onelan.co.uk> To: dri-devel@lists.freedesktop.org Subject: [PATCH v3] drm/dp: Use large transactions for I2C over AUX Date: Tue, 27 Jan 2015 15:43:49 +0000 Message-Id: <1422373429-10137-1-git-send-email-simon.farnsworth@onelan.co.uk> Cc: Thierry Reding <treding@nvidia.com> Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Simon Farnsworth Jan. 27, 2015, 3:43 p.m. UTC

DisplayPort to DVI-D Dual Link adapters designed by Bizlink have bugs in
their I2C over AUX implementation. They work fine with Windows, but fail
with Linux.

It turns out that they cannot keep an I2C transaction open unless the
previous read was 16 bytes; shorter reads can only be followed by a zero
byte transfer ending the I2C transaction.

Copy Windows's behaviour, and read 16 bytes at a time. If we get a short
reply, assume that there's a hardware bottleneck, and shrink our read size
to match. For this purpose, use the algorithm in the DisplayPort 1.2 spec,
in the hopes that it'll be closest to what Windows does, as no sink I've
found actually gives short replies.

Also provide a module parameter for testing smaller transfer sizes, in case
there are sinks out there that cannot work with Windows.

Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
---

v3 changes, after feedback from Ville and more testing of Windows:

 * Change the short reply algorithm to match Ville's description of the
   DisplayPort 1.2 spec wording.

 * Add a module parameter to set the default transfer size for
   experiments. Requested over IRC by Ville.

No-one's been able to find a device that does short replies, but experiments
show that bigger reads are faster on most devices. Ville got:

 DP->DVI (OUI 001cf8):  40ms -> 35ms
 DP->VGA (OUI 0022b9):  45ms -> 38ms
 Zotac DP->2xHDMI:      25ms ->  4ms

v2 changes, after feedback from Thierry and Ville:

 * Handle short replies. I've decided (arbitrarily) that a short reply
   results in us dropping back to the newly chosen size for the rest of this
   I2C transaction. Thus, given an attempt to read the first 16 bytes of
   EDID, and a sink that only does 4 bytes of buffering, we will see the
   following AUX transfers for the EDID read (after address is set):

   <set address, block etc>
   Read 16 bytes from I2C over AUX.
   Reply with 4 bytes
   Read 4 bytes
   Reply with 4 bytes
   Read 4 bytes
   Reply with 4 bytes
   Read 4 bytes
   Reply with 4 bytes
   <end I2C transaction>

Note that I've not looked at MST support - I have neither the DP 1.2 spec
nor any MST branch devices, so I can't test anything I write or check it
against a spec. It looks from the code, however, as if MST has the branch
device do the split from a big request into small transactions.

 drivers/gpu/drm/drm_dp_helper.c | 91 ++++++++++++++++++++++++++++++++---------
 include/drm/drm_dp_helper.h     |  5 +++
 2 files changed, 77 insertions(+), 19 deletions(-)

Jani Nikula Jan. 28, 2015, 9:09 a.m. UTC | #1

On Tue, 27 Jan 2015, Simon Farnsworth <simon.farnsworth@onelan.co.uk> wrote:
> DisplayPort to DVI-D Dual Link adapters designed by Bizlink have bugs in
> their I2C over AUX implementation. They work fine with Windows, but fail
> with Linux.
>
> It turns out that they cannot keep an I2C transaction open unless the
> previous read was 16 bytes; shorter reads can only be followed by a zero
> byte transfer ending the I2C transaction.
>
> Copy Windows's behaviour, and read 16 bytes at a time. If we get a short
> reply, assume that there's a hardware bottleneck, and shrink our read size
> to match. For this purpose, use the algorithm in the DisplayPort 1.2 spec,
> in the hopes that it'll be closest to what Windows does, as no sink I've
> found actually gives short replies.
>
> Also provide a module parameter for testing smaller transfer sizes, in case
> there are sinks out there that cannot work with Windows.
>
> Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
> ---
>
> v3 changes, after feedback from Ville and more testing of Windows:
>
>  * Change the short reply algorithm to match Ville's description of the
>    DisplayPort 1.2 spec wording.
>
>  * Add a module parameter to set the default transfer size for
>    experiments. Requested over IRC by Ville.

IMO module parameters are ABI we shouldn't add just because we might
need them. Also see my reply in the other thread
http://mid.gmane.org/877fw7s2lh.fsf@intel.com

BR,
Jani.

>
> No-one's been able to find a device that does short replies, but experiments
> show that bigger reads are faster on most devices. Ville got:
>
>  DP->DVI (OUI 001cf8):  40ms -> 35ms
>  DP->VGA (OUI 0022b9):  45ms -> 38ms
>  Zotac DP->2xHDMI:      25ms ->  4ms
>
> v2 changes, after feedback from Thierry and Ville:
>
>  * Handle short replies. I've decided (arbitrarily) that a short reply
>    results in us dropping back to the newly chosen size for the rest of this
>    I2C transaction. Thus, given an attempt to read the first 16 bytes of
>    EDID, and a sink that only does 4 bytes of buffering, we will see the
>    following AUX transfers for the EDID read (after address is set):
>
>    <set address, block etc>
>    Read 16 bytes from I2C over AUX.
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    <end I2C transaction>
>
> Note that I've not looked at MST support - I have neither the DP 1.2 spec
> nor any MST branch devices, so I can't test anything I write or check it
> against a spec. It looks from the code, however, as if MST has the branch
> device do the split from a big request into small transactions.
>
>  drivers/gpu/drm/drm_dp_helper.c | 91 ++++++++++++++++++++++++++++++++---------
>  include/drm/drm_dp_helper.h     |  5 +++
>  2 files changed, 77 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 79968e3..3db116c 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -396,11 +396,13 @@ static u32 drm_dp_i2c_functionality(struct i2c_adapter *adapter)
>   * retrying the transaction as appropriate.  It is assumed that the
>   * aux->transfer function does not modify anything in the msg other than the
>   * reply field.
> + *
> + * Returns bytes transferred on success, or a negative error code on failure.
>   */
>  static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  {
>  	unsigned int retry;
> -	int err;
> +	int ret;
>  
>  	/*
>  	 * DP1.2 sections 2.7.7.1.5.6.1 and 2.7.7.1.6.6.1: A DP Source device
> @@ -409,14 +411,14 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  	 */
>  	for (retry = 0; retry < 7; retry++) {
>  		mutex_lock(&aux->hw_mutex);
> -		err = aux->transfer(aux, msg);
> +		ret = aux->transfer(aux, msg);
>  		mutex_unlock(&aux->hw_mutex);
> -		if (err < 0) {
> -			if (err == -EBUSY)
> +		if (ret < 0) {
> +			if (ret == -EBUSY)
>  				continue;
>  
> -			DRM_DEBUG_KMS("transaction failed: %d\n", err);
> -			return err;
> +			DRM_DEBUG_KMS("transaction failed: %d\n", ret);
> +			return ret;
>  		}
>  
>  
> @@ -457,9 +459,7 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  			 * Both native ACK and I2C ACK replies received. We
>  			 * can assume the transfer was successful.
>  			 */
> -			if (err < msg->size)
> -				return -EPROTO;
> -			return 0;
> +			return ret;
>  
>  		case DP_AUX_I2C_REPLY_NACK:
>  			DRM_DEBUG_KMS("I2C nack\n");
> @@ -482,14 +482,67 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  	return -EREMOTEIO;
>  }
>  
> +/*
> + * Keep retrying drm_dp_i2c_do_msg until all data has been transferred.
> + *
> + * Returns an error code on failure, or a recommended transfer size on success.
> + */
> +static int drm_dp_i2c_drain_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> +{
> +	int ret;
> +	struct drm_dp_aux_msg drain_msg;
> +	int drain_bytes;
> +
> +	ret = drm_dp_i2c_do_msg(aux, msg);
> +
> +	if (ret == msg->size || ret <= 0)
> +		return ret == 0 ? -EPROTO : ret;
> +
> +	/*
> +	 * We had a partial reply. Drain out the rest of the bytes we
> +	 * originally requested, then return the size of the partial
> +	 * reply. In theory, this should match DP 1.2's desired behaviour
> +	 * for I2C over AUX.
> +	 */
> +	DRM_DEBUG_KMS("Partial I2C reply: requested %zu bytes got %d bytes\n", msg->size, ret);
> +	drain_msg = *msg;
> +	drain_msg.size -= ret;
> +	drain_msg.buffer += ret;
> +	while (drain_msg.size != 0) {
> +		drain_bytes = drm_dp_i2c_do_msg(aux, &drain_msg);
> +		if (drain_bytes <= 0)
> +			return drain_bytes == 0 ? -EPROTO : drain_bytes;
> +		drain_msg.size -= drain_bytes;
> +		drain_msg.buffer += drain_bytes;
> +	}
> +	return ret;
> +}
> +
> +/*
> + * Bizlink designed DP->DVI-D Dual Link adapters require the I2C over AUX
> + * packets to be as large as possible. If not, the I2C transactions never
> + * succeed. Hence the default is maximum.
> + */
> +static int dp_aux_i2c_transfer_size __read_mostly = DP_AUX_MAX_PAYLOAD_BYTES;
> +module_param_named(dp_aux_i2c_transfer_size, dp_aux_i2c_transfer_size, int, 0600);
> +MODULE_PARM_DESC(dp_aux_i2c_transfer_size,
> +		 "Number of bytes to transfer in a single I2C over DP AUX CH message, (1-16, default 16)");
> +
>  static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>  			   int num)
>  {
>  	struct drm_dp_aux *aux = adapter->algo_data;
>  	unsigned int i, j;
> +	int transfer_size;
>  	struct drm_dp_aux_msg msg;
>  	int err = 0;
>  
> +	if (dp_aux_i2c_transfer_size < 1 || dp_aux_i2c_transfer_size > DP_AUX_MAX_PAYLOAD_BYTES) {
> +		DRM_ERROR("dp_aux_i2c_transfer_size invalid - changing from %d to %d\n",
> +			  dp_aux_i2c_transfer_size, DP_AUX_MAX_PAYLOAD_BYTES);
> +		dp_aux_i2c_transfer_size = DP_AUX_MAX_PAYLOAD_BYTES;
> +	}
> +
>  	memset(&msg, 0, sizeof(msg));
>  
>  	for (i = 0; i < num; i++) {
> @@ -507,20 +560,20 @@ static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>  		err = drm_dp_i2c_do_msg(aux, &msg);
>  		if (err < 0)
>  			break;
> -		/*
> -		 * Many hardware implementations support FIFOs larger than a
> -		 * single byte, but it has been empirically determined that
> -		 * transferring data in larger chunks can actually lead to
> -		 * decreased performance. Therefore each message is simply
> -		 * transferred byte-by-byte.
> +		/* We want each transaction to be as large as possible, but
> +		 * we'll go to smaller sizes if the hardware gives us a
> +		 * short reply.
>  		 */
> -		for (j = 0; j < msgs[i].len; j++) {
> +		transfer_size = dp_aux_i2c_transfer_size;
> +		for (j = 0; j < msgs[i].len; j += msg.size) {
>  			msg.buffer = msgs[i].buf + j;
> -			msg.size = 1;
> +			msg.size = min((unsigned)transfer_size, msgs[i].len - j);
>  
> -			err = drm_dp_i2c_do_msg(aux, &msg);
> -			if (err < 0)
> +			transfer_size = drm_dp_i2c_drain_msg(aux, &msg);
> +			if (transfer_size < 0) {
> +				err = transfer_size;
>  				break;
> +			}
>  		}
>  		if (err < 0)
>  			break;
> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> index 11f8c84..444d51b 100644
> --- a/include/drm/drm_dp_helper.h
> +++ b/include/drm/drm_dp_helper.h
> @@ -42,6 +42,8 @@
>   * 1.2 formally includes both eDP and DPI definitions.
>   */
>  
> +#define DP_AUX_MAX_PAYLOAD_BYTES	16
> +
>  #define DP_AUX_I2C_WRITE		0x0
>  #define DP_AUX_I2C_READ			0x1
>  #define DP_AUX_I2C_STATUS		0x2
> @@ -519,6 +521,9 @@ struct drm_dp_aux_msg {
>   * transactions. The drm_dp_aux_register_i2c_bus() function registers an
>   * I2C adapter that can be passed to drm_probe_ddc(). Upon removal, drivers
>   * should call drm_dp_aux_unregister_i2c_bus() to remove the I2C adapter.
> + * The I2C adapter uses long transfers by default; if a partial response is
> + * received, the adapter will drop down to the size given by the partial
> + * response for this transaction only.
>   *
>   * Note that the aux helper code assumes that the .transfer() function
>   * only modifies the reply field of the drm_dp_aux_msg structure.  The
> -- 
> 2.1.0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

Daniel Vetter Jan. 28, 2015, 10:31 a.m. UTC | #2

On Wed, Jan 28, 2015 at 11:09:21AM +0200, Jani Nikula wrote:
> On Tue, 27 Jan 2015, Simon Farnsworth <simon.farnsworth@onelan.co.uk> wrote:
> > DisplayPort to DVI-D Dual Link adapters designed by Bizlink have bugs in
> > their I2C over AUX implementation. They work fine with Windows, but fail
> > with Linux.
> >
> > It turns out that they cannot keep an I2C transaction open unless the
> > previous read was 16 bytes; shorter reads can only be followed by a zero
> > byte transfer ending the I2C transaction.
> >
> > Copy Windows's behaviour, and read 16 bytes at a time. If we get a short
> > reply, assume that there's a hardware bottleneck, and shrink our read size
> > to match. For this purpose, use the algorithm in the DisplayPort 1.2 spec,
> > in the hopes that it'll be closest to what Windows does, as no sink I've
> > found actually gives short replies.
> >
> > Also provide a module parameter for testing smaller transfer sizes, in case
> > there are sinks out there that cannot work with Windows.
> >
> > Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
> > ---
> >
> > v3 changes, after feedback from Ville and more testing of Windows:
> >
> >  * Change the short reply algorithm to match Ville's description of the
> >    DisplayPort 1.2 spec wording.
> >
> >  * Add a module parameter to set the default transfer size for
> >    experiments. Requested over IRC by Ville.
> 
> IMO module parameters are ABI we shouldn't add just because we might
> need them. Also see my reply in the other thread
> http://mid.gmane.org/877fw7s2lh.fsf@intel.com

Imo just marking it as _unsafe is good enough to make it clear that it's
for debugging only.
-Daniel

Ville Syrjälä Jan. 29, 2015, 1:30 p.m. UTC | #3

On Tue, Jan 27, 2015 at 03:43:49PM +0000, Simon Farnsworth wrote:
> DisplayPort to DVI-D Dual Link adapters designed by Bizlink have bugs in
> their I2C over AUX implementation. They work fine with Windows, but fail
> with Linux.
> 
> It turns out that they cannot keep an I2C transaction open unless the
> previous read was 16 bytes; shorter reads can only be followed by a zero
> byte transfer ending the I2C transaction.
> 
> Copy Windows's behaviour, and read 16 bytes at a time. If we get a short
> reply, assume that there's a hardware bottleneck, and shrink our read size
> to match. For this purpose, use the algorithm in the DisplayPort 1.2 spec,
> in the hopes that it'll be closest to what Windows does, as no sink I've
> found actually gives short replies.
> 
> Also provide a module parameter for testing smaller transfer sizes, in case
> there are sinks out there that cannot work with Windows.
> 
> Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
> ---
> 
> v3 changes, after feedback from Ville and more testing of Windows:
> 
>  * Change the short reply algorithm to match Ville's description of the
>    DisplayPort 1.2 spec wording.
> 
>  * Add a module parameter to set the default transfer size for
>    experiments. Requested over IRC by Ville.
> 
> No-one's been able to find a device that does short replies, but experiments
> show that bigger reads are faster on most devices. Ville got:
> 
>  DP->DVI (OUI 001cf8):  40ms -> 35ms
>  DP->VGA (OUI 0022b9):  45ms -> 38ms
>  Zotac DP->2xHDMI:      25ms ->  4ms
> 
> v2 changes, after feedback from Thierry and Ville:
> 
>  * Handle short replies. I've decided (arbitrarily) that a short reply
>    results in us dropping back to the newly chosen size for the rest of this
>    I2C transaction. Thus, given an attempt to read the first 16 bytes of
>    EDID, and a sink that only does 4 bytes of buffering, we will see the
>    following AUX transfers for the EDID read (after address is set):
> 
>    <set address, block etc>
>    Read 16 bytes from I2C over AUX.
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    <end I2C transaction>
> 
> Note that I've not looked at MST support - I have neither the DP 1.2 spec
> nor any MST branch devices, so I can't test anything I write or check it
> against a spec. It looks from the code, however, as if MST has the branch
> device do the split from a big request into small transactions.
> 
>  drivers/gpu/drm/drm_dp_helper.c | 91 ++++++++++++++++++++++++++++++++---------
>  include/drm/drm_dp_helper.h     |  5 +++
>  2 files changed, 77 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 79968e3..3db116c 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -396,11 +396,13 @@ static u32 drm_dp_i2c_functionality(struct i2c_adapter *adapter)
>   * retrying the transaction as appropriate.  It is assumed that the
>   * aux->transfer function does not modify anything in the msg other than the
>   * reply field.
> + *
> + * Returns bytes transferred on success, or a negative error code on failure.
>   */
>  static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  {
>  	unsigned int retry;
> -	int err;
> +	int ret;
>  
>  	/*
>  	 * DP1.2 sections 2.7.7.1.5.6.1 and 2.7.7.1.6.6.1: A DP Source device
> @@ -409,14 +411,14 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  	 */
>  	for (retry = 0; retry < 7; retry++) {
>  		mutex_lock(&aux->hw_mutex);
> -		err = aux->transfer(aux, msg);
> +		ret = aux->transfer(aux, msg);
>  		mutex_unlock(&aux->hw_mutex);
> -		if (err < 0) {
> -			if (err == -EBUSY)
> +		if (ret < 0) {
> +			if (ret == -EBUSY)
>  				continue;
>  
> -			DRM_DEBUG_KMS("transaction failed: %d\n", err);
> -			return err;
> +			DRM_DEBUG_KMS("transaction failed: %d\n", ret);
> +			return ret;
>  		}
>  
>  
> @@ -457,9 +459,7 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  			 * Both native ACK and I2C ACK replies received. We
>  			 * can assume the transfer was successful.
>  			 */
> -			if (err < msg->size)
> -				return -EPROTO;
> -			return 0;
> +			return ret;

The s/err/ret/ seems a bit superfluous, but OTOH it does make sense
since it's no longer just an error value. At first glance it just
confused me a bit since I wasn't expecting that many changes to this
function.

>  
>  		case DP_AUX_I2C_REPLY_NACK:
>  			DRM_DEBUG_KMS("I2C nack\n");
> @@ -482,14 +482,67 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  	return -EREMOTEIO;
>  }
>  
> +/*
> + * Keep retrying drm_dp_i2c_do_msg until all data has been transferred.
> + *
> + * Returns an error code on failure, or a recommended transfer size on success.
> + */
> +static int drm_dp_i2c_drain_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> +{
> +	int ret;
> +	struct drm_dp_aux_msg drain_msg;
> +	int drain_bytes;
> +
> +	ret = drm_dp_i2c_do_msg(aux, msg);
> +
> +	if (ret == msg->size || ret <= 0)
> +		return ret == 0 ? -EPROTO : ret;
> +
> +	/*
> +	 * We had a partial reply. Drain out the rest of the bytes we
> +	 * originally requested, then return the size of the partial
> +	 * reply. In theory, this should match DP 1.2's desired behaviour
> +	 * for I2C over AUX.
> +	 */

The spec is a bit self contradictory, but there is a section which seems
to suggest this. It's only mentioned for reads mind you, but I don't see
it causing harm for writes either. Not that we actually handle
partial/deferred writes correctly at the moment.

> +	DRM_DEBUG_KMS("Partial I2C reply: requested %zu bytes got %d bytes\n", msg->size, ret);
> +	drain_msg = *msg;
> +	drain_msg.size -= ret;
> +	drain_msg.buffer += ret;
> +	while (drain_msg.size != 0) {
> +		drain_bytes = drm_dp_i2c_do_msg(aux, &drain_msg);
> +		if (drain_bytes <= 0)
> +			return drain_bytes == 0 ? -EPROTO : drain_bytes;
> +		drain_msg.size -= drain_bytes;
> +		drain_msg.buffer += drain_bytes;
> +	}

Somehow I don't like the duplicated code end up having here. So
putting it all in a single loop would seem nicer to me. Maybe
something along these lines?

struct drm_dp_aux_msg msg = *orig_msg;
int ret = msg.size;
while (msg.size > 0) {
	int err = drm_dp_i2c_do_msg(aux, &msg);
	if (err <= 0)
		return err == 0 ? -EPROTO : err;

	if (err < msg.size && err < ret) {
		DRM_DEBUG_KMS("Partial I2C reply: requested %zu
			       bytes got %d bytes\n", msg.size, err);
		ret = err;
	}

	msg.size -= err;
	msg.buffer += err;
}

It would also reduce the returned preferred transfer size further if we
(for whatever reason) got an even shorter reply while we're draining.


> +	return ret;
> +}
> +
> +/*
> + * Bizlink designed DP->DVI-D Dual Link adapters require the I2C over AUX
> + * packets to be as large as possible. If not, the I2C transactions never
> + * succeed. Hence the default is maximum.
> + */
> +static int dp_aux_i2c_transfer_size __read_mostly = DP_AUX_MAX_PAYLOAD_BYTES;
> +module_param_named(dp_aux_i2c_transfer_size, dp_aux_i2c_transfer_size, int, 0600);
> +MODULE_PARM_DESC(dp_aux_i2c_transfer_size,
> +		 "Number of bytes to transfer in a single I2C over DP AUX CH message, (1-16, default 16)");
> +
>  static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>  			   int num)
>  {
>  	struct drm_dp_aux *aux = adapter->algo_data;
>  	unsigned int i, j;
> +	int transfer_size;
>  	struct drm_dp_aux_msg msg;
>  	int err = 0;
>  
> +	if (dp_aux_i2c_transfer_size < 1 || dp_aux_i2c_transfer_size > DP_AUX_MAX_PAYLOAD_BYTES) {
> +		DRM_ERROR("dp_aux_i2c_transfer_size invalid - changing from %d to %d\n",
> +			  dp_aux_i2c_transfer_size, DP_AUX_MAX_PAYLOAD_BYTES);
> +		dp_aux_i2c_transfer_size = DP_AUX_MAX_PAYLOAD_BYTES;
> +	}

The invalid values could be rejected with custom struct
kernel_param_ops, but maybe that's a bit overkill. If not that, then
I'm not sure the error message really has that much value. So I'm thinking
we could just 'clamp(..., 1, DP_AUX_MAX_PAYLOAD_BYTES)' here.

> +
>  	memset(&msg, 0, sizeof(msg));
>  
>  	for (i = 0; i < num; i++) {
> @@ -507,20 +560,20 @@ static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>  		err = drm_dp_i2c_do_msg(aux, &msg);
>  		if (err < 0)
>  			break;
> -		/*
> -		 * Many hardware implementations support FIFOs larger than a
> -		 * single byte, but it has been empirically determined that
> -		 * transferring data in larger chunks can actually lead to
> -		 * decreased performance. Therefore each message is simply
> -		 * transferred byte-by-byte.
> +		/* We want each transaction to be as large as possible, but
> +		 * we'll go to smaller sizes if the hardware gives us a
> +		 * short reply.
>  		 */
> -		for (j = 0; j < msgs[i].len; j++) {
> +		transfer_size = dp_aux_i2c_transfer_size;
> +		for (j = 0; j < msgs[i].len; j += msg.size) {
>  			msg.buffer = msgs[i].buf + j;
> -			msg.size = 1;
> +			msg.size = min((unsigned)transfer_size, msgs[i].len - j);

Could make transfer_size unsigned in the first place.

>  
> -			err = drm_dp_i2c_do_msg(aux, &msg);
> -			if (err < 0)
> +			transfer_size = drm_dp_i2c_drain_msg(aux, &msg);
> +			if (transfer_size < 0) {
> +				err = transfer_size;
>  				break;
> +			}

Maybe this is a bit more straight forward?

err = drm_dp_i2c_drain_msg(aux, &msg);
if (err < 0)
	break;
transfer_size = err;

At least the diff would be smaller, which is always good when reviewing
stuff.


>  		}
>  		if (err < 0)
>  			break;
> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> index 11f8c84..444d51b 100644
> --- a/include/drm/drm_dp_helper.h
> +++ b/include/drm/drm_dp_helper.h
> @@ -42,6 +42,8 @@
>   * 1.2 formally includes both eDP and DPI definitions.
>   */
>  
> +#define DP_AUX_MAX_PAYLOAD_BYTES	16
> +
>  #define DP_AUX_I2C_WRITE		0x0
>  #define DP_AUX_I2C_READ			0x1
>  #define DP_AUX_I2C_STATUS		0x2
> @@ -519,6 +521,9 @@ struct drm_dp_aux_msg {
>   * transactions. The drm_dp_aux_register_i2c_bus() function registers an
>   * I2C adapter that can be passed to drm_probe_ddc(). Upon removal, drivers
>   * should call drm_dp_aux_unregister_i2c_bus() to remove the I2C adapter.
> + * The I2C adapter uses long transfers by default; if a partial response is
> + * received, the adapter will drop down to the size given by the partial
> + * response for this transaction only.
>   *
>   * Note that the aux helper code assumes that the .transfer() function
>   * only modifies the reply field of the drm_dp_aux_msg structure.  The
> -- 
> 2.1.0

Simon Farnsworth Jan. 29, 2015, 2:24 p.m. UTC | #4

On Thursday 29 January 2015 15:30:36 you wrote:
> On Tue, Jan 27, 2015 at 03:43:49PM +0000, Simon Farnsworth wrote:
--snip--
> > diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> > index 79968e3..3db116c 100644
> > --- a/drivers/gpu/drm/drm_dp_helper.c
> > +++ b/drivers/gpu/drm/drm_dp_helper.c
> > @@ -396,11 +396,13 @@ static u32 drm_dp_i2c_functionality(struct i2c_adapter *adapter)
> >   * retrying the transaction as appropriate.  It is assumed that the
> >   * aux->transfer function does not modify anything in the msg other than the
> >   * reply field.
> > + *
> > + * Returns bytes transferred on success, or a negative error code on failure.
> >   */
> >  static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> >  {
> >  	unsigned int retry;
> > -	int err;
> > +	int ret;
> >  
> >  	/*
> >  	 * DP1.2 sections 2.7.7.1.5.6.1 and 2.7.7.1.6.6.1: A DP Source device
> > @@ -409,14 +411,14 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> >  	 */
> >  	for (retry = 0; retry < 7; retry++) {
> >  		mutex_lock(&aux->hw_mutex);
> > -		err = aux->transfer(aux, msg);
> > +		ret = aux->transfer(aux, msg);
> >  		mutex_unlock(&aux->hw_mutex);
> > -		if (err < 0) {
> > -			if (err == -EBUSY)
> > +		if (ret < 0) {
> > +			if (ret == -EBUSY)
> >  				continue;
> >  
> > -			DRM_DEBUG_KMS("transaction failed: %d\n", err);
> > -			return err;
> > +			DRM_DEBUG_KMS("transaction failed: %d\n", ret);
> > +			return ret;
> >  		}
> >  
> >  
> > @@ -457,9 +459,7 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> >  			 * Both native ACK and I2C ACK replies received. We
> >  			 * can assume the transfer was successful.
> >  			 */
> > -			if (err < msg->size)
> > -				return -EPROTO;
> > -			return 0;
> > +			return ret;
> 
> The s/err/ret/ seems a bit superfluous, but OTOH it does make sense
> since it's no longer just an error value. At first glance it just
> confused me a bit since I wasn't expecting that many changes to this
> function.
>

I'm going to hang onto that change - it makes things clearer when you read
the function outside the context of this patch.

> >  
> >  		case DP_AUX_I2C_REPLY_NACK:
> >  			DRM_DEBUG_KMS("I2C nack\n");
> > @@ -482,14 +482,67 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> >  	return -EREMOTEIO;
> >  }
> >  
> > +/*
> > + * Keep retrying drm_dp_i2c_do_msg until all data has been transferred.
> > + *
> > + * Returns an error code on failure, or a recommended transfer size on success.
> > + */
> > +static int drm_dp_i2c_drain_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> > +{
> > +	int ret;
> > +	struct drm_dp_aux_msg drain_msg;
> > +	int drain_bytes;
> > +
> > +	ret = drm_dp_i2c_do_msg(aux, msg);
> > +
> > +	if (ret == msg->size || ret <= 0)
> > +		return ret == 0 ? -EPROTO : ret;
> > +
> > +	/*
> > +	 * We had a partial reply. Drain out the rest of the bytes we
> > +	 * originally requested, then return the size of the partial
> > +	 * reply. In theory, this should match DP 1.2's desired behaviour
> > +	 * for I2C over AUX.
> > +	 */
> 
> The spec is a bit self contradictory, but there is a section which seems
> to suggest this. It's only mentioned for reads mind you, but I don't see
> it causing harm for writes either. Not that we actually handle
> partial/deferred writes correctly at the moment.
> 
> > +	DRM_DEBUG_KMS("Partial I2C reply: requested %zu bytes got %d bytes\n", msg->size, ret);
> > +	drain_msg = *msg;
> > +	drain_msg.size -= ret;
> > +	drain_msg.buffer += ret;
> > +	while (drain_msg.size != 0) {
> > +		drain_bytes = drm_dp_i2c_do_msg(aux, &drain_msg);
> > +		if (drain_bytes <= 0)
> > +			return drain_bytes == 0 ? -EPROTO : drain_bytes;
> > +		drain_msg.size -= drain_bytes;
> > +		drain_msg.buffer += drain_bytes;
> > +	}
> 
> Somehow I don't like the duplicated code end up having here. So
> putting it all in a single loop would seem nicer to me. Maybe
> something along these lines?
> 
> struct drm_dp_aux_msg msg = *orig_msg;
> int ret = msg.size;
> while (msg.size > 0) {
> 	int err = drm_dp_i2c_do_msg(aux, &msg);
> 	if (err <= 0)
> 		return err == 0 ? -EPROTO : err;
> 
> 	if (err < msg.size && err < ret) {
> 		DRM_DEBUG_KMS("Partial I2C reply: requested %zu
> 			       bytes got %d bytes\n", msg.size, err);
> 		ret = err;
> 	}
> 
> 	msg.size -= err;
> 	msg.buffer += err;
> }
> 
> It would also reduce the returned preferred transfer size further if we
> (for whatever reason) got an even shorter reply while we're draining.
>
I'm not sure that that's the right behaviour, though. If we assume a 3 byte
FIFO in a device that does partial reads, we ask for 16 bytes, causing a
partial response that's 3 bytes long. We then drain out the remaining 13
bytes of the initial request (in case it's set up a 16 byte I2C transfer),
and the last of the reads is guaranteed to be 1 byte long.

We then shrink to 1 byte transfers, when the device would be capable of 3
byte transfers, and could possibly perform better with 3 byte transfers
rather than 1.

Having said that, this is all hypothetical until we find devices that do
partial replies - no-one's been able to find such a device so far.

I'll have a think and see if I can come up with a way to get the behaviour I
want with less code duplication - I might be able to do something by using a
sentinel value to spot first time round the loop.

> 
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Bizlink designed DP->DVI-D Dual Link adapters require the I2C over AUX
> > + * packets to be as large as possible. If not, the I2C transactions never
> > + * succeed. Hence the default is maximum.
> > + */
> > +static int dp_aux_i2c_transfer_size __read_mostly = DP_AUX_MAX_PAYLOAD_BYTES;
> > +module_param_named(dp_aux_i2c_transfer_size, dp_aux_i2c_transfer_size, int, 0600);
> > +MODULE_PARM_DESC(dp_aux_i2c_transfer_size,
> > +		 "Number of bytes to transfer in a single I2C over DP AUX CH message, (1-16, default 16)");
> > +
> >  static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
> >  			   int num)
> >  {
> >  	struct drm_dp_aux *aux = adapter->algo_data;
> >  	unsigned int i, j;
> > +	int transfer_size;
> >  	struct drm_dp_aux_msg msg;
> >  	int err = 0;
> >  
> > +	if (dp_aux_i2c_transfer_size < 1 || dp_aux_i2c_transfer_size > DP_AUX_MAX_PAYLOAD_BYTES) {
> > +		DRM_ERROR("dp_aux_i2c_transfer_size invalid - changing from %d to %d\n",
> > +			  dp_aux_i2c_transfer_size, DP_AUX_MAX_PAYLOAD_BYTES);
> > +		dp_aux_i2c_transfer_size = DP_AUX_MAX_PAYLOAD_BYTES;
> > +	}
> 
> The invalid values could be rejected with custom struct
> kernel_param_ops, but maybe that's a bit overkill. If not that, then
> I'm not sure the error message really has that much value. So I'm thinking
> we could just 'clamp(..., 1, DP_AUX_MAX_PAYLOAD_BYTES)' here.
>
I'll remove the message for v4, as well as marking the parameter unsafe with
module_param_named_unsafe, and use clamp instead of if().

> > +
> >  	memset(&msg, 0, sizeof(msg));
> >  
> >  	for (i = 0; i < num; i++) {
> > @@ -507,20 +560,20 @@ static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
> >  		err = drm_dp_i2c_do_msg(aux, &msg);
> >  		if (err < 0)
> >  			break;
> > -		/*
> > -		 * Many hardware implementations support FIFOs larger than a
> > -		 * single byte, but it has been empirically determined that
> > -		 * transferring data in larger chunks can actually lead to
> > -		 * decreased performance. Therefore each message is simply
> > -		 * transferred byte-by-byte.
> > +		/* We want each transaction to be as large as possible, but
> > +		 * we'll go to smaller sizes if the hardware gives us a
> > +		 * short reply.
> >  		 */
> > -		for (j = 0; j < msgs[i].len; j++) {
> > +		transfer_size = dp_aux_i2c_transfer_size;
> > +		for (j = 0; j < msgs[i].len; j += msg.size) {
> >  			msg.buffer = msgs[i].buf + j;
> > -			msg.size = 1;
> > +			msg.size = min((unsigned)transfer_size, msgs[i].len - j);
> 
> Could make transfer_size unsigned in the first place.
> 
> >  
> > -			err = drm_dp_i2c_do_msg(aux, &msg);
> > -			if (err < 0)
> > +			transfer_size = drm_dp_i2c_drain_msg(aux, &msg);
> > +			if (transfer_size < 0) {
> > +				err = transfer_size;
> >  				break;
> > +			}
> 
> Maybe this is a bit more straight forward?
> 
> err = drm_dp_i2c_drain_msg(aux, &msg);
> if (err < 0)
> 	break;
> transfer_size = err;
>
That and making transfer_size unsigned seems like a good combination. Will
adopt for v4.

Ville Syrjälä Jan. 29, 2015, 2:36 p.m. UTC | #5

On Thu, Jan 29, 2015 at 02:24:09PM +0000, Simon Farnsworth wrote:
> On Thursday 29 January 2015 15:30:36 you wrote:
> > On Tue, Jan 27, 2015 at 03:43:49PM +0000, Simon Farnsworth wrote:
> --snip--
> > > diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> > > index 79968e3..3db116c 100644
> > > --- a/drivers/gpu/drm/drm_dp_helper.c
> > > +++ b/drivers/gpu/drm/drm_dp_helper.c
> > > @@ -396,11 +396,13 @@ static u32 drm_dp_i2c_functionality(struct i2c_adapter *adapter)
> > >   * retrying the transaction as appropriate.  It is assumed that the
> > >   * aux->transfer function does not modify anything in the msg other than the
> > >   * reply field.
> > > + *
> > > + * Returns bytes transferred on success, or a negative error code on failure.
> > >   */
> > >  static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> > >  {
> > >  	unsigned int retry;
> > > -	int err;
> > > +	int ret;
> > >  
> > >  	/*
> > >  	 * DP1.2 sections 2.7.7.1.5.6.1 and 2.7.7.1.6.6.1: A DP Source device
> > > @@ -409,14 +411,14 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> > >  	 */
> > >  	for (retry = 0; retry < 7; retry++) {
> > >  		mutex_lock(&aux->hw_mutex);
> > > -		err = aux->transfer(aux, msg);
> > > +		ret = aux->transfer(aux, msg);
> > >  		mutex_unlock(&aux->hw_mutex);
> > > -		if (err < 0) {
> > > -			if (err == -EBUSY)
> > > +		if (ret < 0) {
> > > +			if (ret == -EBUSY)
> > >  				continue;
> > >  
> > > -			DRM_DEBUG_KMS("transaction failed: %d\n", err);
> > > -			return err;
> > > +			DRM_DEBUG_KMS("transaction failed: %d\n", ret);
> > > +			return ret;
> > >  		}
> > >  
> > >  
> > > @@ -457,9 +459,7 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> > >  			 * Both native ACK and I2C ACK replies received. We
> > >  			 * can assume the transfer was successful.
> > >  			 */
> > > -			if (err < msg->size)
> > > -				return -EPROTO;
> > > -			return 0;
> > > +			return ret;
> > 
> > The s/err/ret/ seems a bit superfluous, but OTOH it does make sense
> > since it's no longer just an error value. At first glance it just
> > confused me a bit since I wasn't expecting that many changes to this
> > function.
> >
> 
> I'm going to hang onto that change - it makes things clearer when you read
> the function outside the context of this patch.
> 
> > >  
> > >  		case DP_AUX_I2C_REPLY_NACK:
> > >  			DRM_DEBUG_KMS("I2C nack\n");
> > > @@ -482,14 +482,67 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> > >  	return -EREMOTEIO;
> > >  }
> > >  
> > > +/*
> > > + * Keep retrying drm_dp_i2c_do_msg until all data has been transferred.
> > > + *
> > > + * Returns an error code on failure, or a recommended transfer size on success.
> > > + */
> > > +static int drm_dp_i2c_drain_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> > > +{
> > > +	int ret;
> > > +	struct drm_dp_aux_msg drain_msg;
> > > +	int drain_bytes;
> > > +
> > > +	ret = drm_dp_i2c_do_msg(aux, msg);
> > > +
> > > +	if (ret == msg->size || ret <= 0)
> > > +		return ret == 0 ? -EPROTO : ret;
> > > +
> > > +	/*
> > > +	 * We had a partial reply. Drain out the rest of the bytes we
> > > +	 * originally requested, then return the size of the partial
> > > +	 * reply. In theory, this should match DP 1.2's desired behaviour
> > > +	 * for I2C over AUX.
> > > +	 */
> > 
> > The spec is a bit self contradictory, but there is a section which seems
> > to suggest this. It's only mentioned for reads mind you, but I don't see
> > it causing harm for writes either. Not that we actually handle
> > partial/deferred writes correctly at the moment.
> > 
> > > +	DRM_DEBUG_KMS("Partial I2C reply: requested %zu bytes got %d bytes\n", msg->size, ret);
> > > +	drain_msg = *msg;
> > > +	drain_msg.size -= ret;
> > > +	drain_msg.buffer += ret;
> > > +	while (drain_msg.size != 0) {
> > > +		drain_bytes = drm_dp_i2c_do_msg(aux, &drain_msg);
> > > +		if (drain_bytes <= 0)
> > > +			return drain_bytes == 0 ? -EPROTO : drain_bytes;
> > > +		drain_msg.size -= drain_bytes;
> > > +		drain_msg.buffer += drain_bytes;
> > > +	}
> > 
> > Somehow I don't like the duplicated code end up having here. So
> > putting it all in a single loop would seem nicer to me. Maybe
> > something along these lines?
> > 
> > struct drm_dp_aux_msg msg = *orig_msg;
> > int ret = msg.size;
> > while (msg.size > 0) {
> > 	int err = drm_dp_i2c_do_msg(aux, &msg);
> > 	if (err <= 0)
> > 		return err == 0 ? -EPROTO : err;
> > 
> > 	if (err < msg.size && err < ret) {
> > 		DRM_DEBUG_KMS("Partial I2C reply: requested %zu
> > 			       bytes got %d bytes\n", msg.size, err);
> > 		ret = err;
> > 	}
> > 
> > 	msg.size -= err;
> > 	msg.buffer += err;
> > }
> > 
> > It would also reduce the returned preferred transfer size further if we
> > (for whatever reason) got an even shorter reply while we're draining.
> >
> I'm not sure that that's the right behaviour, though. If we assume a 3 byte
> FIFO in a device that does partial reads, we ask for 16 bytes, causing a
> partial response that's 3 bytes long. We then drain out the remaining 13
> bytes of the initial request (in case it's set up a 16 byte I2C transfer),
> and the last of the reads is guaranteed to be 1 byte long.

My proposed code wouldn't reduce the transfer size in that case due to
the err<msg.size check. So it only considers shrinking the transfer size
when the reply came back with less data than was requested.

> 
> We then shrink to 1 byte transfers, when the device would be capable of 3
> byte transfers, and could possibly perform better with 3 byte transfers
> rather than 1.
> 
> Having said that, this is all hypothetical until we find devices that do
> partial replies - no-one's been able to find such a device so far.
> 
> I'll have a think and see if I can come up with a way to get the behaviour I
> want with less code duplication - I might be able to do something by using a
> sentinel value to spot first time round the loop.
> 
> > 
> > > +	return ret;
> > > +}
> > > +
> > > +/*
> > > + * Bizlink designed DP->DVI-D Dual Link adapters require the I2C over AUX
> > > + * packets to be as large as possible. If not, the I2C transactions never
> > > + * succeed. Hence the default is maximum.
> > > + */
> > > +static int dp_aux_i2c_transfer_size __read_mostly = DP_AUX_MAX_PAYLOAD_BYTES;
> > > +module_param_named(dp_aux_i2c_transfer_size, dp_aux_i2c_transfer_size, int, 0600);
> > > +MODULE_PARM_DESC(dp_aux_i2c_transfer_size,
> > > +		 "Number of bytes to transfer in a single I2C over DP AUX CH message, (1-16, default 16)");
> > > +
> > >  static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
> > >  			   int num)
> > >  {
> > >  	struct drm_dp_aux *aux = adapter->algo_data;
> > >  	unsigned int i, j;
> > > +	int transfer_size;
> > >  	struct drm_dp_aux_msg msg;
> > >  	int err = 0;
> > >  
> > > +	if (dp_aux_i2c_transfer_size < 1 || dp_aux_i2c_transfer_size > DP_AUX_MAX_PAYLOAD_BYTES) {
> > > +		DRM_ERROR("dp_aux_i2c_transfer_size invalid - changing from %d to %d\n",
> > > +			  dp_aux_i2c_transfer_size, DP_AUX_MAX_PAYLOAD_BYTES);
> > > +		dp_aux_i2c_transfer_size = DP_AUX_MAX_PAYLOAD_BYTES;
> > > +	}
> > 
> > The invalid values could be rejected with custom struct
> > kernel_param_ops, but maybe that's a bit overkill. If not that, then
> > I'm not sure the error message really has that much value. So I'm thinking
> > we could just 'clamp(..., 1, DP_AUX_MAX_PAYLOAD_BYTES)' here.
> >
> I'll remove the message for v4, as well as marking the parameter unsafe with
> module_param_named_unsafe, and use clamp instead of if().
> 
> > > +
> > >  	memset(&msg, 0, sizeof(msg));
> > >  
> > >  	for (i = 0; i < num; i++) {
> > > @@ -507,20 +560,20 @@ static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
> > >  		err = drm_dp_i2c_do_msg(aux, &msg);
> > >  		if (err < 0)
> > >  			break;
> > > -		/*
> > > -		 * Many hardware implementations support FIFOs larger than a
> > > -		 * single byte, but it has been empirically determined that
> > > -		 * transferring data in larger chunks can actually lead to
> > > -		 * decreased performance. Therefore each message is simply
> > > -		 * transferred byte-by-byte.
> > > +		/* We want each transaction to be as large as possible, but
> > > +		 * we'll go to smaller sizes if the hardware gives us a
> > > +		 * short reply.
> > >  		 */
> > > -		for (j = 0; j < msgs[i].len; j++) {
> > > +		transfer_size = dp_aux_i2c_transfer_size;
> > > +		for (j = 0; j < msgs[i].len; j += msg.size) {
> > >  			msg.buffer = msgs[i].buf + j;
> > > -			msg.size = 1;
> > > +			msg.size = min((unsigned)transfer_size, msgs[i].len - j);
> > 
> > Could make transfer_size unsigned in the first place.
> > 
> > >  
> > > -			err = drm_dp_i2c_do_msg(aux, &msg);
> > > -			if (err < 0)
> > > +			transfer_size = drm_dp_i2c_drain_msg(aux, &msg);
> > > +			if (transfer_size < 0) {
> > > +				err = transfer_size;
> > >  				break;
> > > +			}
> > 
> > Maybe this is a bit more straight forward?
> > 
> > err = drm_dp_i2c_drain_msg(aux, &msg);
> > if (err < 0)
> > 	break;
> > transfer_size = err;
> >
> That and making transfer_size unsigned seems like a good combination. Will
> adopt for v4.
> 
> -- 
> Simon Farnsworth
> Software Engineer
> ONELAN Ltd
> http://www.onelan.com

Simon Farnsworth Jan. 29, 2015, 3:14 p.m. UTC | #6

On Thursday 29 January 2015 16:36:55 Ville Syrjälä wrote:
> On Thu, Jan 29, 2015 at 02:24:09PM +0000, Simon Farnsworth wrote:
> > On Thursday 29 January 2015 15:30:36 you wrote:
> > > On Tue, Jan 27, 2015 at 03:43:49PM +0000, Simon Farnsworth wrote:
--snip--
> > > > +	DRM_DEBUG_KMS("Partial I2C reply: requested %zu bytes got %d bytes\n", msg->size, ret);
> > > > +	drain_msg = *msg;
> > > > +	drain_msg.size -= ret;
> > > > +	drain_msg.buffer += ret;
> > > > +	while (drain_msg.size != 0) {
> > > > +		drain_bytes = drm_dp_i2c_do_msg(aux, &drain_msg);
> > > > +		if (drain_bytes <= 0)
> > > > +			return drain_bytes == 0 ? -EPROTO : drain_bytes;
> > > > +		drain_msg.size -= drain_bytes;
> > > > +		drain_msg.buffer += drain_bytes;
> > > > +	}
> > > 
> > > Somehow I don't like the duplicated code end up having here. So
> > > putting it all in a single loop would seem nicer to me. Maybe
> > > something along these lines?
> > > 
> > > struct drm_dp_aux_msg msg = *orig_msg;
> > > int ret = msg.size;
> > > while (msg.size > 0) {
> > > 	int err = drm_dp_i2c_do_msg(aux, &msg);
> > > 	if (err <= 0)
> > > 		return err == 0 ? -EPROTO : err;
> > > 
> > > 	if (err < msg.size && err < ret) {
> > > 		DRM_DEBUG_KMS("Partial I2C reply: requested %zu
> > > 			       bytes got %d bytes\n", msg.size, err);
> > > 		ret = err;
> > > 	}
> > > 
> > > 	msg.size -= err;
> > > 	msg.buffer += err;
> > > }
> > > 
> > > It would also reduce the returned preferred transfer size further if we
> > > (for whatever reason) got an even shorter reply while we're draining.
> > >
> > I'm not sure that that's the right behaviour, though. If we assume a 3 byte
> > FIFO in a device that does partial reads, we ask for 16 bytes, causing a
> > partial response that's 3 bytes long. We then drain out the remaining 13
> > bytes of the initial request (in case it's set up a 16 byte I2C transfer),
> > and the last of the reads is guaranteed to be 1 byte long.
> 
> My proposed code wouldn't reduce the transfer size in that case due to
> the err<msg.size check. So it only considers shrinking the transfer size
> when the reply came back with less data than was requested.
>
I see, yes. So the only time it'd reduce further is if the hypothetical
device gave a 3 byte response to the 16 byte transfer, then 2 byte responses
to the remaining requests that try to drain the FIFO.

Given that this is now in the realms of "if hardware designers took pleasure
in making their hardware horrible", rather than just "HW designer takes a
short cut and lets software handle the pain", I'll take your code for v4.
> > 
> > We then shrink to 1 byte transfers, when the device would be capable of 3
> > byte transfers, and could possibly perform better with 3 byte transfers
> > rather than 1.
> > 
> > Having said that, this is all hypothetical until we find devices that do
> > partial replies - no-one's been able to find such a device so far.
> >

Jani Nikula Jan. 30, 2015, 6:11 a.m. UTC | #7

On Tue, 27 Jan 2015, Simon Farnsworth <simon.farnsworth@onelan.co.uk> wrote:
> DisplayPort to DVI-D Dual Link adapters designed by Bizlink have bugs in
> their I2C over AUX implementation. They work fine with Windows, but fail
> with Linux.
>
> It turns out that they cannot keep an I2C transaction open unless the
> previous read was 16 bytes; shorter reads can only be followed by a zero
> byte transfer ending the I2C transaction.
>
> Copy Windows's behaviour, and read 16 bytes at a time. If we get a short
> reply, assume that there's a hardware bottleneck, and shrink our read size
> to match. For this purpose, use the algorithm in the DisplayPort 1.2 spec,
> in the hopes that it'll be closest to what Windows does, as no sink I've
> found actually gives short replies.
>
> Also provide a module parameter for testing smaller transfer sizes, in case
> there are sinks out there that cannot work with Windows.
>
> Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>

Simon -

I've been requesting testing of your patch on various related bugs, and
we have a winner. And you probably have a new friend over there too! ;)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55228
Tested-by: Aidan Marks <aidanamarks@gmail.com>

Please add those to the next version. And thanks for pursuing this, good
job!

BR,
Jani.


> ---
>
> v3 changes, after feedback from Ville and more testing of Windows:
>
>  * Change the short reply algorithm to match Ville's description of the
>    DisplayPort 1.2 spec wording.
>
>  * Add a module parameter to set the default transfer size for
>    experiments. Requested over IRC by Ville.
>
> No-one's been able to find a device that does short replies, but experiments
> show that bigger reads are faster on most devices. Ville got:
>
>  DP->DVI (OUI 001cf8):  40ms -> 35ms
>  DP->VGA (OUI 0022b9):  45ms -> 38ms
>  Zotac DP->2xHDMI:      25ms ->  4ms
>
> v2 changes, after feedback from Thierry and Ville:
>
>  * Handle short replies. I've decided (arbitrarily) that a short reply
>    results in us dropping back to the newly chosen size for the rest of this
>    I2C transaction. Thus, given an attempt to read the first 16 bytes of
>    EDID, and a sink that only does 4 bytes of buffering, we will see the
>    following AUX transfers for the EDID read (after address is set):
>
>    <set address, block etc>
>    Read 16 bytes from I2C over AUX.
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    Read 4 bytes
>    Reply with 4 bytes
>    <end I2C transaction>
>
> Note that I've not looked at MST support - I have neither the DP 1.2 spec
> nor any MST branch devices, so I can't test anything I write or check it
> against a spec. It looks from the code, however, as if MST has the branch
> device do the split from a big request into small transactions.
>
>  drivers/gpu/drm/drm_dp_helper.c | 91 ++++++++++++++++++++++++++++++++---------
>  include/drm/drm_dp_helper.h     |  5 +++
>  2 files changed, 77 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 79968e3..3db116c 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -396,11 +396,13 @@ static u32 drm_dp_i2c_functionality(struct i2c_adapter *adapter)
>   * retrying the transaction as appropriate.  It is assumed that the
>   * aux->transfer function does not modify anything in the msg other than the
>   * reply field.
> + *
> + * Returns bytes transferred on success, or a negative error code on failure.
>   */
>  static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  {
>  	unsigned int retry;
> -	int err;
> +	int ret;
>  
>  	/*
>  	 * DP1.2 sections 2.7.7.1.5.6.1 and 2.7.7.1.6.6.1: A DP Source device
> @@ -409,14 +411,14 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  	 */
>  	for (retry = 0; retry < 7; retry++) {
>  		mutex_lock(&aux->hw_mutex);
> -		err = aux->transfer(aux, msg);
> +		ret = aux->transfer(aux, msg);
>  		mutex_unlock(&aux->hw_mutex);
> -		if (err < 0) {
> -			if (err == -EBUSY)
> +		if (ret < 0) {
> +			if (ret == -EBUSY)
>  				continue;
>  
> -			DRM_DEBUG_KMS("transaction failed: %d\n", err);
> -			return err;
> +			DRM_DEBUG_KMS("transaction failed: %d\n", ret);
> +			return ret;
>  		}
>  
>  
> @@ -457,9 +459,7 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  			 * Both native ACK and I2C ACK replies received. We
>  			 * can assume the transfer was successful.
>  			 */
> -			if (err < msg->size)
> -				return -EPROTO;
> -			return 0;
> +			return ret;
>  
>  		case DP_AUX_I2C_REPLY_NACK:
>  			DRM_DEBUG_KMS("I2C nack\n");
> @@ -482,14 +482,67 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  	return -EREMOTEIO;
>  }
>  
> +/*
> + * Keep retrying drm_dp_i2c_do_msg until all data has been transferred.
> + *
> + * Returns an error code on failure, or a recommended transfer size on success.
> + */
> +static int drm_dp_i2c_drain_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
> +{
> +	int ret;
> +	struct drm_dp_aux_msg drain_msg;
> +	int drain_bytes;
> +
> +	ret = drm_dp_i2c_do_msg(aux, msg);
> +
> +	if (ret == msg->size || ret <= 0)
> +		return ret == 0 ? -EPROTO : ret;
> +
> +	/*
> +	 * We had a partial reply. Drain out the rest of the bytes we
> +	 * originally requested, then return the size of the partial
> +	 * reply. In theory, this should match DP 1.2's desired behaviour
> +	 * for I2C over AUX.
> +	 */
> +	DRM_DEBUG_KMS("Partial I2C reply: requested %zu bytes got %d bytes\n", msg->size, ret);
> +	drain_msg = *msg;
> +	drain_msg.size -= ret;
> +	drain_msg.buffer += ret;
> +	while (drain_msg.size != 0) {
> +		drain_bytes = drm_dp_i2c_do_msg(aux, &drain_msg);
> +		if (drain_bytes <= 0)
> +			return drain_bytes == 0 ? -EPROTO : drain_bytes;
> +		drain_msg.size -= drain_bytes;
> +		drain_msg.buffer += drain_bytes;
> +	}
> +	return ret;
> +}
> +
> +/*
> + * Bizlink designed DP->DVI-D Dual Link adapters require the I2C over AUX
> + * packets to be as large as possible. If not, the I2C transactions never
> + * succeed. Hence the default is maximum.
> + */
> +static int dp_aux_i2c_transfer_size __read_mostly = DP_AUX_MAX_PAYLOAD_BYTES;
> +module_param_named(dp_aux_i2c_transfer_size, dp_aux_i2c_transfer_size, int, 0600);
> +MODULE_PARM_DESC(dp_aux_i2c_transfer_size,
> +		 "Number of bytes to transfer in a single I2C over DP AUX CH message, (1-16, default 16)");
> +
>  static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>  			   int num)
>  {
>  	struct drm_dp_aux *aux = adapter->algo_data;
>  	unsigned int i, j;
> +	int transfer_size;
>  	struct drm_dp_aux_msg msg;
>  	int err = 0;
>  
> +	if (dp_aux_i2c_transfer_size < 1 || dp_aux_i2c_transfer_size > DP_AUX_MAX_PAYLOAD_BYTES) {
> +		DRM_ERROR("dp_aux_i2c_transfer_size invalid - changing from %d to %d\n",
> +			  dp_aux_i2c_transfer_size, DP_AUX_MAX_PAYLOAD_BYTES);
> +		dp_aux_i2c_transfer_size = DP_AUX_MAX_PAYLOAD_BYTES;
> +	}
> +
>  	memset(&msg, 0, sizeof(msg));
>  
>  	for (i = 0; i < num; i++) {
> @@ -507,20 +560,20 @@ static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>  		err = drm_dp_i2c_do_msg(aux, &msg);
>  		if (err < 0)
>  			break;
> -		/*
> -		 * Many hardware implementations support FIFOs larger than a
> -		 * single byte, but it has been empirically determined that
> -		 * transferring data in larger chunks can actually lead to
> -		 * decreased performance. Therefore each message is simply
> -		 * transferred byte-by-byte.
> +		/* We want each transaction to be as large as possible, but
> +		 * we'll go to smaller sizes if the hardware gives us a
> +		 * short reply.
>  		 */
> -		for (j = 0; j < msgs[i].len; j++) {
> +		transfer_size = dp_aux_i2c_transfer_size;
> +		for (j = 0; j < msgs[i].len; j += msg.size) {
>  			msg.buffer = msgs[i].buf + j;
> -			msg.size = 1;
> +			msg.size = min((unsigned)transfer_size, msgs[i].len - j);
>  
> -			err = drm_dp_i2c_do_msg(aux, &msg);
> -			if (err < 0)
> +			transfer_size = drm_dp_i2c_drain_msg(aux, &msg);
> +			if (transfer_size < 0) {
> +				err = transfer_size;
>  				break;
> +			}
>  		}
>  		if (err < 0)
>  			break;
> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> index 11f8c84..444d51b 100644
> --- a/include/drm/drm_dp_helper.h
> +++ b/include/drm/drm_dp_helper.h
> @@ -42,6 +42,8 @@
>   * 1.2 formally includes both eDP and DPI definitions.
>   */
>  
> +#define DP_AUX_MAX_PAYLOAD_BYTES	16
> +
>  #define DP_AUX_I2C_WRITE		0x0
>  #define DP_AUX_I2C_READ			0x1
>  #define DP_AUX_I2C_STATUS		0x2
> @@ -519,6 +521,9 @@ struct drm_dp_aux_msg {
>   * transactions. The drm_dp_aux_register_i2c_bus() function registers an
>   * I2C adapter that can be passed to drm_probe_ddc(). Upon removal, drivers
>   * should call drm_dp_aux_unregister_i2c_bus() to remove the I2C adapter.
> + * The I2C adapter uses long transfers by default; if a partial response is
> + * received, the adapter will drop down to the size given by the partial
> + * response for this transaction only.
>   *
>   * Note that the aux helper code assumes that the .transfer() function
>   * only modifies the reply field of the drm_dp_aux_msg structure.  The
> -- 
> 2.1.0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

Ville Syrjälä Jan. 30, 2015, 6:45 p.m. UTC | #8

On Tue, Jan 27, 2015 at 03:43:49PM +0000, Simon Farnsworth wrote:
> DisplayPort to DVI-D Dual Link adapters designed by Bizlink have bugs in
> their I2C over AUX implementation. They work fine with Windows, but fail
> with Linux.
> 
> It turns out that they cannot keep an I2C transaction open unless the
> previous read was 16 bytes; shorter reads can only be followed by a zero
> byte transfer ending the I2C transaction.
> 
> Copy Windows's behaviour, and read 16 bytes at a time. If we get a short
> reply, assume that there's a hardware bottleneck, and shrink our read size
> to match. For this purpose, use the algorithm in the DisplayPort 1.2 spec,
> in the hopes that it'll be closest to what Windows does, as no sink I've
> found actually gives short replies.
> 
> Also provide a module parameter for testing smaller transfer sizes, in case
> there are sinks out there that cannot work with Windows.
> 
> Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
> ---
> 
> v3 changes, after feedback from Ville and more testing of Windows:
> 
>  * Change the short reply algorithm to match Ville's description of the
>    DisplayPort 1.2 spec wording.
> 
>  * Add a module parameter to set the default transfer size for
>    experiments. Requested over IRC by Ville.
> 
> No-one's been able to find a device that does short replies, but experiments
> show that bigger reads are faster on most devices. Ville got:
> 
>  DP->DVI (OUI 001cf8):  40ms -> 35ms
>  DP->VGA (OUI 0022b9):  45ms -> 38ms
>  Zotac DP->2xHDMI:      25ms ->  4ms

Another datapoint:
Asus PB278 monitor: 22ms -> 3ms

I've been pondering about these massive improvements for the Zotac and
Asus. After reading the DP spec a bit more I've noticed just how wasteful
AUX is. If my understanding of the spec and arithmetic aren't totally off,
the effective bandwidth for EDID reads is ~400kbps when using 16 byte AUX
messages, but it drops to only ~60kbps when doing 1 byte AUX messages.
Those seem to match reasonably well with my measurements. I never realized
just how slow it can be since the 1Mbps number feels like plenty, and no
one ever seems to mention the effective bandwidth.

Simon Farnsworth Feb. 2, 2015, 11:16 a.m. UTC | #9

On Friday 30 January 2015 20:45:28 Ville Syrjälä wrote:
> On Tue, Jan 27, 2015 at 03:43:49PM +0000, Simon Farnsworth wrote:
> > DisplayPort to DVI-D Dual Link adapters designed by Bizlink have bugs in
> > their I2C over AUX implementation. They work fine with Windows, but fail
> > with Linux.
> > 
> > It turns out that they cannot keep an I2C transaction open unless the
> > previous read was 16 bytes; shorter reads can only be followed by a zero
> > byte transfer ending the I2C transaction.
> > 
> > Copy Windows's behaviour, and read 16 bytes at a time. If we get a short
> > reply, assume that there's a hardware bottleneck, and shrink our read size
> > to match. For this purpose, use the algorithm in the DisplayPort 1.2 spec,
> > in the hopes that it'll be closest to what Windows does, as no sink I've
> > found actually gives short replies.
> > 
> > Also provide a module parameter for testing smaller transfer sizes, in case
> > there are sinks out there that cannot work with Windows.
> > 
> > Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
> > ---
> > 
> > v3 changes, after feedback from Ville and more testing of Windows:
> > 
> >  * Change the short reply algorithm to match Ville's description of the
> >    DisplayPort 1.2 spec wording.
> > 
> >  * Add a module parameter to set the default transfer size for
> >    experiments. Requested over IRC by Ville.
> > 
> > No-one's been able to find a device that does short replies, but experiments
> > show that bigger reads are faster on most devices. Ville got:
> > 
> >  DP->DVI (OUI 001cf8):  40ms -> 35ms
> >  DP->VGA (OUI 0022b9):  45ms -> 38ms
> >  Zotac DP->2xHDMI:      25ms ->  4ms
> 
> Another datapoint:
> Asus PB278 monitor: 22ms -> 3ms
> 
> I've been pondering about these massive improvements for the Zotac and
> Asus. After reading the DP spec a bit more I've noticed just how wasteful
> AUX is. If my understanding of the spec and arithmetic aren't totally off,
> the effective bandwidth for EDID reads is ~400kbps when using 16 byte AUX
> messages, but it drops to only ~60kbps when doing 1 byte AUX messages.
> Those seem to match reasonably well with my measurements. I never realized
> just how slow it can be since the 1Mbps number feels like plenty, and no
> one ever seems to mention the effective bandwidth.
>

Doing the maths myself, to see if you're in the right ballpark:

Each AUX message has 24 bits worth of SYNC/STOP overhead, and between 10 and
16 bits of precharge overhead.

You've got 32 bits of data in the request. In the reply, you have 8 overhead
bits, and however many bits of data you requested.

Each request therefore takes 66 to 72 bits to make. A reply has 42 to 48
bits of overhead, plus the bits you requested. This makes 108 to 120 bits of
overhead for each I2C transfer.

At one byte per request, we're using 116 to 128 bit times for every EDID
byte. This is 14.5 to 16 bit times per payload bit transferred, for a
payload bit rate on a 1 MHz channel between 62.5 kbit/s and 69.0 kbit/s.

At 16 bytes per request, we're using 236 to 248 bit times for every 16 bytes
of EDID. This is 1.84 to 1.94 bit times per payload bit transferred, for a
payload bit rate between 516.1 kbit/s and 542.3 kbit/s.

[v3] drm/dp: Use large transactions for I2C over AUX

Commit Message

Comments

Patch