diff mbox

mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem

Message ID 1348634552-21047-1-git-send-email-roland@kernel.org (mailing list archive)
State Accepted, archived
Headers show

Commit Message

Roland Dreier Sept. 26, 2012, 4:42 a.m. UTC
From: Roland Dreier <roland@purestorage.com>

On an SR-IOV master device, __mlx4_init_one() calls mlx4_init_hca()
before mlx4_multi_func_init().  However, for unlucky configurations,
mlx4_init_hca() might call mlx4_SENSE_PORT() (via mlx4_dev_cap()), and
that calls mlx4_cmd_imm() with MLX4_CMD_WRAPPED set.

However, on a multifunction device with MLX4_CMD_WRAPPED, __mlx4_cmd()
calls into mlx4_slave_cmd(), and that immediately tries to do

	down(&priv->cmd.slave_sem);

but priv->cmd.slave_sem isn't initialized until mlx4_multi_func_init()
(which we haven't called yet).  The next thing it tries to do is access
priv->mfunc.vhcr, but that hasn't been allocated yet.

Fix this by moving the initialization of slave_sem and vhcr up into
mlx4_cmd_init(). Also, since slave_sem is really just being used as a
mutex, convert it into a slave_cmd_mutex.

Signed-off-by: Roland Dreier <roland@purestorage.com>
---
Jack, I needed this to get my (CX3 w/ FW 2.11.500) adapter to work in
SR-IOV mode.  Is it possible you never tested SR-IOV on an adapter
with ports in autosensing mode?

 drivers/net/ethernet/mellanox/mlx4/cmd.c  |   51 ++++++++++++++++++-----------
 drivers/net/ethernet/mellanox/mlx4/main.c |   11 ++++---
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |    2 +-
 3 files changed, 38 insertions(+), 26 deletions(-)

Comments

Roland Dreier Sept. 26, 2012, 4:46 a.m. UTC | #1
By the way, I still get a steady stream of

mlx4_core 0000:05:00.0: Unknown command:0x4d accepted from slave:0
mlx4_core 0000:05:00.0: Sense command failed for port: 1

once I load the driver... it seems SENSE_PORT is a wrapped command
but there's no entry in cmd_info[] for it?

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roland Dreier Sept. 26, 2012, 5:01 p.m. UTC | #2
On Tue, Sep 25, 2012 at 9:46 PM, Roland Dreier <roland@kernel.org> wrote:
> By the way, I still get a steady stream of
>
> mlx4_core 0000:05:00.0: Unknown command:0x4d accepted from slave:0
> mlx4_core 0000:05:00.0: Sense command failed for port: 1
>
> once I load the driver... it seems SENSE_PORT is a wrapped command
> but there's no entry in cmd_info[] for it?

I guess Jack is on vacation until next week.  Or, anyone else who knows
anything about this stuff?

In particular I'm planning on including the patch in this thread in my pull
request with SR-IOV in it -- the patch is the difference between crashing
on driver load and mostly working on my setup.

But it seems you guys don't use SR-IOV in setups where the driver
calls SENSE_PORT?  Is something different about my test setup?

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz Sept. 26, 2012, 9:51 p.m. UTC | #3
On Wed, Sep 26, 2012 at 6:42 AM, Roland Dreier <roland@kernel.org> wrote:
> From: Roland Dreier <roland@purestorage.com>
>
> On an SR-IOV master device, __mlx4_init_one() calls mlx4_init_hca()
> before mlx4_multi_func_init().  However, for unlucky configurations,
> mlx4_init_hca() might call mlx4_SENSE_PORT() (via mlx4_dev_cap()), and
> that calls mlx4_cmd_imm() with MLX4_CMD_WRAPPED set.
>
> However, on a multifunction device with MLX4_CMD_WRAPPED, __mlx4_cmd()
> calls into mlx4_slave_cmd(), and that immediately tries to do
>
>         down(&priv->cmd.slave_sem);
>
> but priv->cmd.slave_sem isn't initialized until mlx4_multi_func_init()
> (which we haven't called yet).  The next thing it tries to do is access
> priv->mfunc.vhcr, but that hasn't been allocated yet.
>
> Fix this by moving the initialization of slave_sem and vhcr up into
> mlx4_cmd_init(). Also, since slave_sem is really just being used as a
> mutex, convert it into a slave_cmd_mutex.
>
> Signed-off-by: Roland Dreier <roland@purestorage.com>
> ---
> Jack, I needed this to get my (CX3 w/ FW 2.11.500) adapter to work in
> SR-IOV mode.  Is it possible you never tested SR-IOV on an adapter
> with ports in autosensing mode?

Roland,

What exactly did you mean by saying "for unlucky configurations"
above? what value did you use for mlx4_core's port_array_type module
param?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roland Dreier Sept. 27, 2012, 6:46 a.m. UTC | #4
On Wed, Sep 26, 2012 at 2:51 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> What exactly did you mean by saying "for unlucky configurations"
> above? what value did you use for mlx4_core's port_array_type module
> param?

I didn't set the parameter at all.  What I meant by "unlucky configs" was
all the conditions in the following code in mlx4_dev_cap():

		/*
		 * Link sensing is allowed on the port if 3 conditions are true:
		 * 1. Both protocols are supported on the port.
		 * 2. Different types are supported on the port
		 * 3. FW declared that it supports link sensing
		 */
		mlx4_priv(dev)->sense.sense_allowed[i] =
			((dev->caps.supported_type[i] == MLX4_PORT_TYPE_AUTO) &&
			 (dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP) &&
			 (dev->caps.flags & MLX4_DEV_CAP_FLAG_SENSE_SUPPORT));

		/*
		 * If "default_sense" bit is set, we move the port to "AUTO" mode
		 * and perform sense_port FW command to try and set the correct
		 * port type from beginning
		 */
		if (mlx4_priv(dev)->sense.sense_allowed[i] && dev->caps.default_sense[i]) {
			enum mlx4_port_type sensed_port = MLX4_PORT_TYPE_NONE;
			dev->caps.possible_type[i] = MLX4_PORT_TYPE_AUTO;
			mlx4_SENSE_PORT(dev, i, &sensed_port);
			if (sensed_port != MLX4_PORT_TYPE_NONE)
				dev->caps.port_type[i] = sensed_port;
		} else {

I'm beginning to think it might be that I had nothing hooked up
to my adapter initially, which seems to lead to the fatal SENSE_PORT
call here.  I notice that once I bring up the link, the stream of failing
SENSE_PORT calls (which don't ever work in SR-IOV mode
AFAICT) stops as well.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz Sept. 27, 2012, 7:57 a.m. UTC | #5
On 27/09/2012 08:46, Roland Dreier wrote:
> On Wed, Sep 26, 2012 at 2:51 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> What exactly did you mean by saying "for unlucky configurations"
>> above? what value did you use for mlx4_core's port_array_type module
>> param?
> I didn't set the parameter at all.  What I meant by "unlucky configs" was
> all the conditions in the following code in mlx4_dev_cap():
>
> 		/*
> 		 * Link sensing is allowed on the port if 3 conditions are true:
> 		 * 1. Both protocols are supported on the port.
> 		 * 2. Different types are supported on the port
> 		 * 3. FW declared that it supports link sensing
> 		 */
> 		mlx4_priv(dev)->sense.sense_allowed[i] =
> 			((dev->caps.supported_type[i] == MLX4_PORT_TYPE_AUTO) &&
> 			 (dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP) &&
> 			 (dev->caps.flags & MLX4_DEV_CAP_FLAG_SENSE_SUPPORT));
>
> 		/*
> 		 * If "default_sense" bit is set, we move the port to "AUTO" mode
> 		 * and perform sense_port FW command to try and set the correct
> 		 * port type from beginning
> 		 */
> 		if (mlx4_priv(dev)->sense.sense_allowed[i] && dev->caps.default_sense[i]) {
> 			enum mlx4_port_type sensed_port = MLX4_PORT_TYPE_NONE;
> 			dev->caps.possible_type[i] = MLX4_PORT_TYPE_AUTO;
> 			mlx4_SENSE_PORT(dev, i, &sensed_port);
> 			if (sensed_port != MLX4_PORT_TYPE_NONE)
> 				dev->caps.port_type[i] = sensed_port;
> 		} else {
>
> I'm beginning to think it might be that I had nothing hooked up
> to my adapter initially, which seems to lead to the fatal SENSE_PORT
> call here.  I notice that once I bring up the link, the stream of failing
> SENSE_PORT calls (which don't ever work in SR-IOV mode AFAICT) stops as well.
>
>

Roland,

As I wrote you, the team was off over the last two days (*) and we will 
do some more detailed checks
to understand what is missing in the code, on the architecture level 
auto-sensing for link type isn't
supported in the current code also for the PPF in SRIOV mode, so at 
first sight, the above code
snippest seems  to miss checking if we're on "native" or mfunc mode 
before allowing itself to
go into auto-sense mode, but again, let us look there deeper. What do 
you mean by "I had nothing hooked
up to my adapter initially"?

Or.

(*) BTW - unluckily we're on a HHS {High Holiday Season} now, in 
conjunction with the coming opening
of the 3.7 merge window on the weekend, which means that many people are 
mostly off next week as
well, Jack will be around from Tuesday.



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz Sept. 27, 2012, 8:22 a.m. UTC | #6
On 27/09/2012 10:17, Roland Dreier wrote:
> I think I had it cabled up directly to another HCA, and that HCA was in a
> system that was either off or at least didn't have the driver loaded.  So the
> port was in the physically DOWN state...
>
> However, I just tried it and even with that other HCA enabled (and running
> opensm), I still see
>
>      mlx4_core 0000:05:00.0: Unknown command:0x4d accepted from slave:0
>      mlx4_core 0000:05:00.0: Sense command failed for port: 1
>
> before init_hca finishes, so I think I would have crashed without my patch
> still.  Not sure I understand how you guys miss hitting this fatal call to
> SENSE_PORT during init (or what the intention for SENSE_PORT is
> since right now in SR-IOV mode it always fails due to no wrapper).
>
> I'm not doing anything funny with the port type, I have a one-port CX3
> running FW built with the exact default .ini file, except for enabling SR-IOV.
> And the only mlx4_core module parameter I'm passing in is "num_vfs=1".
>

Thanks for the further details, sounds like something/s is indeed broken 
here, in the code and/or
in the regression testing of it. We'll do our best to provide deeper 
reasoning of what's going on today,
and if not, by Tuesday when Jack is back. This way (your patch) or 
another (some other direction suggested
by the team or Jack) seems we're safe for the merge window.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz Sept. 27, 2012, 12:35 p.m. UTC | #7
On 27/09/2012 08:46, Roland Dreier wrote:
> On Wed, Sep 26, 2012 at 2:51 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> What exactly did you mean by saying "for unlucky configurations"
>> above? what value did you use for mlx4_core's port_array_type module
>> param?
> I didn't set the parameter at all.  What I meant by "unlucky configs" was
> all the conditions in the following code in mlx4_dev_cap():
>
> 		/*
> 		 * Link sensing is allowed on the port if 3 conditions are true:
> 		 * 1. Both protocols are supported on the port.
> 		 * 2. Different types are supported on the port
> 		 * 3. FW declared that it supports link sensing
> 		 */
> 		mlx4_priv(dev)->sense.sense_allowed[i] =
> 			((dev->caps.supported_type[i] == MLX4_PORT_TYPE_AUTO) &&
> 			 (dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP) &&
> 			 (dev->caps.flags & MLX4_DEV_CAP_FLAG_SENSE_SUPPORT));
>
> 		/*
> 		 * If "default_sense" bit is set, we move the port to "AUTO" mode
> 		 * and perform sense_port FW command to try and set the correct
> 		 * port type from beginning
> 		 */
> 		if (mlx4_priv(dev)->sense.sense_allowed[i] && dev->caps.default_sense[i]) {
> 			enum mlx4_port_type sensed_port = MLX4_PORT_TYPE_NONE;
> 			dev->caps.possible_type[i] = MLX4_PORT_TYPE_AUTO;
> 			mlx4_SENSE_PORT(dev, i, &sensed_port);
> 			if (sensed_port != MLX4_PORT_TYPE_NONE)
> 				dev->caps.port_type[i] = sensed_port;
> 		} else {

Roland,

Looking deeper on the issues you reported, we noted that the bug is up 
here on the RHS of

if (mlx4_priv(dev)->sense.sense_allowed[i] && dev->caps.default_sense[i])

as when a firmware sets the “do_sense” bit as 1 (which is the default in 
2.11.0500), the driver begins to do sensing. This firmware somehow went 
out without enough mfunc checking, bad. A fast solution would be a one 
liner patch that forces the LHS of this line to be false under mfunc 
mode, that is add forth condition to the three conditions above which is 
&& !mfunc

In the overall driver architecture level, auto-sending can/will be abled 
in later state, when the PPF
can signal the VFs to reset/re-create their mlx4 device when the link 
type change, as done in native
mode.

Or.



> I'm beginning to think it might be that I had nothing hooked up
> to my adapter initially, which seems to lead to the fatal SENSE_PORT
> call here.  I notice that once I bring up the link, the stream of failing
> SENSE_PORT calls (which don't ever work in SR-IOV mode AFAICT) stops as well.
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
jackm Oct. 2, 2012, 8:51 a.m. UTC | #8
Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il>

Thanks, Roland! Good catches and good fixes!

Regarding the mutex replacing the semaphore, at one time we toyed
with the idea of multiple comm channel commands "in the air", but
we did not pursue the idea.

I agree with changing slave_sem to a mutex.  If we do decide at some point
to do multiple comm channel commands, we will change it back.

Regarding moving initialization of slave_sem and vhcr, your later patches
render it unnecessary.  However, I agree with this change -- this fixes
an "Oops waiting to happen". Though the change is no longer needed now, it
will be if SENSE_PORT is allowed in multifunction, and then we WILL start
experiencing the same Oops as we did before your patches disabled
SENSE_PORT for multifunction. Fixes like these are too easy to overlook.

-Jack

On Wednesday 26 September 2012 06:42, Roland Dreier wrote:
> From: Roland Dreier <roland@purestorage.com>
> 
> On an SR-IOV master device, __mlx4_init_one() calls mlx4_init_hca()
> before mlx4_multi_func_init().  However, for unlucky configurations,
> mlx4_init_hca() might call mlx4_SENSE_PORT() (via mlx4_dev_cap()), and
> that calls mlx4_cmd_imm() with MLX4_CMD_WRAPPED set.
> 
> However, on a multifunction device with MLX4_CMD_WRAPPED, __mlx4_cmd()
> calls into mlx4_slave_cmd(), and that immediately tries to do
> 
> 	down(&priv->cmd.slave_sem);
> 
> but priv->cmd.slave_sem isn't initialized until mlx4_multi_func_init()
> (which we haven't called yet).  The next thing it tries to do is access
> priv->mfunc.vhcr, but that hasn't been allocated yet.
> 
> Fix this by moving the initialization of slave_sem and vhcr up into
> mlx4_cmd_init(). Also, since slave_sem is really just being used as a
> mutex, convert it into a slave_cmd_mutex.
> 
> Signed-off-by: Roland Dreier <roland@purestorage.com>
> ---
> Jack, I needed this to get my (CX3 w/ FW 2.11.500) adapter to work in
> SR-IOV mode.  Is it possible you never tested SR-IOV on an adapter
> with ports in autosensing mode?
> 
>  drivers/net/ethernet/mellanox/mlx4/cmd.c  |   51 ++++++++++++++++++-----------
>  drivers/net/ethernet/mellanox/mlx4/main.c |   11 ++++---
>  drivers/net/ethernet/mellanox/mlx4/mlx4.h |    2 +-
>  3 files changed, 38 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> index 90774b7..3d1899f 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> @@ -395,7 +395,8 @@ static int mlx4_slave_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
>  	struct mlx4_vhcr_cmd *vhcr = priv->mfunc.vhcr;
>  	int ret;
>  
> -	down(&priv->cmd.slave_sem);
> +	mutex_lock(&priv->cmd.slave_cmd_mutex);
> +
>  	vhcr->in_param = cpu_to_be64(in_param);
>  	vhcr->out_param = out_param ? cpu_to_be64(*out_param) : 0;
>  	vhcr->in_modifier = cpu_to_be32(in_modifier);
> @@ -403,6 +404,7 @@ static int mlx4_slave_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
>  	vhcr->token = cpu_to_be16(CMD_POLL_TOKEN);
>  	vhcr->status = 0;
>  	vhcr->flags = !!(priv->cmd.use_events) << 6;
> +
>  	if (mlx4_is_master(dev)) {
>  		ret = mlx4_master_process_vhcr(dev, dev->caps.function, vhcr);
>  		if (!ret) {
> @@ -439,7 +441,8 @@ static int mlx4_slave_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
>  			mlx4_err(dev, "failed execution of VHCR_POST command"
>  				 "opcode 0x%x\n", op);
>  	}
> -	up(&priv->cmd.slave_sem);
> +
> +	mutex_unlock(&priv->cmd.slave_cmd_mutex);
>  	return ret;
>  }
>  
> @@ -1559,14 +1562,15 @@ static void mlx4_master_do_cmd(struct mlx4_dev *dev, int slave, u8 cmd,
>  		if ((slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR_EN) &&
>  		    (slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR_POST))
>  			goto reset_slave;
> -		down(&priv->cmd.slave_sem);
> +
> +		mutex_lock(&priv->cmd.slave_cmd_mutex);
>  		if (mlx4_master_process_vhcr(dev, slave, NULL)) {
>  			mlx4_err(dev, "Failed processing vhcr for slave:%d,"
>  				 " resetting slave.\n", slave);
> -			up(&priv->cmd.slave_sem);
> +			mutex_unlock(&priv->cmd.slave_cmd_mutex);
>  			goto reset_slave;
>  		}
> -		up(&priv->cmd.slave_sem);
> +		mutex_unlock(&priv->cmd.slave_cmd_mutex);
>  		break;
>  	default:
>  		mlx4_warn(dev, "Bad comm cmd:%d from slave:%d\n", cmd, slave);
> @@ -1707,14 +1711,6 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
>  	struct mlx4_slave_state *s_state;
>  	int i, j, err, port;
>  
> -	priv->mfunc.vhcr = dma_alloc_coherent(&(dev->pdev->dev), PAGE_SIZE,
> -					    &priv->mfunc.vhcr_dma,
> -					    GFP_KERNEL);
> -	if (!priv->mfunc.vhcr) {
> -		mlx4_err(dev, "Couldn't allocate vhcr.\n");
> -		return -ENOMEM;
> -	}
> -
>  	if (mlx4_is_master(dev))
>  		priv->mfunc.comm =
>  		ioremap(pci_resource_start(dev->pdev, priv->fw.comm_bar) +
> @@ -1777,7 +1773,6 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
>  		if (mlx4_init_resource_tracker(dev))
>  			goto err_thread;
>  
> -		sema_init(&priv->cmd.slave_sem, 1);
>  		err = mlx4_ARM_COMM_CHANNEL(dev);
>  		if (err) {
>  			mlx4_err(dev, " Failed to arm comm channel eq: %x\n",
> @@ -1791,8 +1786,6 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
>  			mlx4_err(dev, "Couldn't sync toggles\n");
>  			goto err_comm;
>  		}
> -
> -		sema_init(&priv->cmd.slave_sem, 1);
>  	}
>  	return 0;
>  
> @@ -1822,6 +1815,7 @@ int mlx4_cmd_init(struct mlx4_dev *dev)
>  	struct mlx4_priv *priv = mlx4_priv(dev);
>  
>  	mutex_init(&priv->cmd.hcr_mutex);
> +	mutex_init(&priv->cmd.slave_cmd_mutex);
>  	sema_init(&priv->cmd.poll_sem, 1);
>  	priv->cmd.use_events = 0;
>  	priv->cmd.toggle     = 1;
> @@ -1838,14 +1832,30 @@ int mlx4_cmd_init(struct mlx4_dev *dev)
>  		}
>  	}
>  
> +	if (mlx4_is_mfunc(dev)) {
> +		priv->mfunc.vhcr = dma_alloc_coherent(&(dev->pdev->dev), PAGE_SIZE,
> +						      &priv->mfunc.vhcr_dma,
> +						      GFP_KERNEL);
> +		if (!priv->mfunc.vhcr) {
> +			mlx4_err(dev, "Couldn't allocate VHCR.\n");
> +			goto err_hcr;
> +		}
> +	}
> +
>  	priv->cmd.pool = pci_pool_create("mlx4_cmd", dev->pdev,
>  					 MLX4_MAILBOX_SIZE,
>  					 MLX4_MAILBOX_SIZE, 0);
>  	if (!priv->cmd.pool)
> -		goto err_hcr;
> +		goto err_vhcr;
>  
>  	return 0;
>  
> +err_vhcr:
> +	if (mlx4_is_mfunc(dev))
> +		dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
> +				  priv->mfunc.vhcr, priv->mfunc.vhcr_dma);
> +	priv->mfunc.vhcr = NULL;
> +
>  err_hcr:
>  	if (!mlx4_is_slave(dev))
>  		iounmap(priv->cmd.hcr);
> @@ -1868,9 +1878,6 @@ void mlx4_multi_func_cleanup(struct mlx4_dev *dev)
>  	}
>  
>  	iounmap(priv->mfunc.comm);
> -	dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
> -		     priv->mfunc.vhcr, priv->mfunc.vhcr_dma);
> -	priv->mfunc.vhcr = NULL;
>  }
>  
>  void mlx4_cmd_cleanup(struct mlx4_dev *dev)
> @@ -1881,6 +1888,10 @@ void mlx4_cmd_cleanup(struct mlx4_dev *dev)
>  
>  	if (!mlx4_is_slave(dev))
>  		iounmap(priv->cmd.hcr);
> +	if (mlx4_is_mfunc(dev))
> +		dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
> +				  priv->mfunc.vhcr, priv->mfunc.vhcr_dma);
> +	priv->mfunc.vhcr = NULL;
>  }
>  
>  /*
> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
> index 8d3b475..9f06e04 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
> @@ -1159,10 +1159,10 @@ static void mlx4_slave_exit(struct mlx4_dev *dev)
>  {
>  	struct mlx4_priv *priv = mlx4_priv(dev);
>  
> -	down(&priv->cmd.slave_sem);
> +	mutex_lock(&priv->cmd.slave_cmd_mutex);
>  	if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, MLX4_COMM_TIME))
>  		mlx4_warn(dev, "Failed to close slave function.\n");
> -	up(&priv->cmd.slave_sem);
> +	mutex_unlock(&priv->cmd.slave_cmd_mutex);
>  }
>  
>  static int map_bf_area(struct mlx4_dev *dev)
> @@ -1214,7 +1214,7 @@ static int mlx4_init_slave(struct mlx4_dev *dev)
>  	u32 slave_read;
>  	u32 cmd_channel_ver;
>  
> -	down(&priv->cmd.slave_sem);
> +	mutex_lock(&priv->cmd.slave_cmd_mutex);
>  	priv->cmd.max_cmds = 1;
>  	mlx4_warn(dev, "Sending reset\n");
>  	ret_from_reset = mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0,
> @@ -1263,12 +1263,13 @@ static int mlx4_init_slave(struct mlx4_dev *dev)
>  		goto err;
>  	if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR_EN, dma, MLX4_COMM_TIME))
>  		goto err;
> -	up(&priv->cmd.slave_sem);
> +
> +	mutex_unlock(&priv->cmd.slave_cmd_mutex);
>  	return 0;
>  
>  err:
>  	mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, 0);
> -	up(&priv->cmd.slave_sem);
> +	mutex_unlock(&priv->cmd.slave_cmd_mutex);
>  	return -EIO;
>  }
>  
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> index e5eee31..1bb3e5e 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> @@ -513,9 +513,9 @@ struct mlx4_cmd {
>  	struct pci_pool	       *pool;
>  	void __iomem	       *hcr;
>  	struct mutex		hcr_mutex;
> +	struct mutex		slave_cmd_mutex;
>  	struct semaphore	poll_sem;
>  	struct semaphore	event_sem;
> -	struct semaphore	slave_sem;
>  	int			max_cmds;
>  	spinlock_t		context_lock;
>  	int			free_head;
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 90774b7..3d1899f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -395,7 +395,8 @@  static int mlx4_slave_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
 	struct mlx4_vhcr_cmd *vhcr = priv->mfunc.vhcr;
 	int ret;
 
-	down(&priv->cmd.slave_sem);
+	mutex_lock(&priv->cmd.slave_cmd_mutex);
+
 	vhcr->in_param = cpu_to_be64(in_param);
 	vhcr->out_param = out_param ? cpu_to_be64(*out_param) : 0;
 	vhcr->in_modifier = cpu_to_be32(in_modifier);
@@ -403,6 +404,7 @@  static int mlx4_slave_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
 	vhcr->token = cpu_to_be16(CMD_POLL_TOKEN);
 	vhcr->status = 0;
 	vhcr->flags = !!(priv->cmd.use_events) << 6;
+
 	if (mlx4_is_master(dev)) {
 		ret = mlx4_master_process_vhcr(dev, dev->caps.function, vhcr);
 		if (!ret) {
@@ -439,7 +441,8 @@  static int mlx4_slave_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
 			mlx4_err(dev, "failed execution of VHCR_POST command"
 				 "opcode 0x%x\n", op);
 	}
-	up(&priv->cmd.slave_sem);
+
+	mutex_unlock(&priv->cmd.slave_cmd_mutex);
 	return ret;
 }
 
@@ -1559,14 +1562,15 @@  static void mlx4_master_do_cmd(struct mlx4_dev *dev, int slave, u8 cmd,
 		if ((slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR_EN) &&
 		    (slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR_POST))
 			goto reset_slave;
-		down(&priv->cmd.slave_sem);
+
+		mutex_lock(&priv->cmd.slave_cmd_mutex);
 		if (mlx4_master_process_vhcr(dev, slave, NULL)) {
 			mlx4_err(dev, "Failed processing vhcr for slave:%d,"
 				 " resetting slave.\n", slave);
-			up(&priv->cmd.slave_sem);
+			mutex_unlock(&priv->cmd.slave_cmd_mutex);
 			goto reset_slave;
 		}
-		up(&priv->cmd.slave_sem);
+		mutex_unlock(&priv->cmd.slave_cmd_mutex);
 		break;
 	default:
 		mlx4_warn(dev, "Bad comm cmd:%d from slave:%d\n", cmd, slave);
@@ -1707,14 +1711,6 @@  int mlx4_multi_func_init(struct mlx4_dev *dev)
 	struct mlx4_slave_state *s_state;
 	int i, j, err, port;
 
-	priv->mfunc.vhcr = dma_alloc_coherent(&(dev->pdev->dev), PAGE_SIZE,
-					    &priv->mfunc.vhcr_dma,
-					    GFP_KERNEL);
-	if (!priv->mfunc.vhcr) {
-		mlx4_err(dev, "Couldn't allocate vhcr.\n");
-		return -ENOMEM;
-	}
-
 	if (mlx4_is_master(dev))
 		priv->mfunc.comm =
 		ioremap(pci_resource_start(dev->pdev, priv->fw.comm_bar) +
@@ -1777,7 +1773,6 @@  int mlx4_multi_func_init(struct mlx4_dev *dev)
 		if (mlx4_init_resource_tracker(dev))
 			goto err_thread;
 
-		sema_init(&priv->cmd.slave_sem, 1);
 		err = mlx4_ARM_COMM_CHANNEL(dev);
 		if (err) {
 			mlx4_err(dev, " Failed to arm comm channel eq: %x\n",
@@ -1791,8 +1786,6 @@  int mlx4_multi_func_init(struct mlx4_dev *dev)
 			mlx4_err(dev, "Couldn't sync toggles\n");
 			goto err_comm;
 		}
-
-		sema_init(&priv->cmd.slave_sem, 1);
 	}
 	return 0;
 
@@ -1822,6 +1815,7 @@  int mlx4_cmd_init(struct mlx4_dev *dev)
 	struct mlx4_priv *priv = mlx4_priv(dev);
 
 	mutex_init(&priv->cmd.hcr_mutex);
+	mutex_init(&priv->cmd.slave_cmd_mutex);
 	sema_init(&priv->cmd.poll_sem, 1);
 	priv->cmd.use_events = 0;
 	priv->cmd.toggle     = 1;
@@ -1838,14 +1832,30 @@  int mlx4_cmd_init(struct mlx4_dev *dev)
 		}
 	}
 
+	if (mlx4_is_mfunc(dev)) {
+		priv->mfunc.vhcr = dma_alloc_coherent(&(dev->pdev->dev), PAGE_SIZE,
+						      &priv->mfunc.vhcr_dma,
+						      GFP_KERNEL);
+		if (!priv->mfunc.vhcr) {
+			mlx4_err(dev, "Couldn't allocate VHCR.\n");
+			goto err_hcr;
+		}
+	}
+
 	priv->cmd.pool = pci_pool_create("mlx4_cmd", dev->pdev,
 					 MLX4_MAILBOX_SIZE,
 					 MLX4_MAILBOX_SIZE, 0);
 	if (!priv->cmd.pool)
-		goto err_hcr;
+		goto err_vhcr;
 
 	return 0;
 
+err_vhcr:
+	if (mlx4_is_mfunc(dev))
+		dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
+				  priv->mfunc.vhcr, priv->mfunc.vhcr_dma);
+	priv->mfunc.vhcr = NULL;
+
 err_hcr:
 	if (!mlx4_is_slave(dev))
 		iounmap(priv->cmd.hcr);
@@ -1868,9 +1878,6 @@  void mlx4_multi_func_cleanup(struct mlx4_dev *dev)
 	}
 
 	iounmap(priv->mfunc.comm);
-	dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
-		     priv->mfunc.vhcr, priv->mfunc.vhcr_dma);
-	priv->mfunc.vhcr = NULL;
 }
 
 void mlx4_cmd_cleanup(struct mlx4_dev *dev)
@@ -1881,6 +1888,10 @@  void mlx4_cmd_cleanup(struct mlx4_dev *dev)
 
 	if (!mlx4_is_slave(dev))
 		iounmap(priv->cmd.hcr);
+	if (mlx4_is_mfunc(dev))
+		dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
+				  priv->mfunc.vhcr, priv->mfunc.vhcr_dma);
+	priv->mfunc.vhcr = NULL;
 }
 
 /*
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 8d3b475..9f06e04 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1159,10 +1159,10 @@  static void mlx4_slave_exit(struct mlx4_dev *dev)
 {
 	struct mlx4_priv *priv = mlx4_priv(dev);
 
-	down(&priv->cmd.slave_sem);
+	mutex_lock(&priv->cmd.slave_cmd_mutex);
 	if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, MLX4_COMM_TIME))
 		mlx4_warn(dev, "Failed to close slave function.\n");
-	up(&priv->cmd.slave_sem);
+	mutex_unlock(&priv->cmd.slave_cmd_mutex);
 }
 
 static int map_bf_area(struct mlx4_dev *dev)
@@ -1214,7 +1214,7 @@  static int mlx4_init_slave(struct mlx4_dev *dev)
 	u32 slave_read;
 	u32 cmd_channel_ver;
 
-	down(&priv->cmd.slave_sem);
+	mutex_lock(&priv->cmd.slave_cmd_mutex);
 	priv->cmd.max_cmds = 1;
 	mlx4_warn(dev, "Sending reset\n");
 	ret_from_reset = mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0,
@@ -1263,12 +1263,13 @@  static int mlx4_init_slave(struct mlx4_dev *dev)
 		goto err;
 	if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR_EN, dma, MLX4_COMM_TIME))
 		goto err;
-	up(&priv->cmd.slave_sem);
+
+	mutex_unlock(&priv->cmd.slave_cmd_mutex);
 	return 0;
 
 err:
 	mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, 0);
-	up(&priv->cmd.slave_sem);
+	mutex_unlock(&priv->cmd.slave_cmd_mutex);
 	return -EIO;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index e5eee31..1bb3e5e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -513,9 +513,9 @@  struct mlx4_cmd {
 	struct pci_pool	       *pool;
 	void __iomem	       *hcr;
 	struct mutex		hcr_mutex;
+	struct mutex		slave_cmd_mutex;
 	struct semaphore	poll_sem;
 	struct semaphore	event_sem;
-	struct semaphore	slave_sem;
 	int			max_cmds;
 	spinlock_t		context_lock;
 	int			free_head;