diff mbox

[v2,for-next,5/7] IB/mlx4: Add IB counters table

Message ID CAAKD3BB5-hUC=0bDqxXxuag9TPqQEeMD74paHrAMOc-MsQwEow@mail.gmail.com (mailing list archive)
State Superseded
Headers show

Commit Message

Matan Barak Dec. 24, 2015, 4:09 p.m. UTC
On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb@dev.mellanox.co.il> wrote:
> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote:
>>>
>>>
>>>>> This patch seems to generate a list corruption [1] when I test
>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this?
>>>>
>>>>
>>>> This patch is part from a series that was introduced in 4.3-rc1 [1],
>>>
>>>
>>> Then something else broke it. Can people check their patches on doug's
>>> tree? At the moment it's unusable...
>>
>
> Leon and I have checked Doug's tree with mlx4_ib disabled and we
> didn't encounter any error.
> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly.
>
>>
>> Yes, I checked the branch up to commit 882f3b3 "Merge branches
>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping,
>> ibv_rc_pingpong over top of mlx4 VPI)
>>

Regarding mlx4, Eran and I analyzed it. We didn't test that, but it
seems like the bug is introduced in the 64bit counters test. Here's a
proposal:



>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Hal Rosenstock Dec. 25, 2015, 2:50 p.m. UTC | #1
On 12/24/2015 11:09 AM, Matan Barak wrote:
> On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb@dev.mellanox.co.il> wrote:
>> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote:
>>>>
>>>>
>>>>>> This patch seems to generate a list corruption [1] when I test
>>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this?
>>>>>
>>>>>
>>>>> This patch is part from a series that was introduced in 4.3-rc1 [1],
>>>>
>>>>
>>>> Then something else broke it. Can people check their patches on doug's
>>>> tree? At the moment it's unusable...
>>>
>>
>> Leon and I have checked Doug's tree with mlx4_ib disabled and we
>> didn't encounter any error.
>> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly.
>>
>>>
>>> Yes, I checked the branch up to commit 882f3b3 "Merge branches
>>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping,
>>> ibv_rc_pingpong over top of mlx4 VPI)
>>>
> 
> Regarding mlx4, Eran and I analyzed it. We didn't test that, but it
> seems like the bug is introduced in the 64bit counters test. Here's a
> proposal:
> 
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index 539040f..8da3c83 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -714,11 +714,12 @@ err:
>   * Figure out which counter table to use depending on
>   * the device capabilities.
>   */
> -static struct attribute_group *get_counter_table(struct ib_device *dev)
> +static struct attribute_group *get_counter_table(struct ib_device *dev,
> +                                          int port_num)
>  {
>         struct ib_class_port_info cpi;
> 
> -   if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO,
> + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO,
>                                 &cpi, 40, sizeof(cpi)) >= 0) {

Your proposal is similar to earlier version of Christoph's patch but was
changed since ClassPortInfo attribute does not have PortSelect field
like other PerfMgt attributes which is where this port num would be
placed. In ClassPortInfo attribute, that location would be the
ClassVersion field that would be set to port number in PerfMgt Get query.

-- Hal

> 
>                 if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH)
> @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int port_num,
>                 goto err_put;
>         }
> 
> -   p->pma_table = get_counter_table(device);
> + p->pma_table = get_counter_table(device, port_num);
>         ret = sysfs_create_group(&p->kobj, p->pma_table);
>         if (ret)
>                 goto err_put_gid_attrs;
> 
> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hal Rosenstock Dec. 25, 2015, 3:56 p.m. UTC | #2
On 12/25/2015 9:50 AM, Hal Rosenstock wrote:
> On 12/24/2015 11:09 AM, Matan Barak wrote:
>> On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb@dev.mellanox.co.il> wrote:
>>> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>>>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote:
>>>>>
>>>>>
>>>>>>> This patch seems to generate a list corruption [1] when I test
>>>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this?
>>>>>>
>>>>>>
>>>>>> This patch is part from a series that was introduced in 4.3-rc1 [1],
>>>>>
>>>>>
>>>>> Then something else broke it. Can people check their patches on doug's
>>>>> tree? At the moment it's unusable...
>>>>
>>>
>>> Leon and I have checked Doug's tree with mlx4_ib disabled and we
>>> didn't encounter any error.
>>> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly.
>>>
>>>>
>>>> Yes, I checked the branch up to commit 882f3b3 "Merge branches
>>>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping,
>>>> ibv_rc_pingpong over top of mlx4 VPI)
>>>>
>>
>> Regarding mlx4, Eran and I analyzed it. We didn't test that, but it
>> seems like the bug is introduced in the 64bit counters test. Here's a
>> proposal:
>>
>> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
>> index 539040f..8da3c83 100644
>> --- a/drivers/infiniband/core/sysfs.c
>> +++ b/drivers/infiniband/core/sysfs.c
>> @@ -714,11 +714,12 @@ err:
>>   * Figure out which counter table to use depending on
>>   * the device capabilities.
>>   */
>> -static struct attribute_group *get_counter_table(struct ib_device *dev)
>> +static struct attribute_group *get_counter_table(struct ib_device *dev,
>> +                                          int port_num)
>>  {
>>         struct ib_class_port_info cpi;
>>
>> -   if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO,
>> + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO,
>>                                 &cpi, 40, sizeof(cpi)) >= 0) {
> 
> Your proposal is similar to earlier version of Christoph's patch but was
> changed since ClassPortInfo attribute does not have PortSelect field
> like other PerfMgt attributes which is where this port num would be
> placed. In ClassPortInfo attribute, that location would be the
> ClassVersion field that would be set to port number in PerfMgt Get query.

In actuality, I don't think it really matters as this is a Get not a Set
and the PMA would do the right thing even if some field in the CPI were
stepped on.

> -- Hal
> 
>>
>>                 if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH)
>> @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int port_num,
>>                 goto err_put;
>>         }
>>
>> -   p->pma_table = get_counter_table(device);
>> + p->pma_table = get_counter_table(device, port_num);
>>         ret = sysfs_create_group(&p->kobj, p->pma_table);
>>         if (ret)
>>                 goto err_put_gid_attrs;
>>
>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matan Barak Dec. 27, 2015, 11:16 a.m. UTC | #3
On Fri, Dec 25, 2015 at 5:56 PM, Hal Rosenstock <hal@dev.mellanox.co.il> wrote:
> On 12/25/2015 9:50 AM, Hal Rosenstock wrote:
>> On 12/24/2015 11:09 AM, Matan Barak wrote:
>>> On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb@dev.mellanox.co.il> wrote:
>>>> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>>>>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote:
>>>>>>
>>>>>>
>>>>>>>> This patch seems to generate a list corruption [1] when I test
>>>>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this?
>>>>>>>
>>>>>>>
>>>>>>> This patch is part from a series that was introduced in 4.3-rc1 [1],
>>>>>>
>>>>>>
>>>>>> Then something else broke it. Can people check their patches on doug's
>>>>>> tree? At the moment it's unusable...
>>>>>
>>>>
>>>> Leon and I have checked Doug's tree with mlx4_ib disabled and we
>>>> didn't encounter any error.
>>>> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly.
>>>>
>>>>>
>>>>> Yes, I checked the branch up to commit 882f3b3 "Merge branches
>>>>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping,
>>>>> ibv_rc_pingpong over top of mlx4 VPI)
>>>>>
>>>
>>> Regarding mlx4, Eran and I analyzed it. We didn't test that, but it
>>> seems like the bug is introduced in the 64bit counters test. Here's a
>>> proposal:
>>>
>>> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
>>> index 539040f..8da3c83 100644
>>> --- a/drivers/infiniband/core/sysfs.c
>>> +++ b/drivers/infiniband/core/sysfs.c
>>> @@ -714,11 +714,12 @@ err:
>>>   * Figure out which counter table to use depending on
>>>   * the device capabilities.
>>>   */
>>> -static struct attribute_group *get_counter_table(struct ib_device *dev)
>>> +static struct attribute_group *get_counter_table(struct ib_device *dev,
>>> +                                          int port_num)
>>>  {
>>>         struct ib_class_port_info cpi;
>>>
>>> -   if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO,
>>> + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO,
>>>                                 &cpi, 40, sizeof(cpi)) >= 0) {
>>
>> Your proposal is similar to earlier version of Christoph's patch but was
>> changed since ClassPortInfo attribute does not have PortSelect field
>> like other PerfMgt attributes which is where this port num would be
>> placed. In ClassPortInfo attribute, that location would be the
>> ClassVersion field that would be set to port number in PerfMgt Get query.
>
> In actuality, I don't think it really matters as this is a Get not a Set
> and the PMA would do the right thing even if some field in the CPI were
> stepped on.
>

Well, it does matter as it calls the vendor driver with port_num = 0.
Since the kernel is trusted, the vendor driver expects a valid port number.
Giving it an invalid number might result in memory corruptions, as
demonstrated in this case.

>> -- Hal

Matan

>>
>>>
>>>                 if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH)
>>> @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int port_num,
>>>                 goto err_put;
>>>         }
>>>
>>> -   p->pma_table = get_counter_table(device);
>>> + p->pma_table = get_counter_table(device, port_num);
>>>         ret = sysfs_create_group(&p->kobj, p->pma_table);
>>>         if (ret)
>>>                 goto err_put_gid_attrs;
>>>
>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 539040f..8da3c83 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -714,11 +714,12 @@  err:
  * Figure out which counter table to use depending on
  * the device capabilities.
  */
-static struct attribute_group *get_counter_table(struct ib_device *dev)
+static struct attribute_group *get_counter_table(struct ib_device *dev,
+                                          int port_num)
 {
        struct ib_class_port_info cpi;

-   if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO,
+ if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO,
                                &cpi, 40, sizeof(cpi)) >= 0) {

                if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH)
@@ -776,7 +777,7 @@  static int add_port(struct ib_device *device, int port_num,
                goto err_put;
        }

-   p->pma_table = get_counter_table(device);
+ p->pma_table = get_counter_table(device, port_num);
        ret = sysfs_create_group(&p->kobj, p->pma_table);
        if (ret)
                goto err_put_gid_attrs;