[1/4] vfio: ccw: fix cleanup if cp_prefetch fails

Message ID 20180321020822.86255-2-bjsdjshi@linux.vnet.ibm.com (mailing list archive)
State New, archived
Commit Message

Dong Jia Shi March 21, 2018, 2:08 a.m. UTC
From: Halil Pasic <pasic@linux.vnet.ibm.com>

If the translation of a channel program fails, we may end up attempting
to clean up (free, unpin) stuff that never got translated (and allocated,
pinned) in the first place.

By adjusting the lengths of the chains accordingly (so the element that
failed, and all subsequent elements are excluded) cleanup activities
based on false assumptions can be avoided.

Let's make sure cp_free works properly after cp_prefetch returns with an
error by setting ch_len to 0 for the ccw chains those are not prefetched.

Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Halil Pasic March 21, 2018, 12:49 p.m. UTC | #1
On 03/21/2018 03:08 AM, Dong Jia Shi wrote:
> From: Halil Pasic <pasic@linux.vnet.ibm.com>
> 
> If the translation of a channel program fails, we may end up attempting
> to clean up (free, unpin) stuff that never got translated (and allocated,
> pinned) in the first place.
> 
> By adjusting the lengths of the chains accordingly (so the element that
> failed, and all subsequent elements are excluded) cleanup activities
> based on false assumptions can be avoided.
> 
> Let's make sure cp_free works properly after cp_prefetch returns with an
> error by setting ch_len to 0 for the ccw chains those are not prefetched.

This sentence used to be:

Let's make sure cp_free works properly after cp_prefetch returns with an
error.

@Dong Jia
I find the 'by setting ch_len to 0 for the ccw chains those are not prefetched'
you added for clarification (I guess) somewhat problematic.
The chain in which the translation failure occurred
+	chain->ch_len = idx;
is shortened so that only the translated elements (ccws) are going to
get cleaned up (on a per element basis) by cp_free. This may or may
not be the first ccw. Subsequent chains are shortened to 0 as there
no translation took place.

So as a result of this change only properly translated ccws are going
to get (re)visited by cp_free as only those may have resources bound
to them which need to be released.
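
To illustrate the effect on ch_len, here is a minimal user-space sketch. It is not the kernel code: struct model_chain, model_prefetch() and the fail_chain/fail_idx parameters are hypothetical names made up for this example, and an array stands in for the ccwchain list. Only the truncation logic mirrors the out_err path of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ccwchain: only the length matters here. */
struct model_chain {
	int ch_len;	/* on entry: ccws to translate; on exit: ccws translated */
};

/*
 * Model of the cp_prefetch() loop. Translation is assumed to fail at
 * element fail_idx of chain fail_chain (pass fail_chain = -1 for no
 * failure). Returns 0 on success, -1 on failure.
 */
int model_prefetch(struct model_chain *chains, size_t nr_chains,
		   int fail_chain, int fail_idx)
{
	size_t c;
	int idx;

	assert(chains != NULL);

	for (c = 0; c < nr_chains; c++) {
		for (idx = 0; idx < chains[c].ch_len; idx++) {
			if ((int)c == fail_chain && idx == fail_idx)
				goto out_err;
		}
	}
	return 0;

out_err:
	/* Keep only the elements of the failing chain that were translated. */
	chains[c].ch_len = idx;
	/* Nothing was translated in any later chain. */
	for (c++; c < nr_chains; c++)
		chains[c].ch_len = 0;
	return -1;
}
```

With three chains of lengths 3, 4 and 2 and a failure at element 2 of the second chain, the lengths end up as 3, 2 and 0: the first chain is untouched, the failing chain keeps its two translated elements, and the last chain is emptied.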

I'm not against improving the commit message. But this ain't
an improvement to me.

> 
> Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index d9a2fffd034b..2be114db02f9 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -749,11 +749,18 @@ int cp_prefetch(struct channel_program *cp)
>  		for (idx = 0; idx < len; idx++) {
>  			ret = ccwchain_fetch_one(chain, idx, cp);
>  			if (ret)
> -				return ret;
> +				goto out_err;
>  		}
>  	}
> 
>  	return 0;
> +out_err:
> +	/* Only cleanup the chain elements that where actually translated. */
> +	chain->ch_len = idx;
> +	list_for_each_entry_continue(chain, &cp->ccwchain_list, next) {
> +		chain->ch_len = 0;
> +	}
> +	return ret;
>  }
> 
>  /**
>
Pierre Morel March 22, 2018, 9:37 a.m. UTC | #2
On 22/03/2018 03:22, Dong Jia Shi wrote:
> * Halil Pasic <pasic@linux.vnet.ibm.com> [2018-03-21 13:49:54 +0100]:
>
>>
>> On 03/21/2018 03:08 AM, Dong Jia Shi wrote:
>>> From: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>
>>> If the translation of a channel program fails, we may end up attempting
>>> to clean up (free, unpin) stuff that never got translated (and allocated,
>>> pinned) in the first place.
>>>
>>> By adjusting the lengths of the chains accordingly (so the element that
>>> failed, and all subsequent elements are excluded) cleanup activities
>>> based on false assumptions can be avoided.
>>>
>>> Let's make sure cp_free works properly after cp_prefetch returns with an
>>> error by setting ch_len to 0 for the ccw chains those are not prefetched.
>> This sentence used to be:
>>
>> Let's make sure cp_free works properly after cp_prefetch returns with an
>> error.
>>
>> @Dong Jia
>> I find the 'by setting ch_len to 0 for the ccw chains those are not prefetched'
>> you added for clarification (I guess) somewhat problematic.
>> The chain in which the translation failure occurred
>> +	chain->ch_len = idx;
> I made a mistake. When rewording the message, I missed this part...
> Sorry for the problem!
>
>> is shortened so that only the translated elements (ccws) are going to
>> get cleaned up (on a per element basis) by cp_free. This may or may
>> not be the first ccw. Subsequent chains are shortened to 0 as there
>> no translation took place.
>>
>> So as a result of this change only properly translated ccws are going
>> to get (re)visited by cp_free as only those may have resources bound
>> to them which need to be released.
>>
>> I'm not against improving the commit message. But this ain't
>> an improvement to me.
> You are right. How about:
> Let's make sure cp_free works properly after cp_prefetch returns with an
> error by setting ch_len of a ccw chain to the number of the translated
> ccws on that chain.

By the way, since you will propose a new version,
you have a long description of the cp_prefetch function in the code.
I think you should modify it according to the changes and describe how and
why the ch_len field of each chain is used and changed by this function.

Something like:

"
For each chain composing the channel program:
On entry ch_len hold the count of CCW to be translated.
On exit ch_len is adjusted to the count of successfully translated CCW.

This allows cp_free to find in ch_len the count of CCW to free in a chain.
"

Could also be inside the commit message.
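
The consumer side of that contract could be sketched as follows. This is a user-space model only, under assumptions: struct pinned_chain, MODEL_MAX_CCWS and model_free_chain() are hypothetical names for this example, and the pinned[] flags merely stand in for whatever per-ccw resources the real cp_free releases.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_CCWS 8	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for a chain; pinned[] models per-ccw resources. */
struct pinned_chain {
	int ch_len;			/* count of ccws the free path should visit */
	bool pinned[MODEL_MAX_CCWS];	/* true once a ccw was translated */
};

/*
 * Model of the per-chain part of a cp_free-like cleanup: release exactly
 * ch_len elements. Returns the number of elements released.
 */
int model_free_chain(struct pinned_chain *chain)
{
	int idx, released = 0;

	assert(chain->ch_len >= 0 && chain->ch_len <= MODEL_MAX_CCWS);

	for (idx = 0; idx < chain->ch_len; idx++) {
		/* With a correct ch_len, every visited ccw holds resources. */
		assert(chain->pinned[idx]);
		chain->pinned[idx] = false;
		released++;
	}
	return released;
}
```

If ch_len still held the original count after a failed translation, the inner assert would trip on the first element that was never pinned, which is the situation the adjusted ch_len is meant to rule out.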

Pierre


>
>>> Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>>> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>>> ---
>>>   drivers/s390/cio/vfio_ccw_cp.c | 9 ++++++++-
>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
>>> index d9a2fffd034b..2be114db02f9 100644
>>> --- a/drivers/s390/cio/vfio_ccw_cp.c
>>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
>>> @@ -749,11 +749,18 @@ int cp_prefetch(struct channel_program *cp)
>>>   		for (idx = 0; idx < len; idx++) {
>>>   			ret = ccwchain_fetch_one(chain, idx, cp);
>>>   			if (ret)
>>> -				return ret;
>>> +				goto out_err;
>>>   		}
>>>   	}
>>>
>>>   	return 0;
>>> +out_err:
>>> +	/* Only cleanup the chain elements that where actually translated. */
>>> +	chain->ch_len = idx;
>>> +	list_for_each_entry_continue(chain, &cp->ccwchain_list, next) {
>>> +		chain->ch_len = 0;
>>> +	}
>>> +	return ret;
>>>   }
>>>
>>>   /**
>>>
Halil Pasic March 22, 2018, 10:10 a.m. UTC | #3
On 03/22/2018 10:37 AM, Pierre Morel wrote:
> On 22/03/2018 03:22, Dong Jia Shi wrote:
>> * Halil Pasic <pasic@linux.vnet.ibm.com> [2018-03-21 13:49:54 +0100]:
>>
>>>
>>> On 03/21/2018 03:08 AM, Dong Jia Shi wrote:
>>>> From: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>>
>>>> If the translation of a channel program fails, we may end up attempting
>>>> to clean up (free, unpin) stuff that never got translated (and allocated,
>>>> pinned) in the first place.
>>>>
>>>> By adjusting the lengths of the chains accordingly (so the element that
>>>> failed, and all subsequent elements are excluded) cleanup activities
>>>> based on false assumptions can be avoided.
>>>>
>>>> Let's make sure cp_free works properly after cp_prefetch returns with an
>>>> error by setting ch_len to 0 for the ccw chains those are not prefetched.
>>> This sentence used to be:
>>>
>>> Let's make sure cp_free works properly after cp_prefetch returns with an
>>> error.
>>>
>>> @Dong Jia
>>> I find the 'by setting ch_len to 0 for the ccw chains those are not prefetched'
>>> you added for clarification (I guess) somewhat problematic.
>>> The chain in which the translation failure occurred
>>> +    chain->ch_len = idx;
>> I made a mistake. When rewording the message, I missed this part...
>> Sorry for the problem!

np

>>
>>> is shortened so that only the translated elements (ccws) are going to
>>> get cleaned up (on a per element basis) by cp_free. This may or may
>>> not be the first ccw. Subsequent chains are shortened to 0 as there
>>> no translation took place.
>>>
>>> So as a result of this change only properly translated ccws are going
>>> to get (re)visited by cp_free as only those may have resources bound
>>> to them which need to be released.
>>>
>>> I'm not against improving the commit message. But this ain't
>>> an improvement to me.
>> You are right. How about:
>> Let's make sure cp_free works properly after cp_prefetch returns with an
>> error by setting ch_len of a ccw chain to the number of the translated
>> ccws on that chain.

Works for me.

> 
> By the way, since you will propose a new version,
> you have a long description of the cp_prefetch function in the code.
> I think you should modify it according to the changes and describe how and
> why the ch_len field of each chain is used and changed by this function.
> 
> Something like:
> 
> "
> For each chain composing the channel program:
> On entry ch_len hold the count of CCW to be translated.
> On exit ch_len is adjusted to the count of successfully translated CCW.
> 
> This allows cp_free to find in ch_len the count of CCW to free in a chain.
> "

Sounds good to me.

Halil
Cornelia Huck March 26, 2018, 12:28 p.m. UTC | #4
On Thu, 22 Mar 2018 10:37:36 +0100
Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:

> On 22/03/2018 03:22, Dong Jia Shi wrote:
> > * Halil Pasic <pasic@linux.vnet.ibm.com> [2018-03-21 13:49:54 +0100]:
> >  
> >>
> >> On 03/21/2018 03:08 AM, Dong Jia Shi wrote:  
> >>> From: Halil Pasic <pasic@linux.vnet.ibm.com>
> >>>
> >>> If the translation of a channel program fails, we may end up attempting
> >>> to clean up (free, unpin) stuff that never got translated (and allocated,
> >>> pinned) in the first place.
> >>>
> >>> By adjusting the lengths of the chains accordingly (so the element that
> >>> failed, and all subsequent elements are excluded) cleanup activities
> >>> based on false assumptions can be avoided.
> >>>
> >>> Let's make sure cp_free works properly after cp_prefetch returns with an
> >>> error by setting ch_len to 0 for the ccw chains those are not prefetched.  
> >> This sentence used to be:
> >>
> >> Let's make sure cp_free works properly after cp_prefetch returns with an
> >> error.
> >>
> >> @Dong Jia
> >> I find the 'by setting ch_len to 0 for the ccw chains those are not prefetched'
> >> you added for clarification (I guess) somewhat problematic.
> >> The chain in which the translation failure occurred
> >> +	chain->ch_len = idx;  
> > I made a mistake. When rewording the message, I missed this part...
> > Sorry for the problem!
> >  
> >> is shortened so that only the translated elements (ccws) are going to
> >> get cleaned up (on a per element basis) by cp_free. This may or may
> >> not be the first ccw. Subsequent chains are shortened to 0 as there
> >> no translation took place.
> >>
> >> So as a result of this change only properly translated ccws are going
> >> to get (re)visited by cp_free as only those may have resources bound
> >> to them which need to be released.
> >>
> >> I'm not against improving the commit message. But this ain't
> >> an improvement to me.  
> > You are right. How about:
> > Let's make sure cp_free works properly after cp_prefetch returns with an
> > error by setting ch_len of a ccw chain to the number of the translated
> > ccws on that chain.  
> 
> By the way, since you will propose a new version,
> you have a long description of the cp_prefetch function in the code.
> I think you should modify it according to the changes and describe how and
> why the ch_len field of each chain is used and changed by this function.
> 
> Something like:
> 
> "
> For each chain composing the channel program:
> On entry ch_len hold the count of CCW to be translated.

s/hold/holds/ ?

> On exit ch_len is adjusted to the count of successfully translated CCW.
> 
> This allows cp_free to find in ch_len the count of CCW to free in a chain.
> "
> 
> Could also be inside the commit message.
> 
> Pierre
> 
> 
> >  
> >>> Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >>> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> >>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> >>> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> >>> ---
> >>>   drivers/s390/cio/vfio_ccw_cp.c | 9 ++++++++-
> >>>   1 file changed, 8 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> >>> index d9a2fffd034b..2be114db02f9 100644
> >>> --- a/drivers/s390/cio/vfio_ccw_cp.c
> >>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> >>> @@ -749,11 +749,18 @@ int cp_prefetch(struct channel_program *cp)
> >>>   		for (idx = 0; idx < len; idx++) {
> >>>   			ret = ccwchain_fetch_one(chain, idx, cp);
> >>>   			if (ret)
> >>> -				return ret;
> >>> +				goto out_err;
> >>>   		}
> >>>   	}
> >>>
> >>>   	return 0;
> >>> +out_err:
> >>> +	/* Only cleanup the chain elements that where actually translated. */

s/where/were/

> >>> +	chain->ch_len = idx;
> >>> +	list_for_each_entry_continue(chain, &cp->ccwchain_list, next) {
> >>> +		chain->ch_len = 0;
> >>> +	}
> >>> +	return ret;
> >>>   }
> >>>
> >>>   /**
> >>>  
>
Halil Pasic April 20, 2018, 10:54 a.m. UTC | #5
ping

I think we should get this fixed. Better sooner than later.

On 03/27/2018 03:42 AM, Dong Jia Shi wrote:
> * Cornelia Huck <cohuck@redhat.com> [2018-03-26 14:28:39 +0200]:
> 
> [...]
> 
>>> By the way, since you will propose a new version,
>>> you have a long description of the cp_prefetch function in the code.
>>> I think you should modify it according to the changes and describe how and
>>> why the ch_len field of each chain is used and changed by this function.
>>>
>>> Something like:
>>>
>>> "
>>> For each chain composing the channel program:
>>> On entry ch_len hold the count of CCW to be translated.
>>
>> s/hold/holds/ ?
>>
> Sure.
> 
>>> On exit ch_len is adjusted to the count of successfully translated CCW.
>>>
>>> This allows cp_free to find in ch_len the count of CCW to free in a chain.
>>> "
>>>
>>> Could also be inside the commit message.
>>>
>>> Pierre
>>>
>>>
>>>>   
>>>>>> Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>>>> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>>>>>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>>>> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>    drivers/s390/cio/vfio_ccw_cp.c | 9 ++++++++-
>>>>>>    1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
>>>>>> index d9a2fffd034b..2be114db02f9 100644
>>>>>> --- a/drivers/s390/cio/vfio_ccw_cp.c
>>>>>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
>>>>>> @@ -749,11 +749,18 @@ int cp_prefetch(struct channel_program *cp)
>>>>>>    		for (idx = 0; idx < len; idx++) {
>>>>>>    			ret = ccwchain_fetch_one(chain, idx, cp);
>>>>>>    			if (ret)
>>>>>> -				return ret;
>>>>>> +				goto out_err;
>>>>>>    		}
>>>>>>    	}
>>>>>>
>>>>>>    	return 0;
>>>>>> +out_err:
>>>>>> +	/* Only cleanup the chain elements that where actually translated. */
>>
>> s/where/were/
> Ok.
> 
>>
>>>>>> +	chain->ch_len = idx;
>>>>>> +	list_for_each_entry_continue(chain, &cp->ccwchain_list, next) {
>>>>>> +		chain->ch_len = 0;
>>>>>> +	}
>>>>>> +	return ret;
>>>>>>    }
>>>>>>
>>>>>>    /**
>>>>>>   
>>>
>>
>
Cornelia Huck April 20, 2018, 11:36 a.m. UTC | #6
On Fri, 20 Apr 2018 12:54:26 +0200
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> ping
> 
> I think we should get this fixed. Better sooner than later.
> 
> On 03/27/2018 03:42 AM, Dong Jia Shi wrote:
> > * Cornelia Huck <cohuck@redhat.com> [2018-03-26 14:28:39 +0200]:
> > 
> > [...]
> >   
> >>> By the way, since you will propose a new version,
> >>> you have a long description of the cp_prefetch function in the code.
> >>> I think you should modify it according to the changes and describe how and
> >>> why the ch_len field of each chain is used and changed by this function.
> >>>
> >>> Something like:
> >>>
> >>> "
> >>> For each chain composing the channel program:
> >>> On entry ch_len hold the count of CCW to be translated.  
> >>
> >> s/hold/holds/ ?
> >>  
> > Sure.
> >   
> >>> On exit ch_len is adjusted to the count of successfully translated CCW.
> >>>
> >>> This allows cp_free to find in ch_len the count of CCW to free in a chain.
> >>> "
> >>>
> >>> Could also be inside the commit message.
> >>>
> >>> Pierre
> >>>
> >>>  
> >>>>     
> >>>>>> Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >>>>>> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> >>>>>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> >>>>>> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> >>>>>> ---
> >>>>>>    drivers/s390/cio/vfio_ccw_cp.c | 9 ++++++++-
> >>>>>>    1 file changed, 8 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> >>>>>> index d9a2fffd034b..2be114db02f9 100644
> >>>>>> --- a/drivers/s390/cio/vfio_ccw_cp.c
> >>>>>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> >>>>>> @@ -749,11 +749,18 @@ int cp_prefetch(struct channel_program *cp)
> >>>>>>    		for (idx = 0; idx < len; idx++) {
> >>>>>>    			ret = ccwchain_fetch_one(chain, idx, cp);
> >>>>>>    			if (ret)
> >>>>>> -				return ret;
> >>>>>> +				goto out_err;
> >>>>>>    		}
> >>>>>>    	}
> >>>>>>
> >>>>>>    	return 0;
> >>>>>> +out_err:
> >>>>>> +	/* Only cleanup the chain elements that where actually translated. */  
> >>
> >> s/where/were/  
> > Ok.
> >   
> >>  
> >>>>>> +	chain->ch_len = idx;
> >>>>>> +	list_for_each_entry_continue(chain, &cp->ccwchain_list, next) {
> >>>>>> +		chain->ch_len = 0;
> >>>>>> +	}
> >>>>>> +	return ret;
> >>>>>>    }
> >>>>>>
> >>>>>>    /**
> >>>>>>     
> >>>  
> >>  
> >   
> 

Hm, it seems that drowned in other patches... can I get a re-send,
please?
Halil Pasic April 20, 2018, 11:55 a.m. UTC | #7
On 04/20/2018 01:36 PM, Cornelia Huck wrote:
> On Fri, 20 Apr 2018 12:54:26 +0200
> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> 
>> ping
>>
>> I think we should get this fixed. Better sooner than later.

[..]

> 
> Hm, it seems that drowned in other patches... can I get a re-send,
> please?

I guess Dong Jia will handle it. If not I will on Monday.

Regards,
Halil

Patch

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index d9a2fffd034b..2be114db02f9 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -749,11 +749,18 @@  int cp_prefetch(struct channel_program *cp)
 		for (idx = 0; idx < len; idx++) {
 			ret = ccwchain_fetch_one(chain, idx, cp);
 			if (ret)
-				return ret;
+				goto out_err;
 		}
 	}
 
 	return 0;
+out_err:
+	/* Only cleanup the chain elements that where actually translated. */
+	chain->ch_len = idx;
+	list_for_each_entry_continue(chain, &cp->ccwchain_list, next) {
+		chain->ch_len = 0;
+	}
+	return ret;
 }
 
 /**