
[RFC] tpm2-space: add handling for global session exhaustion

Message ID 1484772489.2396.2.camel@HansenPartnership.com (mailing list archive)
State New, archived

Commit Message

James Bottomley Jan. 18, 2017, 8:48 p.m. UTC
In a TPM2, sessions can be globally exhausted once there are
TPM_PT_ACTIVE_SESSION_MAX of them (even if they're all context saved).
The strategy for handling this is to keep a global count of all the
sessions along with their creation time.  Then, if we see the TPM run
out of sessions (via the TPM_RC_SESSION_HANDLES return code), we first
wait for one to become free; if none does, we forcibly evict an
existing one.  The eviction strategy waits until the current command is
repeated to evict the session, which should guarantee there is an
available slot.

On the force eviction case, we make sure that the victim session is at
least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue for
session slots is a FIFO one, ensuring that once we run out of
sessions, everyone will get a session in a bounded time and once they
get one, they'll have SESSION_TIMEOUT to use it before it may be
subject to eviction.
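As a sketch of the victim-selection rule described above (a userspace
model; SESSION_TIMEOUT's value is from this commit message, but the
struct and function names here are illustrative, not the patch's actual
identifiers):

```c
#include <time.h>

#define SESSION_TIMEOUT 2 /* seconds a session is safe from eviction */

struct sess_slot {
	int in_use;     /* slot holds a live (possibly context-saved) session */
	time_t created; /* creation time, used for victim selection */
};

/* Return the index of the oldest session that is at least
 * SESSION_TIMEOUT old, or -1 if no session may be evicted yet. */
int pick_victim(const struct sess_slot *s, int n, time_t now)
{
	int victim = -1;

	for (int i = 0; i < n; i++) {
		if (!s[i].in_use)
			continue;
		if (now - s[i].created < SESSION_TIMEOUT)
			continue;	/* still within its guaranteed window */
		if (victim < 0 || s[i].created < s[victim].created)
			victim = i;	/* prefer the oldest eligible session */
	}
	return victim;
}
```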

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jarkko Sakkinen Jan. 19, 2017, 12:25 p.m. UTC | #1
On Wed, Jan 18, 2017 at 03:48:09PM -0500, James Bottomley wrote:
> In a TPM2, sessions can be globally exhausted once there are
> TPM_PT_ACTIVE_SESSION_MAX of them (even if they're all context saved).
> The Strategy for handling this is to keep a global count of all the
> sessions along with their creation time.  Then if we see the TPM run
> out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for one
> to become free, but if it doesn't, we forcibly evict an existing one.
> The eviction strategy waits until the current command is repeated to
> evict the session which should guarantee there is an available slot.
> 
> On the force eviction case, we make sure that the victim session is at
> least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue for
> session slots is a FIFO one, ensuring that once we run out of
> sessions, everyone will get a session in a bounded time and once they
> get one, they'll have SESSION_TIMEOUT to use it before it may be
> subject to eviction.
> 
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

I didn't yet read the code properly. I'll do a more proper review
once I have v4 of my patch set together. This comment is solely
based on your commit message.

I'm just wondering whether we need this complicated timeout stuff,
or whether you could just kick a session out in LRU fashion as we
run out of them?

Or one variation of what you are doing: couldn't the session that
needs a session handle to do something sleep for 2 seconds and then
take the oldest session? It would have essentially the same effect
but no waitqueue needed.

Yeah, as I said, this is just commentary based on the description.

/Jarkko
Jarkko Sakkinen Jan. 19, 2017, 12:41 p.m. UTC | #2
On Thu, Jan 19, 2017 at 02:25:33PM +0200, Jarkko Sakkinen wrote:
> On Wed, Jan 18, 2017 at 03:48:09PM -0500, James Bottomley wrote:
> > In a TPM2, sessions can be globally exhausted once there are
> > TPM_PT_ACTIVE_SESSION_MAX of them (even if they're all context saved).
> > The Strategy for handling this is to keep a global count of all the
> > sessions along with their creation time.  Then if we see the TPM run
> > out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for one
> > to become free, but if it doesn't, we forcibly evict an existing one.
> > The eviction strategy waits until the current command is repeated to
> > evict the session which should guarantee there is an available slot.
> > 
> > On the force eviction case, we make sure that the victim session is at
> > least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue for
> > session slots is a FIFO one, ensuring that once we run out of
> > sessions, everyone will get a session in a bounded time and once they
> > get one, they'll have SESSION_TIMEOUT to use it before it may be
> > subject to eviction.
> > 
> > Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
> 
> I didn't yet read the code properly. I'll do a more proper review
> once I have v4 of my patch set together. This comment is solely
> based on your commit message.
> 
> I'm just thinking that do we need this complicated timeout stuff
> or could you just kick a session out in LRU fashion as we run
> out of them?
> 
> Or one variation of what you are doing: couldn't the session that
> needs a session handle to do something sleep for 2 seconds and then
> take the oldest session? It would have essentially the same effect
> but no waitqueue needed.
> 
> Yeah, as I said, this is just commentary based on the description.

I actually think that the very best solution would be such that
sessions are *always* lease based.  So when you create a
session, you would always lose it within a time limit.

There would not be any special victim selection mechanism.  You
would just lose your session within a time limit.

This could already be part of the session isolation and would
actually make the isolation usable.

We have not locked the API yet, so why not make an API that models
the nature of the resource?  Here, given that the number of sessions
is always fixed, leases make sense.

You then just need a wait queue for those waiting for leases.
They don't need to do any victim selection or whatever.  Everything
that holds a session beyond the lease gets flushed.

I strongly feel that this would be the best long-term solution.

/Jarkko
James Bottomley Jan. 19, 2017, 12:59 p.m. UTC | #3
On Thu, 2017-01-19 at 14:25 +0200, Jarkko Sakkinen wrote:
> On Wed, Jan 18, 2017 at 03:48:09PM -0500, James Bottomley wrote:
> > In a TPM2, sessions can be globally exhausted once there are
> > TPM_PT_ACTIVE_SESSION_MAX of them (even if they're all context
> > saved).
> > The Strategy for handling this is to keep a global count of all the
> > sessions along with their creation time.  Then if we see the TPM
> > run
> > out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for
> > one
> > to become free, but if it doesn't, we forcibly evict an existing
> > one.
> > The eviction strategy waits until the current command is repeated
> > to
> > evict the session which should guarantee there is an available
> > slot.
> > 
> > On the force eviction case, we make sure that the victim session is
> > at
> > least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue
> > for
> > session slots is a FIFO one, ensuring that once we run out of
> > sessions, everyone will get a session in a bounded time and once
> > they
> > get one, they'll have SESSION_TIMEOUT to use it before it may be
> > subject to eviction.
> > 
> > Signed-off-by: James Bottomley <
> > James.Bottomley@HansenPartnership.com>
> 
> I didn't yet read the code properly. I'll do a more proper review
> once I have v4 of my patch set together. This comment is solely
> based on your commit message.
> 
> I'm just thinking that do we need this complicated timeout stuff
> or could you just kick a session out in LRU fashion as we run
> out of them?
> 
> Or one variation of what you are doing: couldn't the session that
> needs a session handle to do something sleep for 2 seconds and then
> take the oldest session? It would have essentially the same effect
> but no waitqueue needed.
> 
> Yeah, as I said, this is just commentary based on the description.

If you don't have a wait queue, you lose fairness in resource
allocation under starvation.  What happens is that you get
RC_SESSION_HANDLES, sleep for 2s and retry.  Meanwhile someone frees a
session, the next user grabs it while you were sleeping, and when you
wake you still get RC_SESSION_HANDLES.  If I understand this, I can
basically DoS your process.  The only way to allocate the resource
fairly (i.e. the first person to sleep waiting for a session is the
one who gets it when they wake) is to make sure that you wake one
waiter as soon as a free session comes in, so that, probabilistically,
they get the session.  If you look, there are two mechanisms for
ensuring fairness: one is the FIFO wait queue (probabilistic) and the
other is the reserved session, which really ensures it belongs to you
when you wake (deterministic but expensive, so it is only activated on
the penultimate go-around).
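The wake-one-waiter FIFO behaviour can be modelled in a few lines (an
illustrative userspace ring buffer, not the kernel's waitqueue API; all
names here are invented for the sketch):

```c
#define MAX_WAITERS 8

struct fifo_waitq {
	int q[MAX_WAITERS];
	int head, tail;	/* head == tail means nobody is waiting */
};

/* A task that got RC_SESSION_HANDLES queues itself in arrival order. */
void waitq_sleep(struct fifo_waitq *w, int waiter)
{
	w->q[w->tail++ % MAX_WAITERS] = waiter;
}

/* Called when a session frees: wake exactly one waiter, the earliest
 * sleeper, so it gets first chance at the freed slot. */
int waitq_wake_first(struct fifo_waitq *w)
{
	if (w->head == w->tail)
		return -1;	/* nobody waiting */
	return w->q[w->head++ % MAX_WAITERS];
}
```

The point of the model is the ordering guarantee: waiters are woken
strictly in the order they slept, which is what bounds everyone's wait.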

James


Jarkko Sakkinen Jan. 20, 2017, 1:40 p.m. UTC | #4
On Thu, Jan 19, 2017 at 07:59:04AM -0500, James Bottomley wrote:
> On Thu, 2017-01-19 at 14:25 +0200, Jarkko Sakkinen wrote:
> > On Wed, Jan 18, 2017 at 03:48:09PM -0500, James Bottomley wrote:
> > > In a TPM2, sessions can be globally exhausted once there are
> > > TPM_PT_ACTIVE_SESSION_MAX of them (even if they're all context
> > > saved).
> > > The Strategy for handling this is to keep a global count of all the
> > > sessions along with their creation time.  Then if we see the TPM
> > > run
> > > out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for
> > > one
> > > to become free, but if it doesn't, we forcibly evict an existing
> > > one.
> > > The eviction strategy waits until the current command is repeated
> > > to
> > > evict the session which should guarantee there is an available
> > > slot.
> > > 
> > > On the force eviction case, we make sure that the victim session is
> > > at
> > > least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue
> > > for
> > > session slots is a FIFO one, ensuring that once we run out of
> > > sessions, everyone will get a session in a bounded time and once
> > > they
> > > get one, they'll have SESSION_TIMEOUT to use it before it may be
> > > subject to eviction.
> > > 
> > > Signed-off-by: James Bottomley <
> > > James.Bottomley@HansenPartnership.com>
> > 
> > I didn't yet read the code properly. I'll do a more proper review
> > once I have v4 of my patch set together. This comment is solely
> > based on your commit message.
> > 
> > I'm just thinking that do we need this complicated timeout stuff
> > or could you just kick a session out in LRU fashion as we run
> > out of them?
> > 
> > Or one variation of what you are doing: couldn't the session that
> > needs a session handle to do something sleep for 2 seconds and then
> > take the oldest session? It would have essentially the same effect
> > but no waitqueue needed.
> > 
> > Yeah, as I said, this is just commentary based on the description.
> 
> If you don't have a wait queue you lose fairness in resource allocation
> on starvation.  What happens is that you get RC_SESSION_HANDLES and
> sleep for 2s and retry.  Meanwhile someone frees a session, then next
> user grabs it while you were sleeping and when you wake you still get
> RC_SESSION_HANDLES.  I can basically DoS your process if I understand
> this. The only way to make the resource fairly allocated: i.e. the
> first person to sleep waiting for a session is the one who gets it when
> they wake is to make sure that you wake one waiter as soon as a free
> session comes in so probabalistically, they get the session.  If you
> look, there are two mechanisms for ensuring fairness: one is the FIFO
> wait queue (probabalistic) and the other is the reserved session which
> really ensures it belongs to you when you wake (deterministic but
> expensive, so this is only activated on the penultimate go around).
> 
> James

Right, I see your point.

/Jarkko
James Bottomley Jan. 27, 2017, 9:59 p.m. UTC | #5
On Fri, 2017-01-27 at 16:20 -0500, Ken Goldman wrote:
> On 1/19/2017 7:41 AM, Jarkko Sakkinen wrote:
> > 
> > I actually think that the very best solution would be such that
> > sessions would be *always* lease based. So when you create a
> > session you would always loose within a time limit.
> > 
> > There would not be any special victim selection mechanism. You
> > would just loose your session within a time limit.
> 
> I worry about the time limit.
> 
> I have a proposed use case (policy signed) where the user sends the 
> session nonce along with a "payment" to a vendor and receives back a 
> signature authorization over the nonce.
> 
> The time could be minutes or even hours.

So the problem is that sessions are a limited resource and we need a
way to allocate them when under resource pressure.  Leasing is the
fairest way I can think of but I'm open to other mechanisms if you
propose them.

Note that the lease mechanism doesn't mean every session expires after
the limit, it just means that every session becomes eligible for
reclaim after the limit.  If there's no-one else waiting, you can keep
your session for hours.
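That distinction (expiry makes a session reclaimable, it does not kill
it) can be captured in one predicate; this is a hypothetical sketch,
and the names and the 2-second lease value are only illustrative:

```c
#include <time.h>

#define LEASE_SECONDS 2

/* A session is reclaimed only when its lease is up AND someone is
 * actually waiting for a slot; with no contention it lives on. */
int may_reclaim(time_t created, time_t now, int waiters)
{
	if (waiters == 0)
		return 0;	/* no pressure: keep it for hours if you like */
	return now - created >= LEASE_SECONDS;
}
```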

James


James Bottomley Jan. 27, 2017, 10:04 p.m. UTC | #6
On Fri, 2017-01-27 at 16:42 -0500, Ken Goldman wrote:
> On 1/18/2017 3:48 PM, James Bottomley wrote:
> > In a TPM2, sessions can be globally exhausted once there are
> > TPM_PT_ACTIVE_SESSION_MAX of them (even if they're all context
> > saved).
> > The Strategy for handling this is to keep a global count of all the
> > sessions along with their creation time.  Then if we see the TPM
> > run
> > out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for
> > one
> > to become free, but if it doesn't, we forcibly evict an existing
> > one.
> > The eviction strategy waits until the current command is repeated
> > to
> > evict the session which should guarantee there is an available
> > slot.
> 
> Beware the nasty corner case:
> 
> - Application asks for a session and gets 02000000
> 
> - Time elapses and 02000000 gets forcibly flushed
> 
> - Later, app comes back, asks for a second session and again gets
> 02000000.
> 
> - App gets very confused.
> 
> May it be better to close the connection completely, which the 
> application can detect, than flush a session and give this corner
> case?

If I look at the code I've written, I don't know what the session
number is; I just save sessionHandle in a variable for later use (let's
say v1).  If I got the same session number returned at a later time
and placed it in v2, all I'd notice is that an authorization using v1
would fail.  I'm not averse to killing the entire connection but,
assuming you have fallback, it might be kinder simply to ensure that
the operations with the reclaimed session fail (which is what the code
currently does).

> ~~~~
> 
> Part of me says to defer this.  That is:
> 
> 64 sessions / 3 = 21 simultaneous applications.  If we have 21 
> simultaneous TCG applications, we'll all celebrate.  For the DoS,
> chmod and chgrp /dev/tpm and let only well behaved applications in 
> the group.
> 
> Agreed, it's not a long term solution.

My use case is secret protection in the cloud.  I can certainly see
more than 21 applications wanting to do this at roughly the same time.
However, the periods over which they actually all need sessions should
be very short, hence the leasing proposal, which would stagger them.

James

Jason Gunthorpe Jan. 27, 2017, 11:35 p.m. UTC | #7
On Fri, Jan 27, 2017 at 02:04:59PM -0800, James Bottomley wrote:

> if I look at the code I've written, I don't know what the session
> number is, I just save sessionHandle in a variable for later use (lets
> say to v1).  If I got the same session number returned at a later time
> and placed it in v2, all I'd notice is that an authorization using v1
> would fail.

Is there any way that could be used to cause an op thinking it is
using v1 to authorize something it shouldn't?

Jason
James Bottomley Jan. 27, 2017, 11:48 p.m. UTC | #8
On Fri, 2017-01-27 at 16:35 -0700, Jason Gunthorpe wrote:
> On Fri, Jan 27, 2017 at 02:04:59PM -0800, James Bottomley wrote:
> 
> > if I look at the code I've written, I don't know what the session
> > number is, I just save sessionHandle in a variable for later use 
> > (lets say to v1).  If I got the same session number returned at a 
> > later time and placed it in v2, all I'd notice is that an 
> > authorization using v1 would fail.
> 
> Is there any way that could be used to cause an op thinking it is
> using v1 to authorize something it shouldn't?

Not really: in the parameter or HMAC case, you have to compute based on
the initial nonce given by the TPM when the session was created.
Assuming the initial nonce belonged to the evicted session, the HMAC
will now fail because the nonce of the v2 session is different.  There
is a corner case where you track the nonce in a table indexed by
handle, so when v2 is created, its nonce replaces the old v1 nonce in
the table.  Now you can use v1 and v2 without error (because each use
picks up the correct nonce) and effectively they're interchangeable as
the same session.  Even in this case, you're not authorising something
you shouldn't; you're just using one session for the authorisations
where you thought you had two.
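Why a recycled handle cannot silently authorize can be shown with a
toy model (purely illustrative structs and names; a real HMAC session
chains nonces rather than comparing them directly):

```c
#include <string.h>

struct tpm_session {
	unsigned handle;
	unsigned char nonce_tpm[8];	/* nonce issued at session creation */
};

/* The caller computed its auth from saved_nonce; the check can only
 * pass if that matches the nonce of the session currently occupying
 * the handle, so a new session behind an old handle fails. */
int auth_would_pass(const struct tpm_session *cur,
		    unsigned handle, const unsigned char *saved_nonce)
{
	return cur->handle == handle &&
	       memcmp(cur->nonce_tpm, saved_nonce, sizeof cur->nonce_tpm) == 0;
}
```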

James


Ken Goldman Jan. 30, 2017, 12:52 a.m. UTC | #9
On 1/27/2017 5:04 PM, James Bottomley wrote:

>> Beware the nasty corner case:
>>
>> - Application asks for a session and gets 02000000
>>
>> - Time elapses and 02000000 gets forcibly flushed
>>
>> - Later, app comes back, asks for a second session and again gets
>> 02000000.
>>
>> - App gets very confused.
>>
>> May it be better to close the connection completely, which the
>> application can detect, than flush a session and give this corner
>> case?
>
> if I look at the code I've written, I don't know what the session
> number is, I just save sessionHandle in a variable for later use (lets
> say to v1).  If I got the same session number returned at a later time
> and placed it in v2, all I'd notice is that an authorization using v1
> would fail.  I'm not averse to killing the entire connection but,
> assuming you have fallback, it might be kinder simply to ensure that
> the operations with the reclaimed session fail (which is what the code
> currently does).

My worry is that this session failure cannot be detected by the 
application.  An HMAC failure could cause the app to tell a user that 
they entered the wrong password.  Misleading.  On the TPM, it could 
trigger the dictionary attack lockout.  For a PIN index, it could 
consume a failure count.  Killing a policy session that has e.g., a 
policy signed term could cause the application to go back to some 
external entity for another authorization signature.

Let's go up the stack.  What's the attack?

If we're worried about many simultaneous applications (wouldn't that be 
wonderful), why not just let startauthsession fail?  The application can 
just retry periodically.  Just allocate them in triples so there's no 
deadlock.

If we're worried about a DoS attack, killing a session just helps the 
attacker.  The attacker can create a few connections and spin on 
startauthsession, locking everyone out anyway.

~~

Also, let's remember that this is a rare application.  Sessions are only 
needed for remote access (requiring encryption, HMAC or salt), or policy 
sessions.

~~

Should the code also reserve a session for the kernel?  Mark it not 
kill'able?



James Bottomley Jan. 30, 2017, 4:04 p.m. UTC | #10
On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> On 1/27/2017 5:04 PM, James Bottomley wrote:
> 
> > > Beware the nasty corner case:
> > > 
> > > - Application asks for a session and gets 02000000
> > > 
> > > - Time elapses and 02000000 gets forcibly flushed
> > > 
> > > - Later, app comes back, asks for a second session and again gets
> > > 02000000.
> > > 
> > > - App gets very confused.
> > > 
> > > May it be better to close the connection completely, which the
> > > application can detect, than flush a session and give this corner
> > > case?
> > 
> > if I look at the code I've written, I don't know what the session
> > number is, I just save sessionHandle in a variable for later use 
> > (lets say to v1).  If I got the same session number returned at a 
> > later time and placed it in v2, all I'd notice is that an 
> > authorization using v1 would fail.  I'm not averse to killing the 
> > entire connection but, assuming you have fallback, it might be 
> > kinder simply to ensure that the operations with the reclaimed 
> > session fail (which is what the code currently does).
> 
> My worry is that this session failure cannot be detected by the 
> application.  An HMAC failure could cause the app to tell a user that
> they entered the wrong password.  Misleading.  On the TPM, it could 
> trigger the dictionary attack lockout.  For a PIN index, it could 
> consume a failure count.  Killing a policy session that has e.g., a 
> policy signed term could cause the application to go back to some 
> external entity for another authorization signature.
> 
> Let's go up to the stack.  What's the attack?
> 
> If we're worried about many simultaneous applications (wouldn't that 
> be wonderful), why not just let startauthsession fail?  The 
> application can just retry periodically.

How in that scenario do we ensure that a session becomes available?
Once that's established, there's no real difference between retrying
the startauthsession in the kernel when we know the session is
available and forcing userspace to do the retry, except that the
former has a far greater chance of success (and it's only about 6
lines of code).
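A hedged sketch of what that handful of lines could look like; the
return-code value is TPM_RC_SESSION_HANDLES from the commit message,
but the retry bound, the function names, and the results-array driver
(which stands in for issuing the real command) are all invented for
illustration:

```c
#define TPM_RC_SESSION_HANDLES 0x903
#define MAX_RETRIES 3

/* 'results' simulates successive TPM return codes for the retried
 * command; in the kernel the loop would instead sleep on the session
 * slot waitqueue before each reissue. */
int start_session_with_retry(const int *results, int n)
{
	int rc = TPM_RC_SESSION_HANDLES;

	for (int i = 0; i < n && i <= MAX_RETRIES; i++) {
		rc = results[i];
		if (rc != TPM_RC_SESSION_HANDLES)
			break;	/* success, or a different error: stop */
	}
	return rc;
}
```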

>   Just allocate them in triples so there's no deadlock.

Is this the application or the kernel?  If it's the kernel, that adds a
lot of complexity.

> If we're worried about a DoS attack, killing a session just helps the
> attacker.  The attacker can create a few connections and spin on 
> startauthsession, locking everyone out anyway.

There are two considerations here: firstly, we'd need to introduce a
mechanism to "kill" the connection; probably we'd simply error every
command on the space until it was closed.  The second is which scenario
is more reasonable: say the application simply forgot to flush the
session and will never use it again.  Simply reclaiming the session
would produce no effect at all on the application in this scenario.
However, I have no data to say what's likely.

> ~~
> 
> Also, let's remember that this is a rare application.  Sessions are 
> only needed for remote access (requiring encryption, HMAC or salt), 
> or policy sessions.

This depends what your threat model is.  For ssh keys, you worry that
someone might be watching, so you use HMAC authority even for a local
TPM.  In the cloud, you don't quite know where the TPM is, so again
you'd use HMAC sessions ... however, in both use cases the sessions
should be very short lived.

> ~~
> 
> Should the code also reserve a session for the kernel?  Mark it not 
> kill'able?

At the moment, the kernel doesn't use sessions, so let's worry about
that problem at the point it arises (if it ever arises).

James

Jarkko Sakkinen Jan. 30, 2017, 9:58 p.m. UTC | #11
On Mon, Jan 30, 2017 at 08:04:55AM -0800, James Bottomley wrote:
> On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> > On 1/27/2017 5:04 PM, James Bottomley wrote:
> > 
> > > > Beware the nasty corner case:
> > > > 
> > > > - Application asks for a session and gets 02000000
> > > > 
> > > > - Time elapses and 02000000 gets forcibly flushed
> > > > 
> > > > - Later, app comes back, asks for a second session and again gets
> > > > 02000000.
> > > > 
> > > > - App gets very confused.
> > > > 
> > > > May it be better to close the connection completely, which the
> > > > application can detect, than flush a session and give this corner
> > > > case?
> > > 
> > > if I look at the code I've written, I don't know what the session
> > > number is, I just save sessionHandle in a variable for later use 
> > > (lets say to v1).  If I got the same session number returned at a 
> > > later time and placed it in v2, all I'd notice is that an 
> > > authorization using v1 would fail.  I'm not averse to killing the 
> > > entire connection but, assuming you have fallback, it might be 
> > > kinder simply to ensure that the operations with the reclaimed 
> > > session fail (which is what the code currently does).
> > 
> > My worry is that this session failure cannot be detected by the 
> > application.  An HMAC failure could cause the app to tell a user that
> > they entered the wrong password.  Misleading.  On the TPM, it could 
> > trigger the dictionary attack lockout.  For a PIN index, it could 
> > consume a failure count.  Killing a policy session that has e.g., a 
> > policy signed term could cause the application to go back to some 
> > external entity for another authorization signature.
> > 
> > Let's go up to the stack.  What's the attack?
> > 
> > If we're worried about many simultaneous applications (wouldn't that 
> > be wonderful), why not just let startauthsession fail?  The 
> > application can just retry periodically.
> 
> How in that scenario do we ensure that a session becomes available? 
>  Once that's established, there's no real difference between retrying
> the startauthsession in the kernel when we know the session is
> available and forcing userspace to do the retry except that the former
> has a far greater chance of success (and it's only about 6 lines of
> code).
> 
> >   Just allocate them in triples so there's no deadlock.
> 
> Is this the application or the kernel?  If it's the kernel, that adds a
> lot of complexity.
> 
> > If we're worried about a DoS attack, killing a session just helps the
> > attacker.  The attacker can create a few connections and spin on 
> > startauthsession, locking everyone out anyway.
> 
> There are two considerations here: firstly we'd need to introduce a
> mechanism to "kill" the connection.  Probably we'd simply error every
> command on the space until it was closed.  The second is which scenario
> is more reasonable: Say the application simply forgot to flush the
> session and will never use it again.  Simply reclaiming the session
> would produce no effect at all on the application in this scenario. 
>  However, I have no data to say what's likely.
> 
> > ~~
> > 
> > Also, let's remember that this is a rare application.  Sessions are 
> > only needed for remote access (requiring encryption, HMAC or salt), 
> > or policy sessions.
> 
> This depends what your threat model is.  For ssh keys, you worry that
> someone might be watching, so you use HMAC authority even for a local
> TPM.  In the cloud, you don't quite know where the TPM is, so again
> you'd use HMAC sessions ... however, in both use cases the sessions
> should be very short lived.
> 
> > ~~
> > 
> > Should the code also reserve a session for the kernel?  Mark it not 
> > kill'able?
> 
> At the moment, the kernel doesn't use sessions, so let's worry about
> that problem at the point it arises (if it ever arises).
> 
> James

It does. My trusted keys implementation actually uses sessions.

I'm kind of gravitating to the opinion that we should leave this
commit out of the first kernel release that contains the resource
manager, with a similar rationale to the one Jason gave me for
whitelisting: get the basic stuff in, and once it is used with some
workloads, whitelisting and exhaustion handling will eventually take
the right form.

How would you feel about this?

/Jarkko
James Bottomley Jan. 30, 2017, 10:13 p.m. UTC | #12
On Mon, 2017-01-30 at 23:58 +0200, Jarkko Sakkinen wrote:
> On Mon, Jan 30, 2017 at 08:04:55AM -0800, James Bottomley wrote:
> > On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> > > On 1/27/2017 5:04 PM, James Bottomley wrote:
> > > 
> > > > > Beware the nasty corner case:
> > > > > 
> > > > > - Application asks for a session and gets 02000000
> > > > > 
> > > > > - Time elapses and 02000000 gets forcibly flushed
> > > > > 
> > > > > - Later, app comes back, asks for a second session and again
> > > > > gets
> > > > > 02000000.
> > > > > 
> > > > > - App gets very confused.
> > > > > 
> > > > > May it be better to close the connection completely, which
> > > > > the
> > > > > application can detect, than flush a session and give this
> > > > > corner
> > > > > case?
> > > > 
> > > > if I look at the code I've written, I don't know what the
> > > > session
> > > > number is, I just save sessionHandle in a variable for later
> > > > use 
> > > > (lets say to v1).  If I got the same session number returned at
> > > > a 
> > > > later time and placed it in v2, all I'd notice is that an 
> > > > authorization using v1 would fail.  I'm not averse to killing
> > > > the 
> > > > entire connection but, assuming you have fallback, it might be 
> > > > kinder simply to ensure that the operations with the reclaimed 
> > > > session fail (which is what the code currently does).
> > > 
> > > My worry is that this session failure cannot be detected by the 
> > > application.  An HMAC failure could cause the app to tell a user
> > > that
> > > they entered the wrong password.  Misleading.  On the TPM, it
> > > could 
> > > trigger the dictionary attack lockout.  For a PIN index, it could
> > > consume a failure count.  Killing a policy session that has e.g.,
> > > a 
> > > policy signed term could cause the application to go back to some
> > > external entity for another authorization signature.
> > > 
> > > Let's go up to the stack.  What's the attack?
> > > 
> > > If we're worried about many simultaneous applications (wouldn't
> > > that 
> > > be wonderful), why not just let startauthsession fail?  The 
> > > application can just retry periodically.
> > 
> > How in that scenario do we ensure that a session becomes available?
> >  Once that's established, there's no real difference between
> > retrying
> > the startauthsession in the kernel when we know the session is
> > available and forcing userspace to do the retry except that the
> > former
> > has a far greater chance of success (and it's only about 6 lines of
> > code).
> > 
> > >   Just allocate them in triples so there's no deadlock.
> > 
> > Is this the application or the kernel?  If it's the kernel, that
> > adds a
> > lot of complexity.
> > 
> > > If we're worried about a DoS attack, killing a session just helps
> > > the
> > > attacker.  The attacker can create a few connections and spin on 
> > > startauthsession, locking everyone out anyway.
> > 
> > There are two considerations here: firstly we'd need to introduce a
> > mechanism to "kill" the connection.  Probably we'd simply error
> > every
> > command on the space until it was closed.  The second is which
> > scenario
> > is more reasonable: Say the application simply forgot to flush the
> > session and will never use it again.  Simply reclaiming the session
> > would produce no effect at all on the application in this scenario.
> >  However, I have no data to say what's likely.
> > 
> > > ~~
> > > 
> > > Also, let's remember that this is a rare application.  Sessions
> > > are 
> > > only needed for remote access (requiring encryption, HMAC or
> > > salt), 
> > > or policy sessions.
> > 
> > This depends what your threat model is.  For ssh keys, you worry
> > that
> > someone might be watching, so you use HMAC authority even for a
> > local
> > TPM.  In the cloud, you don't quite know where the TPM is, so again
> > you'd use HMAC sessions ... however, in both use cases the sessions
> > should be very short lived.
> > 
> > > ~~
> > > 
> > > Should the code also reserve a session for the kernel?  Mark it
> > > not 
> > > kill'able?
> > 
> > At the moment, the kernel doesn't use sessions, so let's worry
> > about
> > that problem at the point it arises (if it ever arises).
> > 
> > James
> 
> It does. My trusted keys implementation actually uses sessions.

But as I read the code, I can't find where the kernel creates a
session.  It looks like the session and hmac are passed in as option
arguments, aren't they?

> I'm kind of leaning to the opinion that we should leave this commit out 
> of the first kernel release that will contain the resource manager, 
> with similar rationale as Jason gave me for whitelisting: get the 
> basic stuff in, and once it is used with some workloads, whitelisting 
> and exhaustion handling will eventually take the right form.
> 
> How would you feel about this?

As long as we get patch 1/2 then applications using sessions will
actually work with spaces, so taking more time with 2/2 is fine by me.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarkko Sakkinen Jan. 31, 2017, 1:31 p.m. UTC | #13
On Mon, Jan 30, 2017 at 02:13:08PM -0800, James Bottomley wrote:
> On Mon, 2017-01-30 at 23:58 +0200, Jarkko Sakkinen wrote:
> > On Mon, Jan 30, 2017 at 08:04:55AM -0800, James Bottomley wrote:
> > > On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> > > > On 1/27/2017 5:04 PM, James Bottomley wrote:
> > > > 
> > > > > > Beware the nasty corner case:
> > > > > > 
> > > > > > - Application asks for a session and gets 02000000
> > > > > > 
> > > > > > - Time elapses and 02000000 gets forcibly flushed
> > > > > > 
> > > > > > - Later, app comes back, asks for a second session and again
> > > > > > gets
> > > > > > 02000000.
> > > > > > 
> > > > > > - App gets very confused.
> > > > > > 
> > > > > > May it be better to close the connection completely, which
> > > > > > the
> > > > > > application can detect, than flush a session and give this
> > > > > > corner
> > > > > > case?
> > > > > 
> > > > > if I look at the code I've written, I don't know what the
> > > > > session
> > > > > number is, I just save sessionHandle in a variable for later
> > > > > use 
> > > > > (let's say to v1).  If I got the same session number returned at
> > > > > a 
> > > > > later time and placed it in v2, all I'd notice is that an 
> > > > > authorization using v1 would fail.  I'm not averse to killing
> > > > > the 
> > > > > entire connection but, assuming you have fallback, it might be 
> > > > > kinder simply to ensure that the operations with the reclaimed 
> > > > > session fail (which is what the code currently does).
> > > > 
> > > > My worry is that this session failure cannot be detected by the 
> > > > application.  An HMAC failure could cause the app to tell a user
> > > > that
> > > > they entered the wrong password.  Misleading.  On the TPM, it
> > > > could 
> > > > trigger the dictionary attack lockout.  For a PIN index, it could
> > > > consume a failure count.  Killing a policy session that has e.g.,
> > > > a 
> > > > policy signed term could cause the application to go back to some
> > > > external entity for another authorization signature.
> > > > 
> > > > Let's go up to the stack.  What's the attack?
> > > > 
> > > > If we're worried about many simultaneous applications (wouldn't
> > > > that 
> > > > be wonderful), why not just let startauthsession fail?  The 
> > > > application can just retry periodically.
> > > 
> > > How in that scenario do we ensure that a session becomes available?
> > >  Once that's established, there's no real difference between
> > > retrying
> > > the startauthsession in the kernel when we know the session is
> > > available and forcing userspace to do the retry except that the
> > > former
> > > has a far greater chance of success (and it's only about 6 lines of
> > > code).
> > > 
> > > >   Just allocate them in triples so there's no deadlock.
> > > 
> > > Is this the application or the kernel?  If it's the kernel, that
> > > adds a
> > > lot of complexity.
> > > 
> > > > If we're worried about a DoS attack, killing a session just helps
> > > > the
> > > > attacker.  The attacker can create a few connections and spin on 
> > > > startauthsession, locking everyone out anyway.
> > > 
> > > There are two considerations here: firstly we'd need to introduce a
> > > mechanism to "kill" the connection.  Probably we'd simply error
> > > every
> > > command on the space until it was closed.  The second is which
> > > scenario
> > > is more reasonable: Say the application simply forgot to flush the
> > > session and will never use it again.  Simply reclaiming the session
> > > would produce no effect at all on the application in this scenario.
> > >  However, I have no data to say what's likely.
> > > 
> > > > ~~
> > > > 
> > > > Also, let's remember that this is a rare application.  Sessions
> > > > are 
> > > > only needed for remote access (requiring encryption, HMAC or
> > > > salt), 
> > > > or policy sessions.
> > > 
> > > This depends what your threat model is.  For ssh keys, you worry
> > > that
> > > someone might be watching, so you use HMAC authority even for a
> > > local
> > > TPM.  In the cloud, you don't quite know where the TPM is, so again
> > > you'd use HMAC sessions ... however, in both use cases the sessions
> > > should be very short lived.
> > > 
> > > > ~~
> > > > 
> > > > Should the code also reserve a session for the kernel?  Mark it
> > > > not 
> > > > kill'able?
> > > 
> > > At the moment, the kernel doesn't use sessions, so let's worry
> > > about
> > > that problem at the point it arises (if it ever arises).
> > > 
> > > James
> > 
> > It does. My trusted keys implementation actually uses sessions.
> 
> But as I read the code, I can't find where the kernel creates a
> session.  It looks like the session and hmac are passed in as option
> arguments, aren't they?

Yes. Sorry, I mixed things up.

> > I'm kind of leaning to the opinion that we should leave this commit out 
> > of the first kernel release that will contain the resource manager, 
> > with similar rationale as Jason gave me for whitelisting: get the 
> > basic stuff in, and once it is used with some workloads, whitelisting 
> > and exhaustion handling will eventually take the right form.
> > 
> > How would you feel about this?
> 
> As long as we get patch 1/2 then applications using sessions will
> actually work with spaces, so taking more time with 2/2 is fine by me.
> 
> James

1/2 contains code that, with a few more iterations, will be in a form
that I'm able to merge.

With 2/2, I'm not saying it is the wrong approach, but I cannot yet say
with confidence that it would be the best approach.

I think that the transient object and infrastructure stuff already in
the patch set, plus the 1/2 session patch, form the subset of commits
where we can be fairly confident that we are doing the right thing.

I'll start preparing a patch set with this content without RFC tag.

/Jarkko
James Bottomley Jan. 31, 2017, 7:55 p.m. UTC | #14
On Tue, 2017-01-31 at 14:28 -0500, Ken Goldman wrote:
> On 1/30/2017 11:04 AM, James Bottomley wrote:
> > 
> > This depends what your threat model is.  For ssh keys, you worry
> > that someone might be watching, so you use HMAC authority even for 
> > a local TPM.
> 
> If someone can "watch" my local process, they can capture my password
> anyway.  Does using a password that the attacker knows to HMAC the 
> command help?

It's about attack surface.  If you want my password and I use TPM_RS_PW
then you either prise it out of my app or snoop the command path.  If I
always use HMAC, I know you can only prise it out of my app (reduction
in attack surface) and I can plan defences accordingly (not saying I'll
be successful, just saying I have a better idea where the attack is
coming from).

> > In the cloud, you don't quite know where the TPM is, so again you'd
> > use HMAC sessions ... however, in both use cases the sessions 
> > should be very short lived.
> 
> If your entire application is in the cloud, then I think the same 
> question as above applies.
> 
> If you have your application on one platform (that you trust) and the
> TPM is on another (that you don't trust), then I absolutely agree 
> that HMAC (and parameter encryption) are necessary.

It's attack surface again ... although lengthening the transmission
pathway, which happens in the cloud, correspondingly increases that
surface.

Look at it this way: if your TPM were network remote, would you still
think TPM_RS_PW to be appropriate?  I suspect not because the network
is seen as a very insecure pathway.  We can argue about the relative
security or insecurity of other pathways to the TPM, but it's
unarguable that using HMAC and parameter encryption means we don't have
to (and so is best practice).
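To make the distinction concrete, here is a toy sketch of which
secret-bearing bytes actually travel on the command path in each case.
This is not real TPM2 marshalling: the TPM_RS_PW value comes from the
TPM 2.0 specification, but the struct layout and helper names are
invented for illustration, and the HMAC computation is stubbed out.

```c
#include <stdint.h>
#include <string.h>

/* Password authorization pseudo-handle, per the TPM 2.0 specification */
#define TPM_RS_PW 0x40000009u

/*
 * Toy model of a command authorization area.  Real TPM2 marshalling is
 * more involved (nonces, session attributes, big-endian encoding); this
 * only shows what an observer of the command path gets to see.
 */
struct auth_area {
	uint32_t session_handle;
	uint16_t auth_len;
	uint8_t  auth[32];	/* cleartext password, or an HMAC digest */
};

/* TPM_RS_PW: the password itself is copied into the command buffer */
static void auth_with_password(struct auth_area *a, const uint8_t *pw,
			       uint16_t len)
{
	a->session_handle = TPM_RS_PW;
	a->auth_len = len;
	memcpy(a->auth, pw, len);
}

/*
 * HMAC session: only a keyed digest of the command goes on the wire;
 * the password never leaves the application.  The digest is a stub
 * here; a real implementation computes it from the session key and the
 * command parameter hash.
 */
static void auth_with_hmac(struct auth_area *a, uint32_t session_handle,
			   const uint8_t digest[32])
{
	a->session_handle = session_handle;	/* e.g. 0x02000000 */
	a->auth_len = 32;
	memcpy(a->auth, digest, 32);
}
```

Anyone snooping the command path sees the auth bytes either way, but in
the TPM_RS_PW case those bytes are the password itself, so the attack
surface includes every hop between the application and the TPM.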

James


Patch

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index a625884..c959b09 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -164,6 +164,7 @@  struct tpm_chip *tpm_chip_alloc(struct device *pdev,
 
 	mutex_init(&chip->tpm_mutex);
 	init_rwsem(&chip->ops_sem);
+	init_waitqueue_head(&chip->session_wait);
 
 	chip->ops = ops;
 
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 9923daa..38cc21c 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -95,6 +95,7 @@  enum tpm2_return_codes {
 	TPM2_RC_HANDLE		= 0x008B,
 	TPM2_RC_INITIALIZE	= 0x0100, /* RC_VER1 */
 	TPM2_RC_DISABLED	= 0x0120,
+	TPM2_RC_SESSION_HANDLES	= 0x0905,
 	TPM2_RC_TESTING		= 0x090A, /* RC_WARN */
 };
 
@@ -136,7 +137,8 @@  enum tpm2_capabilities {
 };
 
 enum tpm2_properties {
-	TPM_PT_TOTAL_COMMANDS	= 0x0129,
+	TPM_PT_TOTAL_COMMANDS		= 0x0129,
+	TPM_PT_ACTIVE_SESSIONS_MAX	= 0x0111,
 };
 
 enum tpm2_startup_types {
@@ -160,8 +162,24 @@  struct tpm_space {
 	u8 *context_buf;
 	u32 session_tbl[6];
 	u8 *session_buf;
+	u32 reserved_handle;
 };
 
+#define TPM2_HANDLE_FORCE_EVICT 0xFFFFFFFF
+
+static inline void tpm2_session_force_evict(struct tpm_space *space)
+{
+	/* if reserved handle is not empty, we already have a
+	 * session for eviction, so no need to force one
+	 */
+	if (space->reserved_handle == 0)
+		space->reserved_handle = TPM2_HANDLE_FORCE_EVICT;
+}
+static inline bool tpm2_is_session_force_evict(struct tpm_space *space)
+{
+	return space->reserved_handle == TPM2_HANDLE_FORCE_EVICT;
+}
+
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
 	TPM_CHIP_FLAG_IRQ		= BIT(2),
@@ -174,6 +192,12 @@  struct tpm_chip_seqops {
 	const struct seq_operations *seqops;
 };
 
+struct tpm_sessions {
+	struct tpm_space *space;
+	u32 handle;
+	unsigned long created;
+};
+
 struct tpm_chip {
 	struct device dev, devrm;
 	struct cdev cdev, cdevrm;
@@ -214,8 +238,12 @@  struct tpm_chip {
 #endif /* CONFIG_ACPI */
 
 	struct tpm_space work_space;
+	struct tpm_space *space;
 	u32 nr_commands;
 	u32 *cc_attrs_tbl;
+	struct tpm_sessions *sessions;
+	int max_sessions;
+	wait_queue_head_t session_wait;
 };
 
 #define to_tpm_chip(d) container_of(d, struct tpm_chip, dev)
@@ -568,6 +596,13 @@  int tpm2_pcr_extend(struct tpm_chip *chip, int pcr_idx, const u8 *hash);
 int tpm2_get_random(struct tpm_chip *chip, u8 *out, size_t max);
 void tpm2_flush_context_cmd(struct tpm_chip *chip, u32 handle,
 			    unsigned int flags);
+static inline void tpm2_session_clear_reserved(struct tpm_chip *chip,
+					       struct tpm_space *space)
+{
+	if (space->reserved_handle && !tpm2_is_session_force_evict(space))
+		tpm2_flush_context_cmd(chip, space->reserved_handle, 0);
+	space->reserved_handle = 0;
+}
 int tpm2_seal_trusted(struct tpm_chip *chip,
 		      struct trusted_key_payload *payload,
 		      struct trusted_key_options *options);
diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index e1c1bbd..ac5c0a2 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -1007,6 +1007,7 @@  int tpm2_auto_startup(struct tpm_chip *chip)
 {
 	struct tpm_buf buf;
 	u32 nr_commands;
+	u32 nr_sessions;
 	int rc;
 	int i;
 
@@ -1067,6 +1068,20 @@  int tpm2_auto_startup(struct tpm_chip *chip)
 	chip->nr_commands = nr_commands;
 	tpm_buf_destroy(&buf);
 
+	rc = tpm2_get_tpm_pt(chip, TPM_PT_ACTIVE_SESSIONS_MAX,
+			     &nr_sessions, NULL);
+	if (rc)
+		goto out;
+
+	if (nr_sessions > 256)
+		nr_sessions = 256;
+
+	chip->max_sessions = nr_sessions;
+	chip->sessions = devm_kzalloc(&chip->dev,
+				      nr_sessions * sizeof(*chip->sessions),
+				      GFP_KERNEL);
+	if (!chip->sessions)
+		rc = -ENOMEM;
 out:
 	if (rc > 0)
 		rc = -ENODEV;
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 04c9431..42c8c84 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -34,6 +34,169 @@  struct tpm2_context {
 	__be16 blob_size;
 } __packed;
 
+static struct tpm_sessions *tpm2_session_chip_get(struct tpm_chip *chip)
+{
+	int i;
+
+	for (i = 0; i < chip->max_sessions; i++)
+		if (chip->sessions[i].space == NULL)
+			return &chip->sessions[i];
+
+	return NULL;
+}
+
+static struct tpm_sessions *tpm2_session_chip_find_oldest(struct tpm_chip *chip)
+{
+	struct tpm_sessions *sess = NULL;
+	int i;
+
+	for (i = 0; i < chip->max_sessions; i++) {
+		if (chip->sessions[i].space == NULL)
+			continue;
+
+		if (!sess || time_after(sess->created,
+					chip->sessions[i].created))
+			sess = &chip->sessions[i];
+	}
+
+	return sess;
+}
+
+static void tpm2_session_chip_add(struct tpm_chip *chip,
+				  struct tpm_space *space, u32 h)
+{
+	struct tpm_sessions *sess = tpm2_session_chip_get(chip);
+
+	sess->space = space;
+	sess->handle = h;
+	sess->created = jiffies;
+	dev_info(&chip->dev, "Added session at %td, handle %08x\n", sess - chip->sessions, h);
+}
+
+static void tpm2_session_chip_remove(struct tpm_chip *chip, u32 h)
+{
+	int i;
+
+	for (i = 0; i < chip->max_sessions; i++)
+		if (chip->sessions[i].handle == h)
+			break;
+	if (i == chip->max_sessions) {
+		dev_warn(&chip->dev, "Missing session %08x", h);
+		return;
+	}
+
+	memset(&chip->sessions[i], 0, sizeof(chip->sessions[i]));
+	dev_info(&chip->dev, "Removed session at %d\n", i);
+	wake_up(&chip->session_wait);
+}
+
+static int tpm2_session_forget(struct tpm_chip *chip, struct tpm_space *space,
+			       u32 handle)
+{
+	int i, j;
+	struct tpm2_context *ctx;
+
+	for (i = 0, j = 0; i < ARRAY_SIZE(space->session_tbl); i++) {
+		if (space->session_tbl[i] == 0)
+			continue;
+
+		ctx = (struct tpm2_context *)&space->session_buf[j];
+		j += sizeof(*ctx) + get_unaligned_be16(&ctx->blob_size);
+
+		if (space->session_tbl[i] != handle)
+			continue;
+
+		/* forget the session context */
+		memcpy(ctx, &space->session_buf[j], PAGE_SIZE - j);
+		tpm2_session_chip_remove(chip, handle);
+		space->session_tbl[i] = 0;
+		break;
+	}
+	if (i == ARRAY_SIZE(space->session_tbl))
+		return -EINVAL;
+	return 0;
+}
+
+static int tpm2_session_wait(struct tpm_chip *chip, struct tpm_space *space)
+{
+	int rc, failed;
+	struct tpm_sessions *sess;
+	const unsigned long min_timeout = msecs_to_jiffies(2000);
+	unsigned long timeout = min_timeout;
+	DEFINE_WAIT(wait);
+
+	for (failed = 0; ; ) {
+		prepare_to_wait(&chip->session_wait, &wait, TASK_INTERRUPTIBLE);
+
+		mutex_unlock(&chip->tpm_mutex);
+		rc = schedule_timeout_interruptible(timeout);
+		mutex_lock(&chip->tpm_mutex);
+
+		finish_wait(&chip->session_wait, &wait);
+
+		if (signal_pending(current))
+			/* got interrupted */
+			return -EINTR;
+
+		if (rc > 0 && !tpm2_is_session_force_evict(space))
+			/* got woken, so slot is free.  We don't
+			 * reserve the slot here because a) we can't
+			 * (no pending session in the TPM to evict)
+			 * and b) no-one is hogging sessions, so no
+			 * evidence of need.
+			 */
+			return 0;
+
+		/* timed out or victim required; select a victim
+		 * session to kill
+		 */
+		sess = tpm2_session_chip_find_oldest(chip);
+		if (sess == NULL) {
+			/* we get here when we can't create a session
+			 * but there are no listed active sessions
+			 * meaning they're all in various space
+			 * structures as victim sessions.  The wait
+			 * queue is a fair sequence, so we need to
+			 * wait a bit harder
+			 */
+			if (failed++ > 3)
+				break;
+			timeout *= 2;
+			dev_info(&chip->dev, "failed to get session, waiting for %us\n", jiffies_to_msecs(timeout)/1000);
+			continue;
+		}
+		/* is the victim old enough? */
+		timeout = jiffies - sess->created;
+		if (timeout > min_timeout)
+			break;
+		/* otherwise wait until the victim is old enough */
+		timeout = min_timeout - timeout;
+	}
+	if (sess == NULL)
+		/* still can't get a victim, give up */
+		return -EINVAL;
+
+	/* store the physical handle */
+	space->reserved_handle = sess->handle;
+	dev_info(&chip->dev, "Selecting handle %08x for eviction\n",
+		 space->reserved_handle);
+
+	/* cause a mapping failure if this session handle is
+	 * ever used in the victim space again
+	 */
+	tpm2_session_forget(chip, sess->space, sess->handle);
+	/* clear the session, but don't wake any other waiters */
+	memset(sess, 0, sizeof(*sess));
+	/* so now we have a saved physical handle but this handle is
+	 * still in the tpm.  After this we repeat the command, but
+	 * flush the handle once we obtain the tpm_mutex on the repeat
+	 * so, in theory, we should have a free handle to
+	 * re-execute
+	 */
+
+	return 0;
+}
+
 static int tpm2_context_save(struct tpm_chip *chip, u8 *area,
 			     int *offset, u32 handle)
 {
@@ -124,9 +287,9 @@  static int tpm2_session_find(struct tpm_space *space, u32 handle)
 	return i;
 }
 
-static int tpm2_session_add(struct tpm_chip *chip,
-			    struct tpm_space *space, u32 handle)
+static int tpm2_session_add(struct tpm_chip *chip, u32 handle)
 {
+	struct tpm_space *space = &chip->work_space;
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(space->session_tbl); i++)
@@ -139,35 +302,11 @@  static int tpm2_session_add(struct tpm_chip *chip,
 	}
 
 	space->session_tbl[i] = handle;
+	tpm2_session_chip_add(chip, chip->space, handle);
 
 	return 0;
 }
 
-static int tpm2_session_forget(struct tpm_space *space, u32 handle)
-{
-	int i, j;
-	struct tpm2_context *ctx;
-
-	for (i = 0, j = 0; i < ARRAY_SIZE(space->session_tbl); i++) {
-		if (space->session_tbl[i] == 0)
-			continue;
-
-		ctx = (struct tpm2_context *)&space->session_buf[j];
-		j += sizeof(*ctx) + get_unaligned_be16(&ctx->blob_size);
-
-		if (space->session_tbl[i] != handle)
-			continue;
-
-		/* forget the session context */
-		memcpy(ctx, &space->session_buf[j], PAGE_SIZE - j);
-		space->session_tbl[i] = 0;
-		break;
-	}
-	if (i == ARRAY_SIZE(space->session_tbl))
-		return -EINVAL;
-	return 0;
-}
-
 /* if a space is active, emulate some commands */
 static int tpm2_intercept(struct tpm_chip *chip, u32 cc, u8 *buf, size_t bufsiz)
 {
@@ -187,7 +326,7 @@  static int tpm2_intercept(struct tpm_chip *chip, u32 cc, u8 *buf, size_t bufsiz)
 		/* let the TPM figure out and return the error */
 		return 0;
 
-	return tpm2_session_forget(space, handle);
+	return tpm2_session_forget(chip, space, handle);
 }
 
 void tpm2_flush_space(struct tpm_chip *chip, struct tpm_space *space)
@@ -200,10 +339,19 @@  void tpm2_flush_space(struct tpm_chip *chip, struct tpm_space *space)
 					       TPM_TRANSMIT_UNLOCKED);
 
 	for (i = 0; i < ARRAY_SIZE(space->session_tbl); i++) {
+		if (!space->session_tbl[i])
+			continue;
+
 		space->session_tbl[i] &= ~TPM2_HT_TAG_FOR_FLUSH;
-		if (space->session_tbl[i])
-			tpm2_flush_context_cmd(chip, space->session_tbl[i],
-					       TPM_TRANSMIT_UNLOCKED);
+		tpm2_session_chip_remove(chip, space->session_tbl[i]);
+		tpm2_flush_context_cmd(chip, space->session_tbl[i],
+				       TPM_TRANSMIT_UNLOCKED);
+	}
+	if (space->reserved_handle && !tpm2_is_session_force_evict(space)) {
+		tpm2_flush_context_cmd(chip, space->reserved_handle,
+				       TPM_TRANSMIT_UNLOCKED);
+		space->reserved_handle = 0;
+		/* subtlety here: if force evict is set, we don't clear it */
 	}
 }
 
@@ -264,11 +412,13 @@  static void tpm2_unmap_sessions(struct tpm_chip *chip, u32 rc)
 		if ((space->session_tbl[i] & TPM2_HT_TAG_FOR_FLUSH) !=
 		    TPM2_HT_TAG_FOR_FLUSH)
 			continue;
-		if (rc == TPM2_RC_SUCCESS)
+
+		/* for unsuccessful command, keep session */
+		space->session_tbl[i] &= ~TPM2_HT_TAG_FOR_FLUSH;
+		if (rc == TPM2_RC_SUCCESS) {
+			tpm2_session_chip_remove(chip, space->session_tbl[i]);
 			space->session_tbl[i] = 0;
-		else
-			/* for unsuccessful command, keep session */
-			space->session_tbl[i] &= ~TPM2_HT_TAG_FOR_FLUSH;
+		}
 	}
 }
 
@@ -387,6 +537,7 @@  int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space,
 	       sizeof(space->session_tbl));
 	memcpy(chip->work_space.context_buf, space->context_buf, PAGE_SIZE);
 	memcpy(chip->work_space.session_buf, space->session_buf, PAGE_SIZE);
+	chip->space = space;
 
 	rc = tpm2_intercept(chip, cc, buf, bufsiz);
 	if (rc)
@@ -400,16 +551,28 @@  int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space,
 	if (rc)
 		return rc;
 
+	if (space->reserved_handle && !tpm2_is_session_force_evict(space)) {
+		/* this is a trick to allow a previous command which
+		 * failed because it was out of handle space to
+		 * succeed.  The handle is still in the TPM, so now we
+		 * flush it under the tpm_mutex which should ensure we
+		 * can create a new one
+		 */
+		tpm2_flush_context_cmd(chip, space->reserved_handle,
+				       TPM_TRANSMIT_UNLOCKED);
+		space->reserved_handle = 0;
+	}
+
 	return 0;
 }
 
-static int tpm2_map_response(struct tpm_chip *chip, u32 cc, u8 *rsp, size_t len)
+static int tpm2_map_response(struct tpm_chip *chip, u32 cc, u8 *rsp, size_t len,
+			     u32 return_code)
 {
 	struct tpm_space *space = &chip->work_space;
 	u32 phandle, phandle_type;
 	u32 vhandle;
 	u32 attrs;
-	u32 return_code = get_unaligned_be32((__be32 *)&rsp[6]);
 	u16 tag = get_unaligned_be16((__be16 *)rsp);
 	int i;
 	int rc;
@@ -439,7 +602,7 @@  static int tpm2_map_response(struct tpm_chip *chip, u32 cc, u8 *rsp, size_t len)
 		return 0;
 
 	if (phandle_type != TPM2_HT_TRANSIENT)
-		return tpm2_session_add(chip, space, phandle);
+		return tpm2_session_add(chip, phandle);
 
 	/* Garbage collect a dead context. */
 	for (i = 0; i < ARRAY_SIZE(space->context_tbl); i++) {
@@ -521,11 +684,12 @@  int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
 		      u32 cc, u8 *buf, size_t bufsiz)
 {
 	int rc;
+	u32 return_code = get_unaligned_be32((__be32 *)&buf[6]);
 
 	if (!space)
 		return 0;
 
-	rc = tpm2_map_response(chip, cc, buf, bufsiz);
+	rc = tpm2_map_response(chip, cc, buf, bufsiz, return_code);
 	if (rc)
 		return rc;
 
@@ -539,6 +703,12 @@  int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
 	       sizeof(space->session_tbl));
 	memcpy(space->context_buf, chip->work_space.context_buf, PAGE_SIZE);
 	memcpy(space->session_buf, chip->work_space.session_buf, PAGE_SIZE);
+	chip->space = NULL;
+
+	if (return_code == TPM2_RC_SESSION_HANDLES) {
+		tpm2_session_wait(chip, space);
+		return -EAGAIN;
+	}
 
 	return 0;
 }
diff --git a/drivers/char/tpm/tpms-dev.c b/drivers/char/tpm/tpms-dev.c
index 12b6e34..b13b000 100644
--- a/drivers/char/tpm/tpms-dev.c
+++ b/drivers/char/tpm/tpms-dev.c
@@ -56,8 +56,23 @@  ssize_t tpms_write(struct file *file, const char __user *buf,
 {
 	struct file_priv *fpriv = file->private_data;
 	struct tpms_priv *priv = container_of(fpriv, struct tpms_priv, priv);
+	int count = 0;
+	const int max_count = 3; /* number of retries */
+	int rc;
 
-	return tpm_common_write(file, buf, size, off, &priv->space);
+	for (count = 0; count < max_count; count++) {
+		rc = tpm_common_write(file, buf, size, off, &priv->space);
+		if (rc != -EAGAIN)
+			break;
+		if (count == max_count - 2)
+			/* second to last go around, force an eviction if
+			 * this go fails, so final go should succeed
+			 */
+			tpm2_session_force_evict(&priv->space);
+	}
+	tpm2_session_clear_reserved(fpriv->chip, &priv->space);
+
+	return rc;
 }
 
 const struct file_operations tpm_rm_fops = {