Message ID | 20160713111006.GF28723@dhcp22.suse.cz (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Mike Snitzer |
On Wed, 13 Jul 2016, Milan Broz wrote:

> On 07/13/2016 02:50 PM, Michal Hocko wrote:
> > On Wed 13-07-16 13:10:06, Michal Hocko wrote:
> >> On Tue 12-07-16 19:44:11, Mikulas Patocka wrote:
> > [...]
> >>> As long as swapping is in progress, the free memory is below the limit
> >>> (because the swapping activity itself consumes any memory over the limit).
> >>> And that triggered the OOM killer prematurely.
> >>
> >> I am not sure I understand the last part. Are you saying that we trigger
> >> OOM because the initiated swapout will not be able to finish the IO and
> >> thus release the page in time?
> >>
> >> The oom detection waits for an ongoing writeout if there is no
> >> reclaim progress and at least half of the reclaimable memory is either
> >> dirty or under writeback. Pages under swapout are marked as under
> >> writeback AFAIR. The writeout path (dm-crypt worker in this case) should
> >> be able to allocate memory from the mempool, hand it over to the crypt
> >> layer and finish the IO. Is it possible this might take a lot of time?
> >
> > I am not familiar with the crypto API but from what I understood from
> > crypt_convert the encryption is done asynchronously. Then I got lost in
> > the indirection. Who is completing the request and from what kind of
> > context? Is it possible it wouldn't be runnable for a long time?
>
> If you mean crypt_convert in dm-crypt, then it can do asynchronous completion
> but usually (with AES-NI and sw implementations) it runs the operation completely
> synchronously.
> Asynchronous processing is quite rare, usually only on some specific hardware
> crypto accelerators.
>
> Once the encryption is finished, the cloned bio is sent to the block
> layer for processing.
> (There is also some magic with sorting writes but Mikulas knows this better.)

dm-crypt receives requests in crypt_map, then it distributes write
requests to multiple encryption threads. Encryption is usually done
synchronously; asynchronous completion is used only with some PCI
cards that accelerate encryption. When encryption finishes, the encrypted
pages are submitted to a thread, dmcrypt_write, that sorts the requests
using an rbtree and submits them.

The block layer has a deficiency: it cannot merge adjacent requests
submitted by different threads. If we submitted requests directly from
the encryption threads, the lack of merging would seriously degrade
performance.

Mikulas

> Milan
> p.s. I added cc to dm-devel, some dmcrypt people read only this list.
>

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
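To illustrate why sorting before submission helps merging, here is a minimal user-space sketch. It is not the dm-crypt code: struct wreq and sort_and_merge are hypothetical names, qsort stands in for the rbtree, and the merge rule (coalesce only requests that are exactly adjacent) is a simplification of what the block layer really does.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a write request: start sector plus length. */
struct wreq {
	unsigned long long sector;
	unsigned int nr_sectors;
};

/* Order requests by start sector (overflow-safe three-way compare). */
int wreq_cmp_sector(const void *a, const void *b)
{
	const struct wreq *x = a;
	const struct wreq *y = b;

	return (x->sector > y->sector) - (x->sector < y->sector);
}

/*
 * Sort the requests by sector, then coalesce exactly adjacent ones in
 * place. Returns the number of requests left after merging.
 * Assumption: qsort replaces the rbtree used by the real write thread.
 */
size_t sort_and_merge(struct wreq *reqs, size_t n)
{
	size_t out = 0;
	size_t i;

	if (n == 0)
		return 0;

	qsort(reqs, n, sizeof(*reqs), wreq_cmp_sector);

	for (i = 1; i < n; i++) {
		if (reqs[out].sector + reqs[out].nr_sectors == reqs[i].sector)
			reqs[out].nr_sectors += reqs[i].nr_sectors; /* adjacent: merge */
		else
			reqs[++out] = reqs[i]; /* gap: start a new request */
	}
	return out + 1;
}
```

Requests that arrive interleaved from several threads (for example sectors 16, 0, 24, 8) become one contiguous run once sorted, which is the effect the single submitting thread is meant to achieve.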
On Wed 13-07-16 11:21:41, Mikulas Patocka wrote:
[...]
> dm-crypt receives requests in crypt_map, then it distributes write
> requests to multiple encryption threads. Encryption is usually done
> synchronously; asynchronous completion is used only with some PCI
> cards that accelerate encryption. When encryption finishes, the encrypted
> pages are submitted to a thread, dmcrypt_write, that sorts the requests
> using an rbtree and submits them.

OK. I was worried that the async context would depend on WQ and that a
lack of workers could lead to long stalls. Dedicated kernel threads seem
sufficient.

Thanks for the clarification.
On 07/14/2016 11:09 AM, Michal Hocko wrote:
> On Wed 13-07-16 11:21:41, Mikulas Patocka wrote:
[...]
>> dm-crypt receives requests in crypt_map, then it distributes write
>> requests to multiple encryption threads. Encryption is usually done
>> synchronously; asynchronous completion is used only with some PCI
>> cards that accelerate encryption. When encryption finishes, the encrypted
>> pages are submitted to a thread, dmcrypt_write, that sorts the requests
>> using an rbtree and submits them.
>
> OK. I was worried that the async context would depend on WQ and that a
> lack of workers could lead to long stalls. Dedicated kernel threads seem
> sufficient.

Just for the record - if there is a suspicion that some crypto operation
causes a problem, dm-crypt can use the null cipher. This degrades
encryption/decryption to a plain memcpy inside the crypto API but leaves
all the workqueues and the tooling around them the same.
(I added it to cryptsetup to make it easy to configure; it was in fact
intended for testing dm-crypt's non-crypto overhead.)

Anyway, thanks for looking into this!

Milan
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 4f3cb3554944..0b806810efab 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1392,11 +1392,14 @@ static void kcryptd_async_done(struct crypto_async_request *async_req,
 static void kcryptd_crypt(struct work_struct *work)
 {
 	struct dm_crypt_io *io = container_of(work, struct dm_crypt_io, work);
+	unsigned int pflags = current->flags;
 
+	current->flags |= PF_LESS_THROTTLE;
 	if (bio_data_dir(io->base_bio) == READ)
 		kcryptd_crypt_read_convert(io);
 	else
 		kcryptd_crypt_write_convert(io);
+	tsk_restore_flags(current, pflags, PF_LESS_THROTTLE);
 }
 
 static void kcryptd_queue_crypt(struct dm_crypt_io *io)
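
The save/set/restore pattern in the patch can be tried out in user space with the sketch below. Everything named demo_* or DEMO_* is hypothetical and exists only for illustration: the bit value is arbitrary, and demo_restore_flags merely mirrors the clear-then-reapply shape of the kernel's tsk_restore_flags(). The point it shows is that the flag ends up as it was before the call, whether or not it was already set.

```c
/* Illustrative bit only; the real PF_LESS_THROTTLE is defined by the kernel. */
#define DEMO_LESS_THROTTLE 0x00100000u

/* Hypothetical stand-in for the task structure: just a flags word. */
struct demo_task {
	unsigned int flags;
};

/* Clear the masked bits, then put back whatever they were originally. */
void demo_restore_flags(struct demo_task *t, unsigned int orig,
			unsigned int mask)
{
	t->flags &= ~mask;
	t->flags |= orig & mask;
}

/* Example work callback: reports the flags it sees while running. */
unsigned int demo_observe_flags(const struct demo_task *t)
{
	return t->flags;
}

/*
 * Run work() with DEMO_LESS_THROTTLE set, then restore the bit to its
 * previous state. Returns whatever work() returned.
 */
unsigned int run_with_less_throttle(struct demo_task *t,
		unsigned int (*work)(const struct demo_task *))
{
	unsigned int saved = t->flags;
	unsigned int ret;

	t->flags |= DEMO_LESS_THROTTLE;
	ret = work(t);
	demo_restore_flags(t, saved, DEMO_LESS_THROTTLE);
	return ret;
}
```

Restoring from the saved value, rather than unconditionally clearing the bit, keeps the flag intact for a caller that had already set it.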