diff mbox series

[v2,11/11] crypto: qat - add resubmit logic for decompression

Message ID 20221123121032.71991-12-giovanni.cabiddu@intel.com (mailing list archive)
State Changes Requested
Delegated to: Herbert Xu
Headers show
Series crypto: qat - enable compression deflate algorithm | expand

Commit Message

Cabiddu, Giovanni Nov. 23, 2022, 12:10 p.m. UTC
The acomp API allows to send requests with a NULL destination buffer. In
this case, the algorithm implementation needs to allocate the
destination
scatter list, perform the operation and return the buffer to the user.
For decompression, data is likely to expand and be bigger than the
allocated buffer.

This implements a re-submission mechanism for decompression requests
that is triggered if the destination buffer, allocated by the driver,
is not sufficiently big to store the output from decompression.

If an overflow is detected when processing the callback for a
decompression request with a NULL destination buffer, a workqueue is
scheduled. This allocates a new scatter list of the size of the
previously allocated destination buffer, combines it with that, creates
a new firmware scatter list and resubmits the job to the hardware
accelerator.
If the number of retries exceeds 5, an error is returned to the user.

Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
---
 drivers/crypto/qat/qat_common/qat_bl.c        | 227 ++++++++++++++++++
 drivers/crypto/qat/qat_common/qat_bl.h        |   6 +
 drivers/crypto/qat/qat_common/qat_comp_algs.c |  76 ++++++
 drivers/crypto/qat/qat_common/qat_comp_req.h  |  10 +
 4 files changed, 319 insertions(+)

Comments

Herbert Xu Nov. 25, 2022, 4:54 a.m. UTC | #1
On Wed, Nov 23, 2022 at 12:10:32PM +0000, Giovanni Cabiddu wrote:
>
> +#define MAX_NULL_DST_RETRIES 5

Why do we need this limit? Doesn't that mean that anything that
compresses by a factor of more than 2^5 or 2^6 will fail?

Perhaps it would also be wise to enforce a minimum for dlen in
case of a NULL dst, perhaps PAGE_SIZE just in case slen is very
small and it would take many doublings to get to the right size.

Thanks,
Cabiddu, Giovanni Nov. 25, 2022, 9:46 a.m. UTC | #2
On Fri, Nov 25, 2022 at 12:54:27PM +0800, Herbert Xu wrote:
> On Wed, Nov 23, 2022 at 12:10:32PM +0000, Giovanni Cabiddu wrote:
> >
> > +#define MAX_NULL_DST_RETRIES 5
> 
> Why do we need this limit?
I wanted a cap on the number of retries. What if the input is a zip
bomb [1]?

The alternative would be to retry up to the point the allocation of the
destination scatterlist fails.
What do you suggest?

> Doesn't that mean that anything that
> compresses by a factor of more than 2^5 or 2^6 will fail?
Anything that decompresses by more than (src size * 2 rounded up to 4K) * 2^6
will fail.

> Perhaps it would also be wise to enforce a minimum for dlen in
> case of a NULL dst, perhaps PAGE_SIZE just in case slen is very
> small and it would take many doublings to get to the right size.
This is done already. See this snippet from qat_comp_alg_compress_decompress():

	/* Handle acomp requests that require the allocation of a destination
	 * buffer. The size of the destination buffer is double the source
	 * buffer (rounded up to the size of a page) to fit the
	 * decompressed output or an expansion on the data for compression.
	 */
	if (!areq->dst) {
		qat_req->dst.is_null = true;

		dlen = round_up(2 * slen, PAGE_SIZE);
		areq->dst = sgl_alloc(dlen, f, NULL);

[1] https://en.wikipedia.org/wiki/Zip_bomb

Regards,
Herbert Xu Nov. 25, 2022, 9:55 a.m. UTC | #3
On Fri, Nov 25, 2022 at 09:46:04AM +0000, Giovanni Cabiddu wrote:
>
> I wanted a cap on the number of retries. What if the input is a zip
> bomb [1]?

OK, but in that case we should just have a flat limit.  In which
case it also makes no sense to double when retrying (as the limit
is so small), if the first decompression fails, we should go straight
to the limit.

Something like 128K should be sufficient for our current users.
Please put this into include/crypto/acompress.h.

Thanks,
Cabiddu, Giovanni Nov. 25, 2022, 10:04 a.m. UTC | #4
On Fri, Nov 25, 2022 at 05:55:17PM +0800, Herbert Xu wrote:
> On Fri, Nov 25, 2022 at 09:46:04AM +0000, Giovanni Cabiddu wrote:
> >
> > I wanted a cap on the number of retries. What if the input is a zip
> > bomb [1]?
> 
> OK, but in that case we should just have a flat limit.  In which
> case it also makes no sense to double when retrying (as the limit
> is so small), if the first decompression fails, we should go straight
> to the limit.
Just double checking if I understood correctly.
At the first iteration, i.e. first call to decompress, allocate a
destination buffer of 2 * src_len rounded up to 4K. If this job fails
allocate a destination buffer of 128K and retry. If this fails, terminate.

Thanks,
Herbert Xu Nov. 25, 2022, 10:07 a.m. UTC | #5
On Fri, Nov 25, 2022 at 10:04:28AM +0000, Giovanni Cabiddu wrote:
>
> Just double checking if I understood correctly.
> At the first iteration, i.e. first call to decompress, allocate a
> destination buffer of 2 * src_len rounded up to 4K. If this job fails
> allocate a destination buffer of 128K and retry. If this fails, terminate.

Right, that's what I was suggesting.

Oh and please add the limit to include/crypto/acompress.h so that
it's part of the API.

Thanks,
Cabiddu, Giovanni Nov. 25, 2022, 10:14 a.m. UTC | #6
On Fri, Nov 25, 2022 at 06:07:39PM +0800, Herbert Xu wrote:
> On Fri, Nov 25, 2022 at 10:04:28AM +0000, Giovanni Cabiddu wrote:
> >
> > Just double checking if I understood correctly.
> > At the first iteration, i.e. first call to decompress, allocate a
> > destination buffer of 2 * src_len rounded up to 4K. If this job fails
> > allocate a destination buffer of 128K and retry. If this fails, terminate.
> 
> Right, that's what I was suggesting.
> 
> Oh and please add the limit to include/crypto/acompress.h so that
> it's part of the API.
Yes, that is what I was going to do. Is the name of the define ok with
you?

diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index cb3d6b1c655d..bc16e0208169 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -11,6 +11,7 @@
 #include <linux/crypto.h>

 #define CRYPTO_ACOMP_ALLOC_OUTPUT      0x00000001
+#define CRYPTO_ACOMP_NULL_DST_LIMIT    131072

 /**
   * struct acomp_req - asynchronous (de)compression request


Thanks,
Herbert Xu Nov. 25, 2022, 10:20 a.m. UTC | #7
On Fri, Nov 25, 2022 at 10:14:37AM +0000, Giovanni Cabiddu wrote:
>
> diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
> index cb3d6b1c655d..bc16e0208169 100644
> --- a/include/crypto/acompress.h
> +++ b/include/crypto/acompress.h
> @@ -11,6 +11,7 @@
>  #include <linux/crypto.h>
> 
>  #define CRYPTO_ACOMP_ALLOC_OUTPUT      0x00000001
> +#define CRYPTO_ACOMP_NULL_DST_LIMIT    131072

How about simply

	CRYPTO_ACOMP_DST_MAX

Even if it isn't enforced strictly, this is basically the ceiling
on how much effort we are willing to spend on decompressing the
input.

Cheers,
Cabiddu, Giovanni Nov. 25, 2022, 10:23 a.m. UTC | #8
On Fri, Nov 25, 2022 at 06:20:49PM +0800, Herbert Xu wrote:
> On Fri, Nov 25, 2022 at 10:14:37AM +0000, Giovanni Cabiddu wrote:
> >
> > diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
> > index cb3d6b1c655d..bc16e0208169 100644
> > --- a/include/crypto/acompress.h
> > +++ b/include/crypto/acompress.h
> > @@ -11,6 +11,7 @@
> >  #include <linux/crypto.h>
> > 
> >  #define CRYPTO_ACOMP_ALLOC_OUTPUT      0x00000001
> > +#define CRYPTO_ACOMP_NULL_DST_LIMIT    131072
> 
> How about simply
> 
> 	CRYPTO_ACOMP_DST_MAX
> 
> Even if it isn't enforced strictly, this is basically the ceiling
> on how much effort we are willing to spend on decompressing the
> input.
That's perfect.

Thanks,
kernel test robot Nov. 27, 2022, 3:36 a.m. UTC | #9
Hi Giovanni,

I love your patch! Perhaps something to improve:

[auto build test WARNING on herbert-cryptodev-2.6/master]
[also build test WARNING on herbert-crypto-2.6/master linus/master v6.1-rc6 next-20221125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Giovanni-Cabiddu/crypto-qat-enable-compression-deflate-algorithm/20221123-201220
base:   https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
patch link:    https://lore.kernel.org/r/20221123121032.71991-12-giovanni.cabiddu%40intel.com
patch subject: [PATCH v2 11/11] crypto: qat - add resubmit logic for decompression
config: i386-randconfig-m021
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

smatch warnings:
drivers/crypto/qat/qat_common/qat_comp_algs.c:89 qat_comp_resubmit() warn: inconsistent indenting

vim +89 drivers/crypto/qat/qat_common/qat_comp_algs.c

    62	
    63	static void qat_comp_resubmit(struct work_struct *work)
    64	{
    65		struct qat_compression_req *qat_req =
    66			container_of(work, struct qat_compression_req, resubmit);
    67		struct qat_compression_ctx *ctx = qat_req->qat_compression_ctx;
    68		struct adf_accel_dev *accel_dev = ctx->inst->accel_dev;
    69		struct qat_request_buffs *qat_bufs = &qat_req->buf;
    70		struct qat_compression_instance *inst = ctx->inst;
    71		struct acomp_req *areq = qat_req->acompress_req;
    72		struct crypto_acomp *tfm = crypto_acomp_reqtfm(areq);
    73		unsigned int dlen = qat_req->dst.total_dlen;
    74		int retries = ++qat_req->dst.retries;
    75		u8 *req = qat_req->req;
    76		int ret = -EINVAL;
    77		dma_addr_t dfbuf;
    78	
    79		dev_dbg(&GET_DEV(accel_dev), "[%s][%s] retry NULL dst request - retries = %d dlen = %d\n",
    80			crypto_tfm_alg_driver_name(crypto_acomp_tfm(tfm)),
    81			qat_req->dir == COMPRESSION ? "comp" : "decomp", retries,
    82			dlen * 2);
    83	
    84		if (retries > MAX_NULL_DST_RETRIES) {
    85			dev_dbg(&GET_DEV(accel_dev),
    86				"[%s] exceeded max number of retries for NULL dst %s request\n",
    87				crypto_tfm_alg_driver_name(crypto_acomp_tfm(tfm)),
    88				qat_req->dir == COMPRESSION ? "comp" : "decomp");
  > 89				ret = -EOVERFLOW;
    90				goto err;
    91		}
    92	
    93		ret = qat_bl_realloc_map_new_dst(accel_dev, &areq->dst, &dlen, qat_bufs,
    94						 qat_algs_alloc_flags(&areq->base));
    95		if (ret)
    96			goto err;
    97	
    98		qat_req->dst.total_dlen = dlen;
    99	
   100		dfbuf = qat_req->buf.bloutp;
   101		qat_comp_override_dst(req, dfbuf, dlen);
   102	
   103		ret = qat_alg_send_dc_message(qat_req, inst, &areq->base);
   104		if (ret != -ENOSPC)
   105			return;
   106	
   107	err:
   108		qat_bl_free_bufl(accel_dev, qat_bufs);
   109		areq->base.complete(&areq->base, ret);
   110	}
   111
diff mbox series

Patch

diff --git a/drivers/crypto/qat/qat_common/qat_bl.c b/drivers/crypto/qat/qat_common/qat_bl.c
index 221a4eb610a3..517938b2e211 100644
--- a/drivers/crypto/qat/qat_common/qat_bl.c
+++ b/drivers/crypto/qat/qat_common/qat_bl.c
@@ -222,3 +222,230 @@  int qat_bl_sgl_to_bufl(struct adf_accel_dev *accel_dev,
 				    extra_dst_buff, sz_extra_dst_buff,
 				    flags);
 }
+
+static void qat_bl_sgl_unmap(struct adf_accel_dev *accel_dev,
+			     struct qat_alg_buf_list *bl)
+{
+	struct device *dev = &GET_DEV(accel_dev);
+	int n = bl->num_bufs;
+	int i;
+
+	for (i = 0; i < n; i++)
+		if (!dma_mapping_error(dev, bl->bufers[i].addr))
+			dma_unmap_single(dev, bl->bufers[i].addr,
+					 bl->bufers[i].len, DMA_FROM_DEVICE);
+
+	kfree(bl);
+}
+
+static int qat_bl_sgl_map(struct adf_accel_dev *accel_dev,
+			  struct scatterlist *sgl,
+			  struct qat_alg_buf_list **bl)
+{
+	struct device *dev = &GET_DEV(accel_dev);
+	struct qat_alg_buf_list *bufl;
+	int node = dev_to_node(dev);
+	struct scatterlist *sg;
+	int n, i, sg_nctr;
+	size_t sz;
+
+	n = sg_nents(sgl);
+	sz = struct_size(bufl, bufers, n);
+	bufl = kzalloc_node(sz, GFP_KERNEL, node);
+	if (unlikely(!bufl))
+		return -ENOMEM;
+
+	for (i = 0; i < n; i++)
+		bufl->bufers[i].addr = DMA_MAPPING_ERROR;
+
+	sg_nctr = 0;
+	for_each_sg(sgl, sg, n, i) {
+		int y = sg_nctr;
+
+		if (!sg->length)
+			continue;
+
+		bufl->bufers[y].addr = dma_map_single(dev, sg_virt(sg),
+						      sg->length,
+						      DMA_FROM_DEVICE);
+		bufl->bufers[y].len = sg->length;
+		if (unlikely(dma_mapping_error(dev, bufl->bufers[y].addr)))
+			goto err_map;
+		sg_nctr++;
+	}
+	bufl->num_bufs = sg_nctr;
+	bufl->num_mapped_bufs = sg_nctr;
+
+	*bl = bufl;
+
+	return 0;
+
+err_map:
+	for (i = 0; i < n; i++)
+		if (!dma_mapping_error(dev, bufl->bufers[i].addr))
+			dma_unmap_single(dev, bufl->bufers[i].addr,
+					 bufl->bufers[i].len,
+					 DMA_FROM_DEVICE);
+	kfree(bufl);
+	*bl = NULL;
+
+	return -ENOMEM;
+}
+
+static void qat_bl_sgl_free_unmap(struct adf_accel_dev *accel_dev,
+				  struct scatterlist *sgl,
+				  struct qat_alg_buf_list *bl)
+{
+	if (bl)
+		qat_bl_sgl_unmap(accel_dev, bl);
+	if (sgl)
+		sgl_free(sgl);
+}
+
+static int qat_bl_sgl_alloc_map(struct adf_accel_dev *accel_dev,
+				struct scatterlist **sgl,
+				struct qat_alg_buf_list **bl,
+				unsigned int dlen,
+				gfp_t gfp)
+{
+	struct scatterlist *dst;
+	int ret;
+
+	dst = sgl_alloc(dlen, gfp, NULL);
+	if (!dst) {
+		dev_err(&GET_DEV(accel_dev), "sg_alloc failed\n");
+		return -ENOMEM;
+	}
+
+	ret = qat_bl_sgl_map(accel_dev, dst, bl);
+	if (ret)
+		goto err;
+
+	*sgl = dst;
+
+	return 0;
+
+err:
+	sgl_free(dst);
+	*sgl = NULL;
+	return ret;
+}
+
+static void qat_bl_merge_fw_sgl(struct qat_alg_buf_list *out,
+				struct qat_alg_buf_list *in_one,
+				struct qat_alg_buf_list *in_two)
+{
+	size_t cpy_sz;
+	void *s, *d;
+
+	cpy_sz = in_one->num_bufs * sizeof(struct qat_alg_buf);
+	s = in_one->bufers;
+	d = out->bufers;
+	memcpy(d, s, cpy_sz);
+
+	s = in_two->bufers;
+	d += cpy_sz;
+	cpy_sz = in_two->num_bufs * sizeof(struct qat_alg_buf);
+	memcpy(d, s, cpy_sz);
+
+	out->num_bufs = in_one->num_bufs + in_two->num_bufs;
+	out->num_mapped_bufs = in_one->num_mapped_bufs + in_two->num_mapped_bufs;
+}
+
+static void qat_bl_merge_scatterlists(struct scatterlist *out,
+				      struct scatterlist *in_one,
+				      struct scatterlist *in_two)
+{
+	struct scatterlist *dst = out;
+	int i, nents_one, nents_two;
+	struct scatterlist *sg;
+
+	nents_one = sg_nents(in_one);
+	nents_two = sg_nents(in_two);
+
+	sg_init_table(dst, nents_one + nents_two);
+
+	for_each_sg(in_one, sg, nents_one, i) {
+		sg_set_page(dst, sg_page(sg), sg->length, 0);
+		dst = sg_next(dst);
+	}
+
+	for_each_sg(in_two, sg, nents_two, i) {
+		sg_set_page(dst, sg_page(sg), sg->length, 0);
+		dst = sg_next(dst);
+	}
+}
+
+int qat_bl_realloc_map_new_dst(struct adf_accel_dev *accel_dev,
+			       struct scatterlist **sg,
+			       unsigned int *dlen,
+			       struct qat_request_buffs *qat_bufs,
+			       gfp_t gfp)
+{
+	struct qat_alg_buf_list *tmp_bl, *dst_bl, *old_bl;
+	struct device *dev = &GET_DEV(accel_dev);
+	dma_addr_t dst_blp = DMA_MAPPING_ERROR;
+	struct scatterlist *dst = NULL;
+	struct scatterlist *tmp_dst;
+	int node = dev_to_node(dev);
+	size_t dst_bl_size;
+	int ret, nents;
+
+	ret = qat_bl_sgl_alloc_map(accel_dev, &tmp_dst, &tmp_bl, *dlen, gfp);
+	if (ret)
+		return ret;
+
+	old_bl = qat_bufs->blout;
+	nents = tmp_bl->num_bufs + old_bl->num_bufs;
+	dst_bl_size = struct_size(dst_bl, bufers, nents);
+
+	/* Allocate new FW SGL descriptor */
+	dst_bl = kzalloc_node(dst_bl_size, gfp, node);
+	if (!dst_bl)
+		goto err;
+
+	dst_blp = dma_map_single(dev, dst_bl, dst_bl_size, DMA_TO_DEVICE);
+	if (unlikely(dma_mapping_error(dev, dst_blp)))
+		goto err;
+
+	/* Allocate new scatter list */
+	dst = kmalloc_array(nents, sizeof(struct scatterlist), gfp);
+	if (!dst)
+		goto err;
+
+	/* Create new FW SGL and scatterlist as composition of the old one
+	 * and the newly allocated one
+	 */
+	qat_bl_merge_fw_sgl(dst_bl, tmp_bl, old_bl);
+	qat_bl_merge_scatterlists(dst, tmp_dst, *sg);
+
+	/* Unmap old firmware SGL descriptor */
+	dma_unmap_single(dev, qat_bufs->bloutp, qat_bufs->sz_out, DMA_TO_DEVICE);
+
+	/* Free temporary FW sgl descriptor */
+	kfree(tmp_bl);
+
+	if (!qat_bufs->sgl_dst_valid)
+		kfree(qat_bufs->blout);
+	qat_bufs->sgl_dst_valid = false;
+	qat_bufs->blout = dst_bl;
+	qat_bufs->bloutp = dst_blp;
+	qat_bufs->sz_out = dst_bl_size;
+
+	/* Free old scatterlist and return newly created one */
+	kfree(*sg);
+	*sg = dst;
+
+	*dlen *= 2;
+
+	return 0;
+err:
+	qat_bl_sgl_free_unmap(accel_dev, tmp_dst, tmp_bl);
+	kfree(dst_bl);
+	kfree(dst);
+
+	if (!dma_mapping_error(dev, dst_blp))
+		dma_unmap_single(dev, dst_blp, dst_bl_size, DMA_TO_DEVICE);
+
+	return -ENOMEM;
+}
diff --git a/drivers/crypto/qat/qat_common/qat_bl.h b/drivers/crypto/qat/qat_common/qat_bl.h
index 5f2ea8f352f7..bfe9ed62b174 100644
--- a/drivers/crypto/qat/qat_common/qat_bl.h
+++ b/drivers/crypto/qat/qat_common/qat_bl.h
@@ -58,4 +58,10 @@  static inline gfp_t qat_algs_alloc_flags(struct crypto_async_request *req)
 	return req->flags & CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL : GFP_ATOMIC;
 }
 
+int qat_bl_realloc_map_new_dst(struct adf_accel_dev *accel_dev,
+			       struct scatterlist **newd,
+			       unsigned int *dlen,
+			       struct qat_request_buffs *qat_bufs,
+			       gfp_t gfp);
+
 #endif
diff --git a/drivers/crypto/qat/qat_common/qat_comp_algs.c b/drivers/crypto/qat/qat_common/qat_comp_algs.c
index bdc5107b359b..9099d6c3f74b 100644
--- a/drivers/crypto/qat/qat_common/qat_comp_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_comp_algs.c
@@ -5,6 +5,7 @@ 
 #include <crypto/internal/acompress.h>
 #include <crypto/scatterwalk.h>
 #include <linux/dma-mapping.h>
+#include <linux/workqueue.h>
 #include "adf_accel_devices.h"
 #include "adf_common_drv.h"
 #include "qat_bl.h"
@@ -25,6 +26,14 @@  struct qat_compression_ctx {
 	struct qat_compression_instance *inst;
 };
 
+#define MAX_NULL_DST_RETRIES 5
+
+struct qat_dst {
+	bool is_null;
+	int retries;
+	unsigned int total_dlen;
+};
+
 struct qat_compression_req {
 	u8 req[QAT_COMP_REQ_SIZE];
 	struct qat_compression_ctx *qat_compression_ctx;
@@ -33,6 +42,8 @@  struct qat_compression_req {
 	enum direction dir;
 	int actual_dlen;
 	struct qat_alg_req alg_req;
+	struct work_struct resubmit;
+	struct qat_dst dst;
 };
 
 static int qat_alg_send_dc_message(struct qat_compression_req *qat_req,
@@ -49,6 +60,55 @@  static int qat_alg_send_dc_message(struct qat_compression_req *qat_req,
 	return qat_alg_send_message(alg_req);
 }
 
+static void qat_comp_resubmit(struct work_struct *work)
+{
+	struct qat_compression_req *qat_req =
+		container_of(work, struct qat_compression_req, resubmit);
+	struct qat_compression_ctx *ctx = qat_req->qat_compression_ctx;
+	struct adf_accel_dev *accel_dev = ctx->inst->accel_dev;
+	struct qat_request_buffs *qat_bufs = &qat_req->buf;
+	struct qat_compression_instance *inst = ctx->inst;
+	struct acomp_req *areq = qat_req->acompress_req;
+	struct crypto_acomp *tfm = crypto_acomp_reqtfm(areq);
+	unsigned int dlen = qat_req->dst.total_dlen;
+	int retries = ++qat_req->dst.retries;
+	u8 *req = qat_req->req;
+	int ret = -EINVAL;
+	dma_addr_t dfbuf;
+
+	dev_dbg(&GET_DEV(accel_dev), "[%s][%s] retry NULL dst request - retries = %d dlen = %d\n",
+		crypto_tfm_alg_driver_name(crypto_acomp_tfm(tfm)),
+		qat_req->dir == COMPRESSION ? "comp" : "decomp", retries,
+		dlen * 2);
+
+	if (retries > MAX_NULL_DST_RETRIES) {
+		dev_dbg(&GET_DEV(accel_dev),
+			"[%s] exceeded max number of retries for NULL dst %s request\n",
+			crypto_tfm_alg_driver_name(crypto_acomp_tfm(tfm)),
+			qat_req->dir == COMPRESSION ? "comp" : "decomp");
+			ret = -EOVERFLOW;
+			goto err;
+	}
+
+	ret = qat_bl_realloc_map_new_dst(accel_dev, &areq->dst, &dlen, qat_bufs,
+					 qat_algs_alloc_flags(&areq->base));
+	if (ret)
+		goto err;
+
+	qat_req->dst.total_dlen = dlen;
+
+	dfbuf = qat_req->buf.bloutp;
+	qat_comp_override_dst(req, dfbuf, dlen);
+
+	ret = qat_alg_send_dc_message(qat_req, inst, &areq->base);
+	if (ret != -ENOSPC)
+		return;
+
+err:
+	qat_bl_free_bufl(accel_dev, qat_bufs);
+	areq->base.complete(&areq->base, ret);
+}
+
 static void qat_comp_generic_callback(struct qat_compression_req *qat_req,
 				      void *resp)
 {
@@ -79,6 +139,14 @@  static void qat_comp_generic_callback(struct qat_compression_req *qat_req,
 		status ? "ERR" : "OK ",
 		consumed, produced, cmp_err, xlt_err);
 
+	if (qat_req->dir == DECOMPRESSION && qat_req->dst.is_null) {
+		if (cmp_err == ERR_CODE_OVERFLOW_ERROR) {
+			INIT_WORK(&qat_req->resubmit, qat_comp_resubmit);
+			adf_misc_wq_queue_work(&qat_req->resubmit);
+			return;
+		}
+	}
+
 	if (unlikely(status != ICP_QAT_FW_COMN_STATUS_FLAG_OK))
 		goto end;
 
@@ -175,16 +243,24 @@  static int qat_comp_alg_compress_decompress(struct acomp_req *areq,
 	if (areq->dst && !dlen)
 		return -EINVAL;
 
+	qat_req->dst.is_null = false;
+
 	/* Handle acomp requests that require the allocation of a destination
 	 * buffer. The size of the destination buffer is double the source
 	 * buffer (rounded up to the size of a page) to fit the decompressed
 	 * output or an expansion on the data for compression.
 	 */
 	if (!areq->dst) {
+		qat_req->dst.is_null = true;
+
 		dlen = round_up(2 * slen, PAGE_SIZE);
 		areq->dst = sgl_alloc(dlen, f, NULL);
 		if (!areq->dst)
 			return -ENOMEM;
+
+		areq->dlen = dlen;
+		qat_req->dst.retries = 0;
+		qat_req->dst.total_dlen = dlen;
 	}
 
 	if (dir == COMPRESSION) {
diff --git a/drivers/crypto/qat/qat_common/qat_comp_req.h b/drivers/crypto/qat/qat_common/qat_comp_req.h
index 18a1f33a6db9..404e32c5e778 100644
--- a/drivers/crypto/qat/qat_common/qat_comp_req.h
+++ b/drivers/crypto/qat/qat_common/qat_comp_req.h
@@ -25,6 +25,16 @@  static inline void qat_comp_create_req(void *ctx, void *req, u64 src, u32 slen,
 	req_pars->out_buffer_sz = dlen;
 }
 
+static inline void qat_comp_override_dst(void *req, u64 dst, u32 dlen)
+{
+	struct icp_qat_fw_comp_req *fw_req = req;
+	struct icp_qat_fw_comp_req_params *req_pars = &fw_req->comp_pars;
+
+	fw_req->comn_mid.dest_data_addr = dst;
+	fw_req->comn_mid.dst_length = dlen;
+	req_pars->out_buffer_sz = dlen;
+}
+
 static inline void qat_comp_create_compression_req(void *ctx, void *req,
 						   u64 src, u32 slen,
 						   u64 dst, u32 dlen,