RDMA/rxe: Ratelimit error messages of read_reply()

Message ID 20220825110255.658706-1-matsuda-daisuke@fujitsu.com (mailing list archive)
State Changes Requested
Delegated to: Jason Gunthorpe

Commit Message

Daisuke Matsuda (Fujitsu) Aug. 25, 2022, 11:02 a.m. UTC
When responder cannot copy data from a user MR, error messages overflow.
This is because an incoming RDMA Read request can result in multiple Read
responses. If the target MR is somehow unavailable, then the error message
is generated for every Read response.

For the same reason, the error message for packet transmission should also
be ratelimited.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
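
To illustrate the effect described in the commit message, here is a minimal
userspace sketch in C; it is not kernel code. It assumes that every Read
response of one request fails at the same instant, and it models rate limiting
with a simple interval/burst window. The names struct ratelimit,
ratelimit_allow(), read_responses() and failures_logged() are invented for this
example. The kernel's actual pr_err_ratelimited() machinery differs in detail,
although its default limit is likewise 10 messages per 5 seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for an interval/burst rate limiter. */
struct ratelimit {
	uint64_t interval_ms;   /* length of one window */
	unsigned int burst;     /* messages allowed per window */
	uint64_t window_start;  /* start of the current window, in ms */
	unsigned int printed;   /* messages emitted in the current window */
	unsigned int missed;    /* messages suppressed in the current window */
};

/* Returns true if a message may be emitted at time now_ms. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now_ms)
{
	/* Open a fresh window once the previous one has expired. */
	if (now_ms - rl->window_start >= rl->interval_ms) {
		rl->window_start = now_ms;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}

/* Number of Read response packets needed to return len bytes. */
static unsigned int read_responses(uint32_t len, uint32_t mtu)
{
	assert(len > 0 && mtu > 0);
	/* Round up in 64-bit arithmetic to avoid overflow. */
	return (unsigned int)(((uint64_t)len + mtu - 1) / mtu);
}

/*
 * Counts the messages emitted when every response of one Read request
 * fails at the same instant. Pass rl == NULL to model a plain pr_err().
 */
static unsigned int failures_logged(uint32_t len, uint32_t mtu,
				    struct ratelimit *rl)
{
	unsigned int n = read_responses(len, mtu);
	unsigned int logged = 0;

	for (unsigned int i = 0; i < n; i++)
		if (rl == NULL || ratelimit_allow(rl, 0))
			logged++;
	return logged;
}
```

For a hypothetical 1 MiB Read request with a 4096-byte MTU, read_responses()
yields 256 responses. Without a limiter, failures_logged(1048576, 4096, NULL)
counts 256 messages. With interval_ms = 5000 and burst = 10, it counts 10 and
leaves 246 in the limiter's missed field.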

Comments

Bob Pearson Aug. 25, 2022, 3:55 p.m. UTC | #1
On 8/25/22 06:02, Daisuke Matsuda wrote:
> When responder cannot copy data from a user MR, error messages overflow.
> This is because an incoming RDMA Read request can result in multiple Read
> responses. If the target MR is somehow unavailable, then the error message
> is generated for every Read response.
> 
> For the same reason, the error message for packet transmission should also
> be ratelimited.
> 
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
> index b36ec5c4d5e0..f9e9679b5e32 100644
> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> @@ -812,7 +812,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
>  	err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
>  			  payload, RXE_FROM_MR_OBJ);
>  	if (err)
> -		pr_err("Failed copying memory\n");
> +		pr_err_ratelimited("Failed copying memory\n");
>  	if (mr)
>  		rxe_put(mr);
>  
> @@ -824,7 +824,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
>  
>  	err = rxe_xmit_packet(qp, &ack_pkt, skb);
>  	if (err) {
> -		pr_err("Failed sending RDMA reply.\n");
> +		pr_err_ratelimited("Failed sending RDMA reply.\n");
>  		return RESPST_ERR_RNR;
>  	}
>  

Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Jason Gunthorpe Aug. 26, 2022, 12:28 p.m. UTC | #2
On Thu, Aug 25, 2022 at 08:02:55PM +0900, Daisuke Matsuda wrote:
> When responder cannot copy data from a user MR, error messages overflow.
> This is because an incoming RDMA Read request can result in multiple Read
> responses. If the target MR is somehow unavailable, then the error message
> is generated for every Read response.
> 
> For the same reason, the error message for packet transmission should also
> be ratelimited.
> 
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

These lines should be deleted; network packets should never trigger
printing.

Jason
Daisuke Matsuda (Fujitsu) Aug. 29, 2022, 5:16 a.m. UTC | #3
On Friday, August 26, 2022 9:28 PM, Jason Gunthorpe wrote:
> On Thu, Aug 25, 2022 at 08:02:55PM +0900, Daisuke Matsuda wrote:
> > When responder cannot copy data from a user MR, error messages overflow.
> > This is because an incoming RDMA Read request can result in multiple Read
> > responses. If the target MR is somehow unavailable, then the error message
> > is generated for every Read response.
> >
> > For the same reason, the error message for packet transmission should also
> > be ratelimited.
> >
> > Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> > ---
> >  drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> These lines should be deleted; network packets should never trigger
> printing.
> 
> Jason

Okay. I will post another patch to do that.

I wonder if we should also delete some messages in rxe_rcv() and its callees.
It seems some of them can be triggered by packets from an arbitrary client
even when there is no established connection between the requesting and
responding nodes.
As far as I know, the message below can cause a message overflow.
=====
static int hdr_check(struct rxe_pkt_info *pkt)
{
~~~~~
        if (qpn != IB_MULTICAST_QPN) {
                index = (qpn == 1) ? port->qp_gsi_index : qpn;

                qp = rxe_pool_get_index(&rxe->qp_pool, index);
                if (unlikely(!qp)) {
                        pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);
                        goto err1;
                }
=====

Daisuke Matsuda

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index b36ec5c4d5e0..f9e9679b5e32 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -812,7 +812,7 @@  static enum resp_states read_reply(struct rxe_qp *qp,
 	err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
 			  payload, RXE_FROM_MR_OBJ);
 	if (err)
-		pr_err("Failed copying memory\n");
+		pr_err_ratelimited("Failed copying memory\n");
 	if (mr)
 		rxe_put(mr);
 
@@ -824,7 +824,7 @@  static enum resp_states read_reply(struct rxe_qp *qp,
 
 	err = rxe_xmit_packet(qp, &ack_pkt, skb);
 	if (err) {
-		pr_err("Failed sending RDMA reply.\n");
+		pr_err_ratelimited("Failed sending RDMA reply.\n");
 		return RESPST_ERR_RNR;
 	}