
List:       linux-rdma
Subject:    Re: [PATCH for-next 10/13] RDMA/hns: Remove unnecessary kzalloc
From:       Jason Gunthorpe <jgg () ziepe ! ca>
Date:       2019-07-31 13:02:25
Message-ID: 20190731130225.GE3946 () ziepe ! ca

On Wed, Jul 31, 2019 at 12:59:01PM +0300, Leon Romanovsky wrote:
> On Wed, Jul 31, 2019 at 05:16:38PM +0800, c00284828 wrote:
> > 
> > > On 2019/7/31 15:49, Leon Romanovsky wrote:
> > > On Wed, Jul 31, 2019 at 10:43:01AM +0800, oulijun wrote:
> > > > > On 2019/7/30 21:40, Leon Romanovsky wrote:
> > > > > On Tue, Jul 30, 2019 at 04:56:47PM +0800, Lijun Ou wrote:
> > > > > > From: Lang Cheng <chenglang@huawei.com>
> > > > > > 
> > > > > > For hns_roce_v2_query_qp and hns_roce_v2_modify_qp,
> > > > > > we can use stack memory to create qp context data.
> > > > > > Make the code simpler.
> > > > > > 
> > > > > > Signed-off-by: Lang Cheng <chenglang@huawei.com>
> > > > > >  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 64 +++++++++++++-----------------
> > > > > >  1 file changed, 27 insertions(+), 37 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> > > > > > index 1186e34..07ddfae 100644
> > > > > > --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> > > > > > +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> > > > > > @@ -4288,22 +4288,19 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
> > > > > >  {
> > > > > > 	struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
> > > > > > 	struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
> > > > > > -	struct hns_roce_v2_qp_context *context;
> > > > > > -	struct hns_roce_v2_qp_context *qpc_mask;
> > > > > > +	struct hns_roce_v2_qp_context ctx[2];
> > > > > > +	struct hns_roce_v2_qp_context *context = ctx;
> > > > > > +	struct hns_roce_v2_qp_context *qpc_mask = ctx + 1;
> > > > > > 	struct device *dev = hr_dev->dev;
> > > > > > 	int ret;
> > > > > > 
> > > > > > -	context = kcalloc(2, sizeof(*context), GFP_ATOMIC);
> > > > > > -	if (!context)
> > > > > > -		return -ENOMEM;
> > > > > > -
> > > > > > -	qpc_mask = context + 1;
> > > > > > 	/*
> > > > > > 	 * In v2 engine, software pass context and context mask to hardware
> > > > > > 	 * when modifying qp. If software need modify some fields in context,
> > > > > > 	 * we should set all bits of the relevant fields in context mask to
> > > > > > 	 * 0 at the same time, else set them to 0x1.
> > > > > > 	 */
> > > > > > +	memset(context, 0, sizeof(*context));
> > > > > "struct hns_roce_v2_qp_context ctx[2] = {};" will do the trick.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > .
> > > > In this case, the mask is actually written twice. If you do this, will it
> > > > bring extra overhead when modifying qp?
> > > I don't understand first part of your sentence, and "no" to second part.
> > > 
> > > Thanks
> > 
> > +	struct hns_roce_v2_qp_context ctx[2];
> > +	struct hns_roce_v2_qp_context *context = ctx;
> > +	struct hns_roce_v2_qp_context *qpc_mask = ctx + 1;
> > ...
> > +	memset(context, 0, sizeof(*context));
> > 	memset(qpc_mask, 0xff, sizeof(*qpc_mask));
> > 
> > ctx[2] = {} does look simpler, just like another function in the patch.
> > But ctx[1] will be written to 0 before being written to 0xff; is that
> > faster than two memsets?
> 
> Are you seriously talking about this micro optimization while you have mailbox
> allocation in your modify_qp function?

In any event, the compiler eliminates duplicate stores in a lot of cases
like this.

Jason


