Message ID | YXlxhpZWf2mxJaMi@fedora (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4] media: rc: pwm-ir-tx: Switch to atomic PWM API | expand |
On Wed, Oct 27, 2021 at 12:34:30PM -0300, Maíra Canal wrote: > Remove legacy PWM interface (pwm_config, pwm_enable, pwm_disable) and > replace it for the atomic PWM API. > > Signed-off-by: Maíra Canal <maira.canal@usp.br> > Reported-by: kernel test robot <lkp@intel.com> While it's true that he kernel robot told you about a problem in an earlier revision, adding the tag here is misleading, because in the end you only see the tag in the commit history, and there is suggests that the commit fixes something that was reported in the kernel tree before. For this reason I usually only add a thanks after the tripple dash. Also note this nitpick: Your S-o-b should always be the last thing. > --- > V1 -> V2: Assign variables directly and simplify conditional statement > V2 -> V3: Fix declaration of undeclared variables > V3 -> V4: Fix DIV_ROUND_CLOSEST error with u64 variables > --- > drivers/media/rc/pwm-ir-tx.c | 18 +++++++++--------- > 1 file changed, 9 insertions(+), 9 deletions(-) > > diff --git a/drivers/media/rc/pwm-ir-tx.c b/drivers/media/rc/pwm-ir-tx.c > index 4bc28d2c9cc9..105a9c24f1e3 100644 > --- a/drivers/media/rc/pwm-ir-tx.c > +++ b/drivers/media/rc/pwm-ir-tx.c > @@ -53,22 +53,21 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, > { > struct pwm_ir *pwm_ir = dev->priv; > struct pwm_device *pwm = pwm_ir->pwm; > - int i, duty, period; > + struct pwm_state state; > + int i; > ktime_t edge; > long delta; > > - period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier); > - duty = DIV_ROUND_CLOSEST(pwm_ir->duty_cycle * period, 100); > + pwm_init_state(pwm, &state); > > - pwm_config(pwm, duty, period); > + state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier); > + pwm_set_relative_duty_cycle(&state, pwm_ir->duty_cycle, 100); > > edge = ktime_get(); > > for (i = 0; i < count; i++) { > - if (i % 2) // space > - pwm_disable(pwm); > - else > - pwm_enable(pwm); > + state.enabled = !(i % 2); > + pwm_apply_state(pwm, &state); > > edge = ktime_add_us(edge, txbuf[i]); > delta = ktime_us_delta(edge, ktime_get()); > @@ -76,7 +75,8 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, > usleep_range(delta, delta + 10); > } > > - pwm_disable(pwm); > + state.enabled = false; > + pwm_apply_state(pwm, &state); > > return count; > } The conversion is right (I think), note this could be optimized a bit further: state.period only depends on carrier which rarely changes, so the calculation could be done in pwm_ir_set_carrier(). Ditto for duty which only depends on state.period and pwm_ir->duty_cycle. (This is for a separate commit though.) Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Thanks Uwe
On Thu, Oct 28, 2021 at 08:45:13AM +0200, Uwe Kleine-König wrote: > The conversion is right (I think), We still have the problem that the pwm drivers calculate the period incorrectly by rounding down (except pwm-bcm2835). So the period is not as good as it could be in most cases, but this driver can't do anything about that. > note this could be optimized a bit > further: state.period only depends on carrier which rarely changes, so > the calculation could be done in pwm_ir_set_carrier(). Ditto for duty > which only depends on state.period and pwm_ir->duty_cycle. (This is for > a separate commit though.) I'm not sure what caching this is much of a win. The calculation is a few instructions, so you're not winning in the way of speed. On the flip side you use more memory since pwm_state has to be kmalloc() rather than existing just on the stack, and both ioctl handlers and the probe function need to recalculate the period/duty cycle, so there is a slight increase in code size. This change does not improve anything measurably and only increases code complexity. > Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Thanks for your review. Sean
Hello Sean, On Thu, Oct 28, 2021 at 10:14:42AM +0100, Sean Young wrote: > On Thu, Oct 28, 2021 at 08:45:13AM +0200, Uwe Kleine-König wrote: > > The conversion is right (I think), > > We still have the problem that the pwm drivers calculate the period > incorrectly by rounding down (except pwm-bcm2835). So the period is not > as good as it could be in most cases, but this driver can't do anything > about that. Yeah, some time ago I started coding a round_state function (wip at https://git.pengutronix.de/cgit/ukl/linux/commit/?h=pwm-wip&id=ae348eb6a55d6526f30ef4a49819197d9616391e) but this was pushed down on my todo-list by more important stuff. If you want to experiment with that ... > > note this could be optimized a bit > > further: state.period only depends on carrier which rarely changes, so > > the calculation could be done in pwm_ir_set_carrier(). Ditto for duty > > which only depends on state.period and pwm_ir->duty_cycle. (This is for > > a separate commit though.) > > I'm not sure what caching this is much of a win. The calculation is a few > instructions, so you're not winning in the way of speed. On the flip side > you use more memory since pwm_state has to be kmalloc() rather than existing I tested a bit with this patch on top of Maíra's: diff --git a/drivers/media/rc/pwm-ir-tx.c b/drivers/media/rc/pwm-ir-tx.c index 105a9c24f1e3..7585c21775bc 100644 --- a/drivers/media/rc/pwm-ir-tx.c +++ b/drivers/media/rc/pwm-ir-tx.c @@ -17,7 +17,7 @@ struct pwm_ir { struct pwm_device *pwm; - unsigned int carrier; + struct pwm_state state; unsigned int duty_cycle; }; @@ -32,6 +32,7 @@ static int pwm_ir_set_duty_cycle(struct rc_dev *dev, u32 duty_cycle) struct pwm_ir *pwm_ir = dev->priv; pwm_ir->duty_cycle = duty_cycle; + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100); return 0; } @@ -43,7 +44,8 @@ static int pwm_ir_set_carrier(struct rc_dev *dev, u32 carrier) if (!carrier) return -EINVAL; - pwm_ir->carrier = carrier; + pwm_ir->state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, carrier); + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100); return 0; } @@ -53,21 +55,15 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, { struct pwm_ir *pwm_ir = dev->priv; struct pwm_device *pwm = pwm_ir->pwm; - struct pwm_state state; int i; ktime_t edge; long delta; - pwm_init_state(pwm, &state); - - state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier); - pwm_set_relative_duty_cycle(&state, pwm_ir->duty_cycle, 100); - edge = ktime_get(); for (i = 0; i < count; i++) { - state.enabled = !(i % 2); - pwm_apply_state(pwm, &state); + pwm_ir->state.enabled = !(i % 2); + pwm_apply_state(pwm, &pwm_ir->state); edge = ktime_add_us(edge, txbuf[i]); delta = ktime_us_delta(edge, ktime_get()); @@ -75,8 +71,8 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, usleep_range(delta, delta + 10); } - state.enabled = false; - pwm_apply_state(pwm, &state); + pwm_ir->state.enabled = false; + pwm_apply_state(pwm, &pwm_ir->state); return count; } @@ -95,8 +91,9 @@ static int pwm_ir_probe(struct platform_device *pdev) if (IS_ERR(pwm_ir->pwm)) return PTR_ERR(pwm_ir->pwm); - pwm_ir->carrier = 38000; - pwm_ir->duty_cycle = 50; + pwm_ir->state.duty_cycle = 50; + pwm_init_state(pwm_ir->pwm, &pwm_ir->state); + pwm_ir_set_carrier(rcdev, 38000); rcdev = devm_rc_allocate_device(&pdev->dev, RC_DRIVER_IR_RAW_TX); if (!rcdev) bloat-o-meter reports (for an arm allmodconfig build) add/remove: 0/0 grow/shrink: 3/1 up/down: 644/-396 (248) Function old new delta pwm_ir_probe 372 676 +304 pwm_ir_set_carrier 108 292 +184 pwm_ir_set_duty_cycle 68 224 +156 pwm_ir_tx 908 512 -396 Total: Before=2302, After=2550, chg +10.77% struct pwm_ir increases from 12 bytes to 40 bytes. The stack space required by pwm_ir_tx decreases from 60 to 36 I don't know exactly how kmalloc works internally. Maybe allocating a structure of size 40 bytes doesn't need more memory than a structure of size 12? I didn't check how runtimes change, but the size decrease of pwm_ir_tx() is nice and might save a bit of runtime. Probably it depends on your focus if this change is good for you or not. > just on the stack, and both ioctl handlers and the probe function need to > recalculate the period/duty cycle, so there is a slight increase in code size. > > This change does not improve anything measurably and only increases code > complexity. You did measure? Best regards Uwe
Hi Uwe, On Thu, Oct 28, 2021 at 01:15:35PM +0200, Uwe Kleine-König wrote: > On Thu, Oct 28, 2021 at 10:14:42AM +0100, Sean Young wrote: > > On Thu, Oct 28, 2021 at 08:45:13AM +0200, Uwe Kleine-König wrote: > > > The conversion is right (I think), > > > > We still have the problem that the pwm drivers calculate the period > > incorrectly by rounding down (except pwm-bcm2835). So the period is not > > as good as it could be in most cases, but this driver can't do anything > > about that. > > Yeah, some time ago I started coding a round_state function > (wip at > https://git.pengutronix.de/cgit/ukl/linux/commit/?h=pwm-wip&id=ae348eb6a55d6526f30ef4a49819197d9616391e) > but this was pushed down on my todo-list by more important stuff. That looks great, thank you for working on that! > If you want to experiment with that ... I will have a look. > > > note this could be optimized a bit > > > further: state.period only depends on carrier which rarely changes, so > > > the calculation could be done in pwm_ir_set_carrier(). Ditto for duty > > > which only depends on state.period and pwm_ir->duty_cycle. (This is for > > > a separate commit though.) > > > > I'm not sure what caching this is much of a win. The calculation is a few > > instructions, so you're not winning in the way of speed. On the flip side > > you use more memory since pwm_state has to be kmalloc() rather than existing > > I tested a bit with this patch on top of Maíra's: > > diff --git a/drivers/media/rc/pwm-ir-tx.c b/drivers/media/rc/pwm-ir-tx.c > index 105a9c24f1e3..7585c21775bc 100644 > --- a/drivers/media/rc/pwm-ir-tx.c > +++ b/drivers/media/rc/pwm-ir-tx.c > @@ -17,7 +17,7 @@ > > struct pwm_ir { > struct pwm_device *pwm; > - unsigned int carrier; > + struct pwm_state state; > unsigned int duty_cycle; > }; > > @@ -32,6 +32,7 @@ static int pwm_ir_set_duty_cycle(struct rc_dev *dev, u32 duty_cycle) > struct pwm_ir *pwm_ir = dev->priv; > > pwm_ir->duty_cycle = duty_cycle; > + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100); > > return 0; > } > @@ -43,7 +44,8 @@ static int pwm_ir_set_carrier(struct rc_dev *dev, u32 carrier) > if (!carrier) > return -EINVAL; > > - pwm_ir->carrier = carrier; > + pwm_ir->state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, carrier); > + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100); > > return 0; > } > @@ -53,21 +55,15 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, > { > struct pwm_ir *pwm_ir = dev->priv; > struct pwm_device *pwm = pwm_ir->pwm; > - struct pwm_state state; > int i; > ktime_t edge; > long delta; > > - pwm_init_state(pwm, &state); > - > - state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier); > - pwm_set_relative_duty_cycle(&state, pwm_ir->duty_cycle, 100); > - > edge = ktime_get(); > > for (i = 0; i < count; i++) { > - state.enabled = !(i % 2); > - pwm_apply_state(pwm, &state); > + pwm_ir->state.enabled = !(i % 2); > + pwm_apply_state(pwm, &pwm_ir->state); > > edge = ktime_add_us(edge, txbuf[i]); > delta = ktime_us_delta(edge, ktime_get()); > @@ -75,8 +71,8 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, > usleep_range(delta, delta + 10); > } > > - state.enabled = false; > - pwm_apply_state(pwm, &state); > + pwm_ir->state.enabled = false; > + pwm_apply_state(pwm, &pwm_ir->state); > > return count; > } > @@ -95,8 +91,9 @@ static int pwm_ir_probe(struct platform_device *pdev) > if (IS_ERR(pwm_ir->pwm)) > return PTR_ERR(pwm_ir->pwm); > > - pwm_ir->carrier = 38000; > - pwm_ir->duty_cycle = 50; > + pwm_ir->state.duty_cycle = 50; > + pwm_init_state(pwm_ir->pwm, &pwm_ir->state); > + pwm_ir_set_carrier(rcdev, 38000); > > rcdev = devm_rc_allocate_device(&pdev->dev, RC_DRIVER_IR_RAW_TX); > if (!rcdev) > > bloat-o-meter reports (for an arm allmodconfig build) > > add/remove: 0/0 grow/shrink: 3/1 up/down: 644/-396 (248) > Function old new delta > pwm_ir_probe 372 676 +304 > pwm_ir_set_carrier 108 292 +184 > pwm_ir_set_duty_cycle 68 224 +156 > pwm_ir_tx 908 512 -396 > Total: Before=2302, After=2550, chg +10.77% So 248 bytes more after your changes. > struct pwm_ir increases from 12 bytes to 40 bytes. > > The stack space required by pwm_ir_tx decreases from 60 to 36 > > I don't know exactly how kmalloc works internally. Maybe allocating a > structure of size 40 bytes doesn't need more memory than a structure of > size 12? > > I didn't check how runtimes change, but the size decrease of pwm_ir_tx() > is nice and might save a bit of runtime. I'm not following, how is this decreasing runtime? > Probably it depends on your focus if this change is good for you or not. Decreasing size is of course a good thing. > > just on the stack, and both ioctl handlers and the probe function need to > > recalculate the period/duty cycle, so there is a slight increase in code size. > > > > This change does not improve anything measurably and only increases code > > complexity. > > You did measure? Thanks for prototyping this. Sean
On Thu, Oct 28, 2021 at 01:26:10PM +0100, Sean Young wrote: > Hi Uwe, > > On Thu, Oct 28, 2021 at 01:15:35PM +0200, Uwe Kleine-König wrote: > > On Thu, Oct 28, 2021 at 10:14:42AM +0100, Sean Young wrote: > > > On Thu, Oct 28, 2021 at 08:45:13AM +0200, Uwe Kleine-König wrote: > > > > The conversion is right (I think), > > > > > > We still have the problem that the pwm drivers calculate the period > > > incorrectly by rounding down (except pwm-bcm2835). So the period is not > > > as good as it could be in most cases, but this driver can't do anything > > > about that. > > > > Yeah, some time ago I started coding a round_state function > > (wip at > > https://git.pengutronix.de/cgit/ukl/linux/commit/?h=pwm-wip&id=ae348eb6a55d6526f30ef4a49819197d9616391e) > > but this was pushed down on my todo-list by more important stuff. > > That looks great, thank you for working on that! > > > If you want to experiment with that ... > > I will have a look. > > > > > note this could be optimized a bit > > > > further: state.period only depends on carrier which rarely changes, so > > > > the calculation could be done in pwm_ir_set_carrier(). Ditto for duty > > > > which only depends on state.period and pwm_ir->duty_cycle. (This is for > > > > a separate commit though.) > > > > > > I'm not sure what caching this is much of a win. The calculation is a few > > > instructions, so you're not winning in the way of speed. On the flip side > > > you use more memory since pwm_state has to be kmalloc() rather than existing > > > > I tested a bit with this patch on top of Maíra's: > > > > diff --git a/drivers/media/rc/pwm-ir-tx.c b/drivers/media/rc/pwm-ir-tx.c > > index 105a9c24f1e3..7585c21775bc 100644 > > --- a/drivers/media/rc/pwm-ir-tx.c > > +++ b/drivers/media/rc/pwm-ir-tx.c > > @@ -17,7 +17,7 @@ > > > > struct pwm_ir { > > struct pwm_device *pwm; > > - unsigned int carrier; > > + struct pwm_state state; > > unsigned int duty_cycle; > > }; > > > > @@ -32,6 +32,7 @@ static int pwm_ir_set_duty_cycle(struct rc_dev *dev, u32 duty_cycle) > > struct pwm_ir *pwm_ir = dev->priv; > > > > pwm_ir->duty_cycle = duty_cycle; > > + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100); > > > > return 0; > > } > > @@ -43,7 +44,8 @@ static int pwm_ir_set_carrier(struct rc_dev *dev, u32 carrier) > > if (!carrier) > > return -EINVAL; > > > > - pwm_ir->carrier = carrier; > > + pwm_ir->state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, carrier); > > + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100); > > > > return 0; > > } > > @@ -53,21 +55,15 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, > > { > > struct pwm_ir *pwm_ir = dev->priv; > > struct pwm_device *pwm = pwm_ir->pwm; > > - struct pwm_state state; > > int i; > > ktime_t edge; > > long delta; > > > > - pwm_init_state(pwm, &state); > > - > > - state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier); > > - pwm_set_relative_duty_cycle(&state, pwm_ir->duty_cycle, 100); > > - > > edge = ktime_get(); > > > > for (i = 0; i < count; i++) { > > - state.enabled = !(i % 2); > > - pwm_apply_state(pwm, &state); > > + pwm_ir->state.enabled = !(i % 2); > > + pwm_apply_state(pwm, &pwm_ir->state); > > > > edge = ktime_add_us(edge, txbuf[i]); > > delta = ktime_us_delta(edge, ktime_get()); > > @@ -75,8 +71,8 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, > > usleep_range(delta, delta + 10); > > } > > > > - state.enabled = false; > > - pwm_apply_state(pwm, &state); > > + pwm_ir->state.enabled = false; > > + pwm_apply_state(pwm, &pwm_ir->state); > > > > return count; > > } > > @@ -95,8 +91,9 @@ static int pwm_ir_probe(struct platform_device *pdev) > > if (IS_ERR(pwm_ir->pwm)) > > return PTR_ERR(pwm_ir->pwm); > > > > - pwm_ir->carrier = 38000; > > - pwm_ir->duty_cycle = 50; > > + pwm_ir->state.duty_cycle = 50; > > + pwm_init_state(pwm_ir->pwm, &pwm_ir->state); > > + pwm_ir_set_carrier(rcdev, 38000); > > > > rcdev = devm_rc_allocate_device(&pdev->dev, RC_DRIVER_IR_RAW_TX); > > if (!rcdev) > > > > bloat-o-meter reports (for an arm allmodconfig build) > > > > add/remove: 0/0 grow/shrink: 3/1 up/down: 644/-396 (248) > > Function old new delta > > pwm_ir_probe 372 676 +304 > > pwm_ir_set_carrier 108 292 +184 > > pwm_ir_set_duty_cycle 68 224 +156 > > pwm_ir_tx 908 512 -396 > > Total: Before=2302, After=2550, chg +10.77% > > So 248 bytes more after your changes. ack. This is because the compiler inlines the division which accounts for > 100 bytes. > > struct pwm_ir increases from 12 bytes to 40 bytes. > > > > The stack space required by pwm_ir_tx decreases from 60 to 36 > > > > I don't know exactly how kmalloc works internally. Maybe allocating a > > structure of size 40 bytes doesn't need more memory than a structure of > > size 12? > > > > I didn't check how runtimes change, but the size decrease of pwm_ir_tx() > > is nice and might save a bit of runtime. > > I'm not following, how is this decreasing runtime? With my changes pwm_ir_tx got smaller and { pwm_ir_probe, pwm_ir_set_carrier, pwm_ir_set_duty_cycle } got bigger. Now if for a typical runtime pattern pwm_ir_probe and pwm_ir_set_carrier run once and pwm_ir_set_duty_cycle 100 times and pwm_ir_tx 1000 times (no idea if that is realistic) it might be a net win in sum. Best regards Uwe
On Thu, Oct 28, 2021 at 08:05:16PM +0200, Uwe Kleine-König wrote: > On Thu, Oct 28, 2021 at 01:26:10PM +0100, Sean Young wrote: > > > bloat-o-meter reports (for an arm allmodconfig build) > > > > > > add/remove: 0/0 grow/shrink: 3/1 up/down: 644/-396 (248) > > > Function old new delta > > > pwm_ir_probe 372 676 +304 > > > pwm_ir_set_carrier 108 292 +184 > > > pwm_ir_set_duty_cycle 68 224 +156 > > > pwm_ir_tx 908 512 -396 > > > Total: Before=2302, After=2550, chg +10.77% > > > > So 248 bytes more after your changes. > > ack. This is because the compiler inlines the division which accounts > for > 100 bytes. I'm surprised it's that large. This is on 32 bit? > > > struct pwm_ir increases from 12 bytes to 40 bytes. > > > > > > The stack space required by pwm_ir_tx decreases from 60 to 36 > > > > > > I don't know exactly how kmalloc works internally. Maybe allocating a > > > structure of size 40 bytes doesn't need more memory than a structure of > > > size 12? > > > > > > I didn't check how runtimes change, but the size decrease of pwm_ir_tx() > > > is nice and might save a bit of runtime. > > > > I'm not following, how is this decreasing runtime? > > With my changes pwm_ir_tx got smaller and { pwm_ir_probe, > pwm_ir_set_carrier, pwm_ir_set_duty_cycle } got bigger. Now if for a > typical runtime pattern pwm_ir_probe and pwm_ir_set_carrier run once and > pwm_ir_set_duty_cycle 100 times and pwm_ir_tx 1000 times (no idea if > that is realistic) it might be a net win in sum. The two most common programs for sending IR are ir-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/ir-ctl/ir-ctl.c#n1041 lircd: https://sourceforge.net/p/lirc/git/ci/master/tree/lib/transmit.c For each transmission, the carrier is set. If the duty cyle is specified, then that is set too. Then the transmit itself is done. Both of them set the carrier and duty cycle (if required) for every transmission: setting the carrier and duty cycle is a cheap operation, and it is device property which can be overriden by another process. This means with your changes, if the carrier and duty cycle are both set for each transmission, then we're doing more work. If only the carrier is set for each transmission, then there is no net gain/loss (I think), but the code size has increased. Thanks for prototyping this. Sean
On Fri, Oct 29, 2021 at 08:16:08AM +0100, Sean Young wrote: > On Thu, Oct 28, 2021 at 08:05:16PM +0200, Uwe Kleine-König wrote: > > On Thu, Oct 28, 2021 at 01:26:10PM +0100, Sean Young wrote: > > > > bloat-o-meter reports (for an arm allmodconfig build) > > > > > > > > add/remove: 0/0 grow/shrink: 3/1 up/down: 644/-396 (248) > > > > Function old new delta > > > > pwm_ir_probe 372 676 +304 > > > > pwm_ir_set_carrier 108 292 +184 > > > > pwm_ir_set_duty_cycle 68 224 +156 > > > > pwm_ir_tx 908 512 -396 > > > > Total: Before=2302, After=2550, chg +10.77% > > > > > > So 248 bytes more after your changes. > > > > ack. This is because the compiler inlines the division which accounts > > for > 100 bytes. > > I'm surprised it's that large. This is on 32 bit? Yes, it's a 64 bit division on 32 bit ARM. > > > > struct pwm_ir increases from 12 bytes to 40 bytes. > > > > > > > > The stack space required by pwm_ir_tx decreases from 60 to 36 > > > > > > > > I don't know exactly how kmalloc works internally. Maybe allocating a > > > > structure of size 40 bytes doesn't need more memory than a structure of > > > > size 12? > > > > > > > > I didn't check how runtimes change, but the size decrease of pwm_ir_tx() > > > > is nice and might save a bit of runtime. > > > > > > I'm not following, how is this decreasing runtime? > > > > With my changes pwm_ir_tx got smaller and { pwm_ir_probe, > > pwm_ir_set_carrier, pwm_ir_set_duty_cycle } got bigger. Now if for a > > typical runtime pattern pwm_ir_probe and pwm_ir_set_carrier run once and > > pwm_ir_set_duty_cycle 100 times and pwm_ir_tx 1000 times (no idea if > > that is realistic) it might be a net win in sum. > > The two most common programs for sending IR are > > ir-ctl: https://git.linuxtv.org/v4l-utils.git/tree/utils/ir-ctl/ir-ctl.c#n1041 > lircd: https://sourceforge.net/p/lirc/git/ci/master/tree/lib/transmit.c > > For each transmission, the carrier is set. If the duty cyle is specified, > then that is set too. Then the transmit itself is done. Both of them > set the carrier and duty cycle (if required) for every transmission: setting > the carrier and duty cycle is a cheap operation, and it is device property > which can be overriden by another process. > > This means with your changes, if the carrier and duty cycle are both set > for each transmission, then we're doing more work. If only the carrier > is set for each transmission, then there is no net gain/loss (I think), > but the code size has increased. OK, then I discard my patch. While reading that I wondered if it makes sense to have a callback that sets both carrier and duty cycle and then remove the other two. Best regards Uwe
On Fri, Oct 29, 2021 at 01:06:02PM +0200, Uwe Kleine-König wrote: > On Fri, Oct 29, 2021 at 08:16:08AM +0100, Sean Young wrote: > > This means with your changes, if the carrier and duty cycle are both set > > for each transmission, then we're doing more work. If only the carrier > > is set for each transmission, then there is no net gain/loss (I think), > > but the code size has increased. > > OK, then I discard my patch. > > While reading that I wondered if it makes sense to have a callback that > sets both carrier and duty cycle and then remove the other two. There are separate lirc ioctls to set carrier and duty cycle, that's why there are separate callbacks. Sean
I would like to thank you guys for the attention and this interesting discussion. I'm looking for some work in the kernel and I would like to know if you guys have any suggestions for beginner tasks in this subsystem. I have solid knowledge in C programming, but I started in the kernel a couple of weeks ago. Anyway, thank you for all the feedback. Maíra
Hello, On Fri, Oct 29, 2021 at 09:08:29AM -0300, Maíra Canal wrote: > I would like to thank you guys for the attention and this interesting > discussion. I'm looking for some work in the kernel and I would like > to know if you guys have any suggestions for beginner tasks in this > subsystem. I have solid knowledge in C programming, but I started in > the kernel a couple of weeks ago. Anyway, thank you for all the > feedback. If you want something mathematically demanding, you can pick up the patch I pointed out to Sean, I think I won't find the time in the near future to work on this. The background is that PWM drivers have to round most requests and there is no uniformity among drivers and so if a consumer (e.g. the pwm-ir driver) requests say 20000 ns, it will get 18000 from some drivers and maybe 25000 from others. So the idea is to have a function pwm_round_state that has fixed rounding rules such that a consumer can pick the best setting for their use-case. Something more mechanic in the PWM area is to convert drivers that still implement .config/.enable/.disable to .apply. See http://patchwork.ozlabs.org/project/linux-pwm/patch/20211029105617.210178-1-u.kleine-koenig@pengutronix.de/ for an example. The well-known good template is pwm_apply_legacy() after applying http://patchwork.ozlabs.org/project/linux-pwm/list/?series=251456 . If you want something more global: The prototype of the remove callbacks for platform devices returns an int: https://elixir.bootlin.com/linux/v5.15-rc7/source/include/linux/platform_device.h#L206 However the returned value is (nearly) ignored by the driver core: https://elixir.bootlin.com/linux/v5.15-rc7/source/drivers/base/platform.c#L1433 The longterm goal is to change the prototype of .remove to return void. As a first step making all functions return 0 is a worthwile project. The same problem exists for several other buses, one patch I sent to work on this goal for i2c is: https://lore.kernel.org/r/20211021105657.72572-1-u.kleine-koenig@pengutronix.de Best regards Uwe
On Fri, Oct 29, 2021 at 09:08:29AM -0300, Maíra Canal wrote: > I would like to thank you guys for the attention and this interesting > discussion. I'm looking for some work in the kernel and I would like > to know if you guys have any suggestions for beginner tasks in this > subsystem. I have solid knowledge in C programming, but I started in > the kernel a couple of weeks ago. Anyway, thank you for all the > feedback. Thank you for your contributions. rc-core (drivers/media/rc) is in good shape and I don't know of any outstanding issues. There is a ton of work around dvb, but this requires actual dvb hardware to test. - port drivers/media/usb/dvb-usb to drivers/media/usb/dvb-usb-v2 - Re-write dvb frontend without dvb_attach() - Implement dma-buf for dvb Like I said, you'll need actual hardware to test against, and this is probably not beginner tasks. Sean
Hi Uwe, On Thu, Oct 28, 2021 at 01:15:35PM +0200, Uwe Kleine-König wrote: > On Thu, Oct 28, 2021 at 10:14:42AM +0100, Sean Young wrote: > > We still have the problem that the pwm drivers calculate the period > > incorrectly by rounding down (except pwm-bcm2835). So the period is not > > as good as it could be in most cases, but this driver can't do anything > > about that. > > Yeah, some time ago I started coding a round_state function > (wip at > https://git.pengutronix.de/cgit/ukl/linux/commit/?h=pwm-wip&id=ae348eb6a55d6526f30ef4a49819197d9616391e) > but this was pushed down on my todo-list by more important stuff. > > If you want to experiment with that ... I was thinking about this problem this morning. - The pwm-ir-tx driver gets a carrier set in Hz, which it has to convert to a period (1e9 / carrier). There is loss of accuracy there. - When it gets to the pwm driver, the period is converted into the format the pwm hardware expects. For example the pwm-bcm2835 driver converts it into clock cycles (1e9 / 8e8). Both calculations involve loss of accuracy because of integer representation. Would it make more sense for the pwm interface to use numer/denom rational numbers? struct rational { u64 numer; u64 denom; }; If pwm-ir-tx would like to set the carrier, it could it like so: struct rational period = { .numer = NUSEC_PER_SEC, .denom = carrier, }; pwm_set_period(&period); Now pwm-bcm2835 could do it like so: int bcm2835_set_period(struct rational *period) { struct rational rate = { .numer = NUSEC_PER_SEC, .denum = clk_get_rate(clk), }; rational_div(&rate, period); int step = rational_to_u64(&rate); } Alternatively, since most of the pwm hardware is doing scaling based on the clock (I think), would not make more sense for the pwm driver interface to take a frequency rather than a period? Then the integer calculations can be simpler: just divide the clock rate by the required frequency and you have the period. Just some thoughts. Sean
On Sun, Oct 31, 2021 at 10:39:34AM +0000, Sean Young wrote: > Hi Uwe, > > On Thu, Oct 28, 2021 at 01:15:35PM +0200, Uwe Kleine-König wrote: > > On Thu, Oct 28, 2021 at 10:14:42AM +0100, Sean Young wrote: > > > We still have the problem that the pwm drivers calculate the period > > > incorrectly by rounding down (except pwm-bcm2835). So the period is not > > > as good as it could be in most cases, but this driver can't do anything > > > about that. > > > > Yeah, some time ago I started coding a round_state function > > (wip at > > https://git.pengutronix.de/cgit/ukl/linux/commit/?h=pwm-wip&id=ae348eb6a55d6526f30ef4a49819197d9616391e) > > but this was pushed down on my todo-list by more important stuff. > > > > If you want to experiment with that ... > > I was thinking about this problem this morning. > > - The pwm-ir-tx driver gets a carrier set in Hz, which it has to convert to > a period (1e9 / carrier). There is loss of accuracy there. Ack, and the loss is known. > - When it gets to the pwm driver, the period is converted into the format > the pwm hardware expects. For example the pwm-bcm2835 driver converts > it into clock cycles (1e9 / 8e8). > > Both calculations involve loss of accuracy because of integer representation. > > Would it make more sense for the pwm interface to use numer/denom rational > numbers? This is quite a complication because then all lowlevel driver needs to do fractal math. I don't want to open that can of worms. > struct rational { > u64 numer; > u64 denom; > }; > > If pwm-ir-tx would like to set the carrier, it could it like so: > > struct rational period = { > .numer = NUSEC_PER_SEC, > .denom = carrier, > }; > > pwm_set_period(&period); > > Now pwm-bcm2835 could do it like so: > > int bcm2835_set_period(struct rational *period) > { > struct rational rate = { > .numer = NUSEC_PER_SEC, > .denum = clk_get_rate(clk), > }; > > rational_div(&rate, period); > > int step = rational_to_u64(&rate); > } > > Alternatively, since most of the pwm hardware is doing scaling based on the > clock (I think), would not make more sense for the pwm driver interface to > take a frequency rather than a period? Then the integer calculations can be > simpler: just divide the clock rate by the required frequency and you have > the period. I think the rounding approach is easier and gives you the optimal setting, too. With a carrier of say 455 KHz you want a period of 1e9 / (455 kHz) = 2197.802197802198 ns. Now assuming a pwm clk of 10 MHz the pwm-bcm2835 driver can offer you 2100 ns and 2200 ns and the pwm-ir-tx driver can pick the one it prefers. (OK, this works so nicely because the pwm-bcm2835 driver has a nice clk rate, but also if it were say 66666000 Hz, it would offer you 2191 ns and 2206 ns and you could pick your favourite. (In this case there is a rounding error because the actual periods are 2190.021900219002 ns and 2205.022050220502 ns, but I would expect that this doesn't influence the choice and to improve here we probably would have to change the unit of clk_get_rate, too.) Best regards Uwe
diff --git a/drivers/media/rc/pwm-ir-tx.c b/drivers/media/rc/pwm-ir-tx.c index 4bc28d2c9cc9..105a9c24f1e3 100644 --- a/drivers/media/rc/pwm-ir-tx.c +++ b/drivers/media/rc/pwm-ir-tx.c @@ -53,22 +53,21 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, { struct pwm_ir *pwm_ir = dev->priv; struct pwm_device *pwm = pwm_ir->pwm; - int i, duty, period; + struct pwm_state state; + int i; ktime_t edge; long delta; - period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier); - duty = DIV_ROUND_CLOSEST(pwm_ir->duty_cycle * period, 100); + pwm_init_state(pwm, &state); - pwm_config(pwm, duty, period); + state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier); + pwm_set_relative_duty_cycle(&state, pwm_ir->duty_cycle, 100); edge = ktime_get(); for (i = 0; i < count; i++) { - if (i % 2) // space - pwm_disable(pwm); - else - pwm_enable(pwm); + state.enabled = !(i % 2); + pwm_apply_state(pwm, &state); edge = ktime_add_us(edge, txbuf[i]); delta = ktime_us_delta(edge, ktime_get()); @@ -76,7 +75,8 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf, usleep_range(delta, delta + 10); } - pwm_disable(pwm); + state.enabled = false; + pwm_apply_state(pwm, &state); return count; }
Remove legacy PWM interface (pwm_config, pwm_enable, pwm_disable) and replace it for the atomic PWM API. Signed-off-by: Maíra Canal <maira.canal@usp.br> Reported-by: kernel test robot <lkp@intel.com> --- V1 -> V2: Assign variables directly and simplify conditional statement V2 -> V3: Fix declaration of undeclared variables V3 -> V4: Fix DIV_ROUND_CLOSEST error with u64 variables --- drivers/media/rc/pwm-ir-tx.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)