Use non-linear reward function for numerical habits #642

iSoron (Maintainer) · 2020-09-13T23:55:28Z

Suppose you are trying to form the habit of meditating 20 minutes per day. In the dev branch, if you consistently meditate daily for only 10 minutes, your score will eventually converge to 50%, and if you consistently meditate for only 5 minutes, your score will converge to 25%. That is, our current reward function is linear.

In #42 and in some private email conversations, @drbbrd suggested the usage of a non-linear reward function instead. In his original post, @drbbrd wrote:

For many habits (perhaps most), greater benefits are attained for the early repetitions, with diminishing returns for doing more of the same thing. What makes a habit a life-changing force is the ability to maintain it over the long term, by "just showing up" every day. By awarding more points for the first few repetitions than for additional repetitions, the user is encouraged to at least do a little bit (when their motivation is low), which helps tremendously in entrenching a habit as part of the normal daily routine.
[...]
A concrete example of a diminishing returns habit might be daily meditation. Twenty minutes per day might be a fine target to aim for, but a lot of credit should be given for "just showing up" and doing the first two minutes (which is tougher than it sounds!). The sqrt(n) scoring function accomplishes that smarter pay-off nicely. The user might obtain 30 points for a 2-minute session, 42 points for a 4-minute session, 52 points for 6 minutes, or 60 points for 8 minutes (30 * sqrt(4) = 60). More is better, but only moderately better. The user just enters the number of 2-minute intervals they did, and the app does the rest automagically.

In a sense, Loop already follows this philosophy. For example, when you are just starting out a new yes-no habit, each repetition gives you a significant reward and your score increases rapidly; as the score approaches 100%, the rewards get smaller and smaller. For numerical habits, it may also make sense to use a non-linear function. In the meditation example, 10 minutes could give you a score of 80%, 15 minutes could give you 95%, and 20 minutes could give you 100%. I am opening this issue to discuss this possible enhancement and gather some feedback.

Thank you for the original suggestion, @drbbrd.

iSoron (Maintainer, Author) · 2020-09-14T00:22:33Z

I am open to this feature and I think it could be easily integrated in the app. Specifically, in forceRecompute, we already calculate percentageCompleted, which goes from 0.0 to 1.0. We would just need to apply a non-linear function to this value before feeding it to Score.compute.
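A minimal sketch of that pipeline, written in Python for brevity rather than in the app's own language. Here `update_score` is a hypothetical stand-in for Score.compute (a plain exponential-smoothing step with an assumed `alpha`), not the app's actual formula, and `transform` is whichever non-linear function ends up being chosen.

```python
import math
from typing import Callable


def update_score(previous: float, value: float, alpha: float = 0.05) -> float:
    """Hypothetical stand-in for Score.compute: one exponential-smoothing step."""
    return previous * (1.0 - alpha) + value * alpha


def simulate(percentage_completed: float,
             transform: Callable[[float], float],
             days: int = 1000) -> float:
    """Repeat the same daily completion and return the score it settles on."""
    score = 0.0
    for _ in range(days):
        # The only change to the pipeline: transform the percentage first.
        score = update_score(score, transform(percentage_completed))
    return score


linear = simulate(0.5, lambda p: p)  # identity transform: settles near 0.5
curved = simulate(0.5, math.sqrt)    # sqrt transform: settles near sqrt(0.5)
print(f"linear: {linear:.3f}, sqrt: {curved:.3f}")
```

Under this assumed update rule, the long-run score is simply the transformed percentage, so the choice of transform fully determines where a consistent partial effort converges.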
As for which non-linear function to use, I am not sure sqrt is a good function to use; it seems somewhat arbitrary and hard to explain to the users. A more natural function to me would be something like the Pareto principle, or 80/20 rule: achieving 20% of your target gives you 80% of the total rewards, and the same thing applies recursively (achieving 20% of the remaining target gives you 80% of the remaining rewards). The specific numbers (20/80) could be tweaked. The exact expression would be:

f(perc, reward)(x) = 1 - (1 - reward)**(log(1 - x) / log(1 - perc))

where x is the fraction of the target completed (0 <= x < 1).

Here is a chart of this function. In the blue curve, f(0.2, 0.8), 20% of the target gives you 80% of the rewards. In the green curve, f(0.5, 0.8), 50% of the target gives you 80% of the rewards. The red curve, f(0.5, 0.5), is just the identity.
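The expression above can be sketched in Python as follows. The function name `pareto` and the curried form (parameters first, then the completion `x`) are choices made for this illustration; the arithmetic is the expression as written, valid for 0 <= x < 1.

```python
import math
from typing import Callable


def pareto(perc: float, reward: float) -> Callable[[float], float]:
    """Return the curve x -> 1 - (1 - reward) ** (log(1 - x) / log(1 - perc)).

    Assumes 0 < perc < 1, 0 < reward < 1 and 0 <= x < 1.
    """
    def curve(x: float) -> float:
        exponent = math.log(1.0 - x) / math.log(1.0 - perc)
        return 1.0 - (1.0 - reward) ** exponent
    return curve


blue = pareto(0.2, 0.8)   # 20% of the target gives 80% of the reward
green = pareto(0.5, 0.8)  # 50% of the target gives 80% of the reward
red = pareto(0.5, 0.5)    # reduces to the identity

# Recursion check: 20% of the remaining 80% brings completion to 36%,
# which should earn 80% plus 80% of the remaining 20%, i.e. 96%.
print(round(blue(0.2), 4), round(blue(0.36), 4), round(red(0.3), 4))
```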

drbbrd · 2020-09-14T10:28:46Z

Keeping the measure as a percentage of completion is a fine idea, for consistency. The user enters their real goal, and gets rewarded accordingly in a non-linear fashion. (Some goals are more important than others, however, which is where weighting comes into play).

The 1/sqrt(n) function is far from arbitrary. It is seen throughout nature and physics, and is something of a natural law of our Universe. The standard error of a sample mean drops off at a rate of 1/sqrt(n) in the sample size n, for example.

In addition to the above curves, you could add a plot of sqrt(completion). Thus, 1% completion would earn sqrt(0.01) = 10%, 4% completion would earn 20%, 25% completion would earn 50%, 50% completion would earn 70.7%, all the way up to 100% getting 100%.

There are many power laws that also occur naturally, so I would suggest evaluating them side by side for particular use cases, and deciding which one intuitively "feels" the most appropriate. Mathematical taste is subjective, but highly numerate people have usually developed the best taste. I have found many cases where exponentials are used when a quadratic would be far more appropriate due to the statistical nature of the process (game XP leveling is one example).
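One way to compare power laws side by side is to tabulate completion ** k for a few exponents, with k = 0.5 being the square root. The sketch below is only an illustration: the name `power_reward` and the particular exponents are assumptions, not something fixed in this thread.

```python
def power_reward(completion: float, k: float) -> float:
    """Power-law reward: completion ** k (k = 0.5 is the square root)."""
    return completion ** k


exponents = (0.2, 0.4, 0.5, 0.8, 1.0)  # illustrative choices; 1.0 is linear
for completion in (0.01, 0.04, 0.25, 0.50, 1.00):
    row = "  ".join(
        f"k={k}: {power_reward(completion, k):6.1%}" for k in exponents
    )
    print(f"{completion:4.0%} done -> {row}")
```

Smaller exponents reward the first few percent of the target more heavily, while every exponent still gives exactly 100% at full completion.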

If the goal is to meditate for 20 minutes a day, then I'm not convinced that 4 minutes should be worth an 80% score. We don't want to incentivize doing no more than the minimum (like some Pareto proponents who suggest aiming for only the 20% effort that matters most). In comparison, sqrt(0.2) = 45% "feels" more on target to me (and perhaps that can be improved further).

drbbrd · 2020-09-14T11:13:22Z

With regard to having an overall score for all habits being tracked, there is a natural way to achieve that, using a linear combination of coefficients provided by the user.

In the example I provided, the user provided relative weights for 8 habits of 20 + 30 + 50 + 200 + 100 + 30 + 30 + 40 = 500. Those are normalized to coefficients of 0.04, 0.06, 0.10, 0.40, 0.20, 0.06, 0.06, 0.08, summing to 1.0.

The weighted sum of those individual scores produces a total score in the range 0 to 1, which is displayed as a percentage.
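The normalization can be sketched in a few lines of Python. The weights are the ones quoted above; the per-habit scores are made-up placeholders, and the function name `overall_score` is hypothetical.

```python
weights = [20, 30, 50, 200, 100, 30, 30, 40]
total = sum(weights)                         # 500
coefficients = [w / total for w in weights]  # 0.04, 0.06, 0.10, 0.40, ...


def overall_score(scores: list[float], coefficients: list[float]) -> float:
    """Weighted sum of per-habit scores; stays in 0..1 if each score does."""
    return sum(c * s for c, s in zip(coefficients, scores))


# Placeholder per-habit scores, for illustration only.
scores = [1.0, 0.5, 0.0, 1.0, 0.75, 1.0, 0.25, 0.5]
print(f"{overall_score(scores, coefficients):.1%}")
```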

drbbrd · 2020-09-14T12:44:03Z

Also notice that sqrt(completion) can gracefully handle numerical values greater than 100% (e.g. 200% completion, for 40 minutes of meditation, awards a 141% score). There wouldn't need to be a hard cap of 100% if the user did even better than just fulfilling the goal.

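In contrast, the Pareto expression breaks down at exactly 100% completion: log(1 - x) is undefined at x = 1 (the curve only approaches 100% in the limit), and it has no real value beyond that.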
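The difference shows up directly when both functions are evaluated at or past the target. In the sketch below, `clamped_pareto_reward` is one possible guard and is purely an assumption of this example; nothing in the thread settles how over-achievement should be handled for the Pareto curve.

```python
import math


def sqrt_reward(completion: float) -> float:
    return math.sqrt(completion)


def pareto_reward(x: float, perc: float = 0.5, reward: float = 0.8) -> float:
    # math.log raises ValueError for arguments <= 0, i.e. for x >= 1.
    return 1.0 - (1.0 - reward) ** (math.log(1.0 - x) / math.log(1.0 - perc))


def clamped_pareto_reward(x: float, perc: float = 0.5, reward: float = 0.8) -> float:
    """Hypothetical guard: return the limit value 1.0 at or beyond the target."""
    if x >= 1.0:
        return 1.0
    return pareto_reward(x, perc, reward)


print(f"sqrt at 200%: {sqrt_reward(2.0):.0%}")
try:
    pareto_reward(1.0)
except ValueError as err:
    print("pareto at 100% failed:", err)
print(f"clamped pareto at 200%: {clamped_pareto_reward(2.0):.0%}")
```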

iSoron (Maintainer, Author) · 2020-09-14T13:14:06Z

Here is a chart comparing pareto(0.5, 0.8) (in green) with sqrt(x) (in blue):

EDIT: Adding a few more charts. Here is x**k for k in [0.05, 0.1, 0.2, 0.4, 0.8]:

Here is pareto(0.5, k) for k in [0.5, 0.6, 0.7, 0.8, 0.9]:

drbbrd · 2020-09-14T13:33:45Z

Thank you for the additional visuals. Here is a table of 10% increments on completion, comparing pareto(0.5, 0.8) with sqrt(completion), rounding off to whole percentages.

I personally think the high end of the Pareto curve is problematic, offering only a 2% increase for going from 80% to 90% completion, and almost nothing for 90% to 100%.
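A comparison of this kind can be regenerated with a short Python script. The sketch below recomputes the values from the two formulas rather than copying any original table, and it treats 100% completion as the Pareto curve's limit value of 1.0, which is an assumption of this example.

```python
import math


def pareto_reward(x: float, perc: float = 0.5, reward: float = 0.8) -> float:
    if x >= 1.0:
        return 1.0  # limit of the expression as x approaches 1 (assumed here)
    return 1.0 - (1.0 - reward) ** (math.log(1.0 - x) / math.log(1.0 - perc))


def sqrt_reward(x: float) -> float:
    return math.sqrt(x)


print("completion  pareto(0.5, 0.8)  sqrt")
for step in range(11):
    x = step / 10
    print(f"{x:>10.0%}  {pareto_reward(x):>16.0%}  {sqrt_reward(x):>4.0%}")
```

The last rows make the flattening visible: the Pareto column barely moves between 80% and 100% completion, while the square-root column keeps climbing.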

drbbrd · 2020-09-15T22:40:38Z

After re-working the example data with the revised scoring method, I think it is an improvement.

Each habit has a numerical target set by the user (e.g. "10", which might be 10 reps, or 10 minutes). This is their definition of "a good day" for that habit.

Each habit is scored as sqrt(actual / target), and it is fine if actual is greater than target (good for you!).

The default is to weigh all habits equally, so the overall score works immediately (no barrier to entry, no need to dive into extra details).

If the user provides a relative value for each habit (on whatever scale they want), then the weighted sum produces a more accurate overall score.

Thank you for the discussion that has led to this improved formulation!

(I've shown daily scores out of 1000 instead of 100, for a little more precision).
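Put together, the formulation above might look like the following Python sketch. The function names, the `(actual, target)` tuple layout, and the sample numbers are all assumptions made for illustration; only the sqrt(actual / target) rule, the equal-weight default, and the 0 to 1000 scale come from the description.

```python
import math
from typing import Optional, Sequence, Tuple


def habit_score(actual: float, target: float) -> float:
    """Per-habit score; exceeds 1.0 when actual is greater than target."""
    return math.sqrt(actual / target)


def daily_score(habits: Sequence[Tuple[float, float]],
                weights: Optional[Sequence[float]] = None,
                scale: int = 1000) -> int:
    """Weighted average of habit scores, shown out of `scale`.

    habits is a sequence of (actual, target) pairs. When weights is None,
    every habit counts equally, so the score works with no extra setup.
    """
    if weights is None:
        weights = [1.0] * len(habits)
    total = sum(weights)
    weighted = sum(w * habit_score(a, t) for w, (a, t) in zip(weights, habits))
    return round(scale * weighted / total)


# Made-up day: one habit fully met, one at a quarter of its target.
day = [(10, 10), (5, 20)]
print(daily_score(day))                  # equal weights
print(daily_score(day, weights=[3, 1]))  # first habit valued three times as much
```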

KraXen72 · 2025-11-23T09:46:22Z

Hello, is there any progress on this? thanks!
