Make Predicate stack-safe using Eval #283

bbjubjub2494 · 2020-03-05T17:39:13Z

Fixes the last comment in #142. Also adds contramap() ~~to compensate for the loss of inheritance from Function1.~~

Benchmarks

pre-existing 0.9.0 stack-unsafe implementation

(at 992770b)

Benchmark                                               (n)   Mode  Cnt        Score       Error  Units
ChainedPredicateBench.catsCollectionsPredicateUnravel    10  thrpt   25  8168148.133 ± 83522.583  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel   100  thrpt   25   635545.444 ±  3674.601  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel  1000  thrpt   25    51776.603 ±   335.972  ops/s

(SOE at n = 10,000)

this PR (mark 3)

Benchmark                                                (n)   Mode  Cnt        Score       Error  Units
ChainedPredicateBench.catsCollectionsPredicateUnravel     10  thrpt   25  3092451.360 ± 20246.653  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel    100  thrpt   25   311739.000 ±  3842.162  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel   1000  thrpt   25    27788.238 ±   226.262  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel  10000  thrpt   25     2626.175 ±    20.365  ops/s

That's roughly -50% at n = 100 and -45% at n = 1,000. This was a contrived, intensive example with all the stress on the internals, so I would consider it a worst-case scenario. Most real-world intensional sets probably put most of their complexity in the basic predicates rather than in thousands of combinations. Hence I would say -50% is about the worst performance degradation to expect in general. Is that acceptable?
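
For reference, the relative changes can be recomputed from the two tables above. The following is a minimal sketch using only the Scala standard library; the object and method names (`BenchDelta`, `percentChange`) are made up for illustration and are not part of the PR.

```scala
object BenchDelta {
  /** Relative throughput change in percent; negative means the newer build is slower. */
  def percentChange(before: Double, after: Double): Double =
    (after - before) / before * 100.0

  // (n, ops/s on 0.9.0 at 992770b, ops/s on this PR), copied from the tables above.
  val rows: List[(Int, Double, Double)] = List(
    (10, 8168148.133, 3092451.360),
    (100, 635545.444, 311739.000),
    (1000, 51776.603, 27788.238)
  )

  def main(args: Array[String]): Unit =
    rows.foreach { case (n, before, after) =>
      println(f"n = $n%5d: ${percentChange(before, after)}%.1f%%")
    }
}
```

With these inputs the change comes out at about -51% for n = 100 and about -46% for n = 1,000, in line with the figures quoted, while the n = 10 row comes out at about -62%.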

johnynek

I'm concerned with the drop in performance here. Can you make a benchmark?

There are other ways to make Predicate stack safe without giving up all the performance (see what was done in cats): https://github.com/typelevel/cats/blob/master/core/src/main/scala/cats/data/AndThen.scala

core/src/main/scala/cats/collections/Predicate.scala

bbjubjub2494 · 2020-03-05T19:15:26Z

> There are other ways to make Predicate stack safe without giving up all the performance (see what was done in cats): https://github.com/typelevel/cats/blob/master/core/src/main/scala/cats/data/AndThen.scala

Will get back to you on that. I think a mix between using AndThen where possible and Eval otherwise might work.

codecov-io · 2020-03-05T19:49:27Z

Codecov Report

Merging #283 into master will decrease coverage by 0.08%.
The diff coverage is 86.66%.

@@            Coverage Diff             @@
##           master     #283      +/-   ##
==========================================
- Coverage    91.5%   91.42%   -0.09%     
==========================================
  Files          24       24              
  Lines        1625     1632       +7     
  Branches      214      219       +5     
==========================================
+ Hits         1487     1492       +5     
- Misses        138      140       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...re/src/main/scala/cats/collections/Predicate.scala | `79.31% <86.66%> (+2.03%)` | ⬆️ |
| core/src/main/scala/cats/collections/BitSet.scala | `97.14% <0%> (-0.22%)` | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0a98846...1c4a746.

bbjubjub2494 · 2020-04-08T21:37:06Z

The PR is stabilized and finalized. @johnynek would you mind reviewing again or pinging somebody else?

bbjubjub2494 · 2020-07-28T19:01:19Z

core/src/main/scala/cats/collections/Predicate.scala


  /**
   * Return the opposite predicate
   */
  def unary_!(): Predicate[A] = negate
+
+  /**
+   * compose with a function the predicate.


Suggested change

-   * compose with a function the predicate.
+   * compose a function with the predicate.

Typo on my part, will fix promptly.

johnynek

I left some comments. I'm not trying to totally block this, but I'm concerned about performance. Also, I think an AST approach where you keep the logical structure and potentially leverage laws to simplify could be interesting.

johnynek · 2020-07-28T19:11:06Z

core/src/main/scala/cats/collections/Predicate.scala

 */
-abstract class Predicate[-A] extends scala.Function1[A, Boolean] { self =>
+final class Predicate[-A] private(private val p: AndThen[A, Eval[Boolean]]) { self =>


I'm still pretty skeptical of the need for both AndThen and Eval. Eval comes with a big performance penalty, like 100x isn't uncommon.

Is blowing the stack very common here?

Next, why not just make type Predicate[-A] = Kleisli[Eval, A, Boolean] and then have some methods on that if you want to have a stack safe predicate? Why a new class?

Regarding performance, I'm not seeing any 100x: I posted a benchmark in the opening post and it seems to be at 2x worst case scenario. I might be missing something though.

Regarding the need for Eval: since we are dealing with tree-shaped expressions, some type of trampoline like Eval (a Defer[_], to be precise) is needed if stack safety is to be achieved. Maybe scala.util.control.TailCalls.TailRec would be preferable, since we don't need all the laziness modes?

Regarding AndThen, it makes contramap stack-safe, and it's a thin layer over functions that doesn't even kick in unless a lot of composition is done, so I think it's actually more performant in this role than Kleisli because Kleisli leans on the trampoline. I'll come with benchmarks though.

Regarding whether it's common to blow the stack here, I come at it from the perspective that it shouldn't be possible at all in a Cats library. It should be as efficient as possible while remaining SOE-less. That's my interpretation of the "Efficiency" motivation on the Cats page.

Regarding the class question that's a good point. I tried turning the class into an opaque newtype a couple of months ago, but it wouldn't compile under 2.11. Now that 2.11 support has been dropped we could go that route, or do a transparent type alias.

Regarding trampolines, I realized the most suitable trampoline is probably a dedicated one: we only use Eval.{True, False}, defer, and flatMap, so rolling my own should be fairly easy and would have built-in unboxed booleans. We shall see on the benchmarks.

Sorry, I overlooked the benchmarks, those look good to me. My concerns are removed by those.

I don't think a custom trampoline can speed this up (maybe I'm wrong): since there are only two Booleans, boxing them is almost free, and Eval.True and Eval.False have already boxed them. Eval is faster than Scala's TailCalls and cats.free.Free (by a significant margin), so beating it seems tough to me.

I'm now seeing a combination of the trampoline and the AST, similar to a trampoline with a MonadReader capability. It's probably a tiny bit faster, it's more elegant, and it allows laws to be applied, so I'll try that next. Thank you for stimulating my thought process.
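
To make the "trampoline plus AST" idea concrete, here is a minimal sketch that is not the PR's code. It assumes a hypothetical, invariant `Pred` AST (`Lift`, `Not`, `And`, `Or`) and uses the standard library's `scala.util.control.TailCalls` instead of `Eval`, purely so the example has no dependencies. As noted above, `Eval` is reported to be faster, so this only illustrates the shape of the technique.

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

object TrampolinedPredicate {
  // Hypothetical AST for illustration; the node names are assumptions.
  sealed trait Pred[A]
  final case class Lift[A](f: A => Boolean) extends Pred[A]
  final case class Not[A](p: Pred[A]) extends Pred[A]
  final case class And[A](l: Pred[A], r: Pred[A]) extends Pred[A]
  final case class Or[A](l: Pred[A], r: Pred[A]) extends Pred[A]

  // Every recursive step is suspended with tailcall, so evaluation runs in a
  // loop on the heap rather than growing the JVM call stack.
  private def loop[A](p: Pred[A], a: A): TailRec[Boolean] = p match {
    case Lift(f) => done(f(a))
    case Not(q)  => tailcall(loop(q, a)).map(!_)
    case And(l, r) =>
      tailcall(loop(l, a)).flatMap(b => if (b) tailcall(loop(r, a)) else done(false))
    case Or(l, r) =>
      tailcall(loop(l, a)).flatMap(b => if (b) done(true) else tailcall(loop(r, a)))
  }

  /** Evaluate the predicate tree against a value without deep recursion. */
  def test[A](p: Pred[A], a: A): Boolean = loop(p, a).result
}
```

A chain built with `foldLeft`, for example 100,000 nested `And` nodes, can then be evaluated with `TrampolinedPredicate.test` without the recursion depth depending on the size of the tree.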

johnynek · 2020-07-28T19:16:05Z

core/src/main/scala/cats/collections/Predicate.scala

+  def negate: Predicate[A] = Predicate.wrap[A](p.andThen {
+    case e: Now[Boolean] => if (e.value) Eval.False else Eval.True
+    case e => e.flatMap { if (_) Eval.False else Eval.True }
+  })


if we had case classes you could defer the code here:

case class Negate[A](of: Predicate[A]) extends Predicate[A] { ... }

then def negate could apply the law negate(negate(p)) == p. Similar with keeping And, Or and Const nodes around.
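
A rough sketch of what that could look like, assuming a hypothetical `Pred` AST with `Lift`, `Negate` and `Const` nodes (these names and the `eval` helper are illustrative, not the PR's API). The `negate` smart constructor inspects the node it is called on and collapses a double negation instead of stacking wrappers.

```scala
object NegationLaw {
  // Hypothetical node types; names are assumptions made for this example.
  sealed trait Pred[A] {
    // Smart constructor: applies negate(negate(p)) == p structurally.
    def negate: Pred[A] = this match {
      case Negate(inner) => inner            // double negation cancels out
      case Const(b)      => Const[A](!b)     // constants are flipped directly
      case other         => Negate(other)
    }
  }
  final case class Lift[A](f: A => Boolean) extends Pred[A]
  final case class Negate[A](of: Pred[A]) extends Pred[A]
  final case class Const[A](value: Boolean) extends Pred[A]

  // Plain recursion is enough here because negate never nests Negate nodes.
  def eval[A](p: Pred[A], a: A): Boolean = p match {
    case Lift(f)   => f(a)
    case Const(b)  => b
    case Negate(q) => !eval(q, a)
  }
}
```

With this encoding, negating a predicate an even number of times returns the original node, so the tree does not grow with repeated negation.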

I'm not really in favor of adding an AST layer.

I feel like most of the time these kinds of optimisations will be spotted by user code that also knows these laws, and that most of the nodes will be opaque predicates (A => Boolean) that we can't really do a lot with besides double negation. So the laws would rarely be applied in practice. This also clashes with the type alias route, so simple predicates get an additional indirection, and it adds a bunch of boilerplate and complexity.

Side note: would we apply De Morgan's laws in either direction?

If the set is simplified it defeats the point

bbjubjub2494 · 2020-07-30T22:11:06Z

Ok, there we go. Kleisli is much nicer to work with. We have AST simplification, a more complete scalacheck generator, and the Boolean algebra. I haven't run final benchmarks yet but I think the performance profile is the same as 2 days ago.

codecov-commenter · 2020-07-30T22:15:41Z

Codecov Report

Merging #283 into master will decrease coverage by 0.01%.
The diff coverage is 94.11%.

@@            Coverage Diff             @@
##           master     #283      +/-   ##
==========================================
- Coverage   91.50%   91.49%   -0.02%     
==========================================
  Files          24       24              
  Lines        1625     1646      +21     
  Branches      214      207       -7     
==========================================
+ Hits         1487     1506      +19     
- Misses        138      140       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...re/src/main/scala/cats/collections/Predicate.scala | `85.71% <93.75%> (+8.44%)` | ⬆️ |
| core/src/main/scala/cats/collections/Set.scala | `88.79% <100.00%> (+0.09%)` | ⬆️ |
| ...ats/collections/arbitrary/ArbitraryPredicate.scala | `100.00% <100.00%> (ø)` | |
| core/src/main/scala/cats/collections/BitSet.scala | `97.14% <0.00%> (-0.22%)` | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 51d6479...94604fb.

johnynek

Looking really good. Just made a few minor points.

johnynek · 2020-07-31T00:02:56Z

core/src/main/scala/cats/collections/Predicate.scala

+   * build a set from a membership function.
+  */
+  def apply[A](p: A => Boolean): Predicate[A] = Lift {
+    Kleisli(a => Eval.now(p(a)))


I think a => if (p(a)) Eval.True else Eval.False may be marginally more efficient.

johnynek · 2020-07-31T00:04:38Z

core/src/main/scala/cats/collections/Predicate.scala

@@ -76,6 +160,15 @@ trait PredicateInstances {
    override def combine(l: Predicate[A], r: Predicate[A]): Predicate[A] = l union r
  }

+  implicit def predicateBool[A]: Bool[Predicate[A]] = new Bool[Predicate[A]] {


we usually give longer names in cats to avoid the name aliasing problem with implicits.

e.g. implicit def catsCollectionsPredicateBool[A]: Bool[Predicate[A]] = ...

johnynek · 2020-07-31T00:12:28Z

tests/src/test/scala/cats/collections/PredicateSpec.scala

@@ -22,6 +24,16 @@ class PredicateSpec extends CatsSuite {
    checkAll("ContravariantMonoidal[Predicate]", ContravariantMonoidalTests[Predicate].contravariantMonoidal[Int, Int, Int])
  }

+  {
+    implicit val eqForPredicateInt: Eq[Predicate[Int]] = new Eq[Predicate[Int]] {
+      val sample = -1 to 1 // need at least 2 elements to distinguish in-between values


why are just two enough? Why not (-100 to 100) or something?

Or, better yet, why not test with Predicate[Byte] and enumerate all 256 possibilities to check equality.

If the domain is just one element, then every predicate is effectively equivalent to either Empty or Everything depending on whether it accepts the one element.

I don't think we need a 201 or 256 cardinality domain because I tried manually introducing some defects in Predicate and it caught them immediately with the 3 elements.

That being said, it doesn't negatively affect the performance of the testsuite on my machine at all so I'd be ok with either.
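
The trade-off discussed here can be illustrated with plain functions. This is a sketch under assumptions: `agreeOn` and `allBytes` are hypothetical helpers, and ordinary `A => Boolean` functions stand in for `Predicate`. Agreement on a small sample is only a necessary condition for equality, whereas enumerating every `Byte` gives full extensional equality on that domain.

```scala
object SampledEquality {
  /** True when p and q give the same answer on every element of the sample. */
  def agreeOn[A](sample: Iterable[A])(p: A => Boolean, q: A => Boolean): Boolean =
    sample.forall(a => p(a) == q(a))

  // All 256 Byte values: agreement on this sample is exact equality for Byte predicates.
  val allBytes: Vector[Byte] =
    (Byte.MinValue.toInt to Byte.MaxValue.toInt).map(_.toByte).toVector

  def main(args: Array[String]): Unit = {
    // These two differ at 6, but the three-element sample cannot tell them apart.
    println(agreeOn(-1 to 1)((x: Int) => x > 5, (_: Int) => false))
    // Over the full Byte domain the difference is detected.
    println(agreeOn(allBytes)((b: Byte) => b > 5, (_: Byte) => false))
  }
}
```

A small sample can still catch many injected defects, as reported above, but it cannot distinguish predicates that only differ outside the sampled values.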

bbjubjub2494 · 2020-07-31T10:13:24Z

Benchmarks

tip of PR

[info] Benchmark                                                (n)   Mode  Cnt        Score       Error  Units
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel     10  thrpt   25  1441476.494 ± 16104.839  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel    100  thrpt   25   150259.432 ±   447.757  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel   1000  thrpt   25    15139.120 ±    36.946  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel  10000  thrpt   25     1509.519 ±    15.303  ops/s

same benchmark against master

[info] Benchmark                                               (n)   Mode  Cnt         Score       Error  Units
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel    10  thrpt   25  20347254.412 ± 17631.518  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel   100  thrpt   25   2097695.179 ± 16756.560  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel  1000  thrpt   25    188526.011 ±   276.839  ops/s

Woah that's not good is it? 13x loss, how did that happen?

bbjubjub2494 · 2020-07-31T11:34:44Z

benchmark without Kleisli

[info] Benchmark                                                (n)   Mode  Cnt        Score       Error  Units
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel     10  thrpt   25  4396333.243 ± 64479.117  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel    100  thrpt   25   520604.540 ±  2450.074  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel   1000  thrpt   25    48156.396 ±   125.049  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel  10000  thrpt   25     4519.274 ±     7.830  ops/s

still 5x

johnynek

one small comment

Can we remove the draft? I think this is close to ready for merge (even could be merged now, and my suggestion could be added before publishing).

johnynek · 2021-09-10T18:41:22Z

core/src/main/scala/cats/collections/Predicate.scala

+  def apply[A](p: A => Boolean): Predicate[A] = Lift {
+    Kleisli(a => if (p(a)) Eval.True else Eval.False)
+  }
+


what about a def fromKleisli[A](k: Kleisli[Eval, A, Boolean]): Predicate[A] = Lift(k)

bbjubjub2494 changed the title ~~[WIP] Make Predicate stack-safe using Eval~~ Make Predicate stack-safe using Eval Mar 5, 2020

johnynek reviewed Mar 5, 2020

bbjubjub2494 requested a review from johnynek March 7, 2020 22:58

bbjubjub2494 changed the title ~~Make Predicate stack-safe using Eval~~ WIP: Make Predicate stack-safe using Eval Mar 11, 2020

bbjubjub2494 changed the title ~~WIP: Make Predicate stack-safe using Eval~~ Make Predicate stack-safe using Eval Mar 31, 2020

bbjubjub2494 marked this pull request as draft April 8, 2020 21:24

bbjubjub2494 marked this pull request as ready for review April 8, 2020 21:24

bbjubjub2494 commented Jul 28, 2020

johnynek reviewed Jul 28, 2020

bbjubjub2494 marked this pull request as draft July 29, 2020 11:46

Louis Bettens added 7 commits July 30, 2020 23:10

Predicate: add benchmark

b00c4c9

Predicate: stack safety tests

fc6bacc

Make Predicate stack-safe using Eval

76069db

Avoid trivial benchmarks and safety tests

cadc03c

If the set is simplified it defeats the point

Use Predicate.empty where applicable

f71a3bd

Improve ArbitraryPredicate

47d5786

Add Bool[Predicate[A]]

d0df5ec

bbjubjub2494 marked this pull request as ready for review July 30, 2020 22:40

johnynek reviewed Jul 31, 2020

Louis Bettens added 3 commits July 31, 2020 10:23

Predicate: use Eval static instances

2e93fce

PredicateInstances: fix naming of implicits

77e0628

PredicateSpec: increase comparion sample size

94604fb

bbjubjub2494 marked this pull request as draft August 2, 2020 11:14

johnynek reviewed Sep 10, 2021

bbjubjub2494 closed this by deleting the head repository Aug 3, 2024
