Correct and optimize setof/3 and bagof/3 by mthom · Pull Request #3211 · mthom/scryer-prolog

mthom · 2025-12-14T01:51:13Z

Use variant hashing and equivalence checking in Rust to replace the special-purpose constant-variable keysort, which produced incorrect solutions for #3151 and #3187, as @jjtolton showed in #3176. The new implementation is also much faster, by a factor of O(N) in the solution size (#3186).

A variable safety bug was exposed in the course of implementing these corrections, which this PR also corrects.

triska · 2025-12-14T05:48:58Z

There is also #1526, is this related to setof/3 too?

triska · 2025-12-14T05:59:28Z

src/machine/variant_hashing.rs

+    // to h1 produces h2 (ISO Prolog standard section 7.1.6.1).
+    // return false on success and true on failure like eq_test.
+    #[inline(always)]
+    pub fn is_non_variant(&self, h1: HeapCellValue, h2: HeapCellValue) -> bool {


Per https://stackoverflow.com/questions/26900414/difference-between-two-variant-implementations, variant/2 is:

variant(X, Y) :- copy_term(Y, YC), subsumes_term(X, YC), subsumes_term(YC, X).

Maybe this can be used instead?

variant_disjoint(X, Y) :- term_variables(X, XVars), append(XVars, _, Dict), term_variables(Y, YVars), append(YVars, _, Dict), X == Y.

For bagof/3 and setof/3, the first goals are to be executed prior to any comparisons.

This is the key idea: first, all solutions get the very same variables (this only works if the constraints are not copied around), and then (==)/2 is good enough. No need for any hashing and the like.

UWN

?- setof(t,(dif(X,a);dif(X,b)),_).
   dif:dif(X,a), dif:dif(X,b), unexpected.

dif#26

It does not make sense to retain constraints in setof/3.

UWN · 2025-12-14T12:38:46Z

A general comment on variant and the like. In this context here, there is no need for full variant functionality. In fact, all of this is not necessary. (==)/2 suffices. Note that after the findall/3, all elements (PairedSolutions) contain disjoint variables. And they should also not contain any attributed variables. Now, it is possible to set all variables equal, just according to the order from term_variables/2.
And thus, with a single sort, setof/3 eliminates all duplicates in one fell swoop whereas for bagof/3 one has to keep the duplicates.
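
A minimal sketch of this idea, not the code merged in this PR. The predicate names share_witness_variables/1 and distinct_witnesses/2 are hypothetical, and Pairs is assumed to be a list of Witness-Template pairs with pairwise disjoint, attribute-free variables, as findall/3 would produce. The variables of each witness are unified with a prefix of one shared list, in term_variables/2 order, after which variant witnesses are identical and sort/2 can remove the duplicates.

```prolog
:- use_module(library(lists)).

% share_witness_variables(+Pairs)
% Hypothetical helper: unify the variables of every witness with a
% prefix of one shared (partial) list, in term_variables/2 order.
% Assumes the pairs have pairwise disjoint variables.
share_witness_variables(Pairs) :-
    share_witness_variables_(Pairs, _Dict).

share_witness_variables_([], _).
share_witness_variables_([Witness-_|Pairs], Dict) :-
    term_variables(Witness, Vars),
    append(Vars, _, Dict),          % Vars becomes a prefix of Dict
    share_witness_variables_(Pairs, Dict).

% distinct_witnesses(+Pairs, -Witnesses)
% After sharing variables, variant witnesses are identical (==),
% so a single sort/2 removes the duplicates.
distinct_witnesses(Pairs, Witnesses) :-
    share_witness_variables(Pairs),
    witnesses(Pairs, Witnesses0),
    sort(Witnesses0, Witnesses).

witnesses([], []).
witnesses([W-_|Ps], [W|Ws]) :-
    witnesses(Ps, Ws).
```

For example, with Pairs = [f(A,B)-1, f(C,C)-2, f(D,E)-3], the first and third witnesses become identical while the second stays distinct, so distinct_witnesses/2 yields a list of two witnesses. Their relative order depends on the implementation's ordering of variables.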

mthom · 2025-12-31T06:20:29Z

> There is also #1526, is this related to setof/3 too?

No, it's because findall/3 copies solutions from a failure-driven loop. The resulting backtracking clears the stack of environment frames, so the cont-based mechanism of library(tabling) is left with nothing to latch onto.

mthom · 2025-12-31T06:25:54Z

> A general comment on variant and the like. In this context here, there is no need for full variant functionality. In fact, all of this is not necessary. (==)/2 suffices. Note that after the findall/3, all elements (PairedSolutions) contain disjoint variables. And they should also not contain any attributed variables. Now, it is possible to set all variables equal, just according to the order from term_variables/2. And thus, with a single sort, setof/3 eliminates all duplicates in one fell swoop whereas for bagof/3 one has to keep the duplicates.

It's unclear to me how bagof/3 should be revised in light of the above comments. setof/3 is bootstrapped from bagof/3. Does that mean bagof/3 need not rely on variant predicates either, or should it instead use variant/2 or variant_disjoint/2 from above?

UWN · 2025-12-31T12:13:34Z

In an ideal implementation, setof/3 does not use bagof/3 since bagof/3 does not remove duplicates early on. Duplicates should be removed as early as possible. That's one aspect.

UWN · 2025-12-31T12:15:32Z

Otherwise bagof/3 could use the very same approach. That is, after the findall/3 unify all related variables, and only then sort for the alternate solutions for bagof/3.

mthom · 2026-01-06T05:15:01Z

I can't find a way to replace is_non_variant in Prolog without making both predicates much, much slower. For performance's sake it seems unavoidable. Unfortunately variants don't correspond to any form of sort order. If they did it could be done in O(N) in pure Prolog.

Skgland · 2026-01-06T22:54:06Z

> I can't find a way to replace is_non_variant in Prolog without making both predicates much, much slower. For performance's sake it seems unavoidable. Unfortunately variants don't correspond to any form of sort order. If they did it could be done in O(N) in pure Prolog.

Would it work to map each variable to a term representing the position at which that variable first occurs in the term, so that we then have a ground term and can use the sort order of ground terms? Or what am I missing that would keep this from working?

That is, use a reserved functor that cannot occur in any of the terms, e.g. $pos/1, and for a variable at

the root use []
the A-th argument of the root use [A]
the B-th argument of the A-th argument of the root use [B, A]
the C-th argument of the B-th argument of the A-th argument of the root use [C, B, A]
etc.

For example, assuming the functor $pos doesn't already occur in the term:

X -> $pos([])
[] -> []
t(X, y(Y)) -> t($pos([0]), y($pos([0,1])))
t([0], y([1])) -> t([0], y([1]))
t(A, A, B) -> t($pos([0]), $pos([0]), $pos([2]))
t(A, B, B) -> t($pos([0]), $pos([1]), $pos([1]))
t(X, Y, Y) -> t($pos([0]), $pos([1]), $pos([1]))
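
A minimal sketch of the mapping proposed here, for illustration only. The predicate name ground_by_position/2 is hypothetical. The sketch assumes finite (acyclic) terms without attributed variables, that '$pos'/1 does not occur in the input, and zero-based argument indices as in the examples above. It works on a copy, so the original term is left unbound.

```prolog
% ground_by_position(+Term, -Ground)
% Hypothetical helper: Ground is a copy of Term in which every variable
% is replaced by '$pos'(Path), where Path lists the argument indices
% (innermost first, zero-based) of the variable's first occurrence.
ground_by_position(Term, Ground) :-
    copy_term(Term, Ground),
    bind_positions(Ground, []).

bind_positions(T, Path) :-
    (   var(T) ->
        T = '$pos'(Path)            % first occurrence: record its path
    ;   T = '$pos'(_) ->
        true                        % later occurrence, already bound
    ;   compound(T) ->
        T =.. [_|Args],
        bind_args(Args, 0, Path)
    ;   true                        % atomic terms stay as they are
    ).

bind_args([], _, _).
bind_args([A|As], I, Path) :-
    bind_positions(A, [I|Path]),
    I1 is I + 1,
    bind_args(As, I1, Path).
```

With this sketch, t(A, B, B) and t(X, Y, Y) map to the same ground term, so they compare equal with (==)/2 and sort next to each other.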

UWN · 2026-01-07T06:36:10Z

> Would it work to map each variable to a term representing the position in the term that variable first occurs at, so that we then have a ground term

This is overkill. It is only necessary that the terms contain the very same variables; then regular == and sorting work. If a special mechanism is developed for the sole purpose of bagof/3, testing it becomes practically impossible. Just for comparison, it took about a year to implement variant checking in SWI (there misleadingly called =@=), with many, many failed attempts. But at least it was possible to test it. When such a complex mechanism is accessible exclusively via bagof/3, the chances of identifying such errors diminish rapidly.

UWN · 2026-01-07T08:16:15Z

Hashing is also very challenging in the presence of rational trees (the default in Scryer). It means that even more cases of rational trees disclose their internal representation than those exposed by comparison.

ulrich@gupu:/opt/gupu/setof_correction_and_opt$ git status
On branch setof_correction_and_opt
Your branch is up to date with 'origin/setof_correction_and_opt'.

nothing to commit, working tree clean
ulrich@gupu:/opt/gupu/setof_correction_and_opt$ target/release/scryer-prolog -v
v0.10.0-60-gaa2d04cf
ulrich@gupu:/opt/gupu/setof_correction_and_opt$ target/release/scryer-prolog -f
?- setof(t,X^(-X=X,(Y = X ; Y = -X ; Y = - -X)),Ts).
   Y = -Y, Ts = "t"
;  Y = - - - - - - - - - - - - - - - - - - - - ..., Ts = "t", unexpected
;  Y = - - - - - - - - - - - - - - - - - - - - ..., Ts = "t", unexpected.
?- setof(Y,X^(-X=X,(Y = X ; Y = -X ; Y = - -X)),Ys).
   Ys = [- - - - - - - - - - - - - - - - - - - ...]. % fine, shows that comparison works

UWN · 2026-01-07T08:41:10Z

Or more explicitly,

?- bagof(t,X^(-X=X,(Y = X ; Y = -X ; Y = - -X)),[t,t,t]).
   false, unexpected.
   sto,
   Y = -Y.

UWN · 2026-01-08T08:43:35Z

sort_without_dedup/2: commonly called keysort/2

UWN · 2026-01-08T08:51:01Z

And sort_without_dedup/2 is incorrect, which can be seen:

?- bagof(Y,(Y=2;Y=1),Ys).
   Ys = [1,2], unexpected.
   Ys = [2,1].
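
A small sketch of why a stable sort on the witness matters here. The predicate name group_by_witness/2 is hypothetical, and this is not the code from this PR. It assumes a list of Witness-Value pairs whose variant witnesses are already identical. Because keysort/2 is stable, values under the same witness keep the order in which the solutions were found.

```prolog
% group_by_witness(+Pairs, -Groups)
% Hypothetical helper: Groups is a list of Witness-Values, one entry
% per distinct witness, with Values in original solution order.
group_by_witness(Pairs, Groups) :-
    keysort(Pairs, Sorted),         % stable: equal keys keep their order
    groups(Sorted, Groups).

groups([], []).
groups([W-V|Ps0], [W-[V|Vs]|Gs]) :-
    same_key(Ps0, W, Vs, Ps),
    groups(Ps, Gs).

% Collect the values of the leading pairs whose key is identical to W.
same_key([W1-V|Ps0], W, [V|Vs], Ps) :-
    W1 == W, !,
    same_key(Ps0, W, Vs, Ps).
same_key(Ps, _, [], Ps).
```

Modelling the query above as the pairs [w-2, w-1], which share the single witness w, group_by_witness/2 gives [w-[2,1]], matching the expected Ys = [2,1].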

UWN · 2026-01-08T09:55:37Z

Just one more remark on keysort/2: If an identical key is found, one may destructively point from one pair/key to the other (provided no choice point is in between), such that the subsequent processing gets faster. Similarly for sort/2.

triska · 2026-01-09T07:00:48Z

Cargo.toml

 serde_json = "1.0.122"
 serde = "1.0.204"
 parking_lot = "0.12.4"
+hashbrown = "0.16.1"


Is this now still needed?

UWN

Progress!

What I am still very suspicious about are these two sort/2 calls and this set_difference. We have term_variables and append for this! Witnesses0 is already unique, so why sort it? You will just introduce some evil implementation dependence.

UWN

(It might be a good idea to share more of the identical variable analysis in one common auxiliary predicate, but that is nit-picking.)

triska reviewed Dec 14, 2025

UWN reviewed Dec 14, 2025

mthom force-pushed the setof_correction_and_opt branch from aa2d04c to 9221a6a · January 8, 2026 08:30

triska reviewed Jan 9, 2026

UWN suggested changes Jan 9, 2026

mthom force-pushed the setof_correction_and_opt branch from d74e0ad to 695c09e · January 13, 2026 07:04

This was referenced Jan 13, 2026:

Improve term representation during comparison #3226 (open)

findall-variation to better support setof/3 #3227 (open)

UWN approved these changes Jan 14, 2026

mthom added 7 commits January 14, 2026 20:34

assert rational(3) as true in tests/builtins.pl

16dc10e

use branch numbers to detect branch subsumption

9089f9d

replace compare_term_test with parallel iterator, add is_not_variant

29cd805

add variant_hash and is_non_variant to fix setof/3, bagof/3

6284aa3

fix cargo fmt

e2bdf59

do not retain attributes in solutions of findall (#3020)

69a367d

find variant terms using just sort/2 and (==)/2

7022068

mthom added 5 commits January 14, 2026 20:39

remove variant_hashing.rs and related instructions

c2e1ded

cargo fmt fixes

1a8c4f9

replace sort_without_dedup/2 with keysort/2

a83f412

remove hashbrown crate

e446b13

remove unnecessary extra work in findall_with_existential/5

c79fd74

mthom force-pushed the setof_correction_and_opt branch from 695c09e to c79fd74 · January 15, 2026 04:43

mthom merged commit 453a88f into master Jan 15, 2026
15 checks passed


4 participants
