Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Computing Partitions(...).cardinality() with set min_length= can be significantly improved by complementation and using max_length= instead #38897

Open
1 task done
maxale opened this issue Oct 31, 2024 · 8 comments · May be fixed by #38904

Comments

@maxale
Copy link
Contributor

maxale commented Oct 31, 2024

Problem Description

Check this out:

sage: %time Partitions(50,min_length=10).cardinality()
CPU times: user 1min 46s, sys: 14 ms, total: 1min 46s
Wall time: 1min 46s
158414

sage: %time number_of_partitions(50) - Partitions(50,max_length=9).cardinality()
CPU times: user 6.98 s, sys: 5 ms, total: 6.98 s
Wall time: 6.98 s
158414

Proposed Solution

As the code example shows, computing cardinality with min_length= appears to be much slower than that with max_length=. Hence, it'd be beneficial to convert one into the other by complementation (in the set of all partitions) as illustrated by the example.

Alternatives Considered

I believe the cardinalities in both cases can be computed via dynamic programming, but I did not check if Sage uses it wisely and why in one case it's much slower than in the other.

Additional Information

No response

Is there an existing issue for this?

  • I have searched the existing issues for a bug report that matches the one I want to file, without success.
@maxale maxale changed the title Computing Permutations(...).cardinality() with set min_length= can be significantly improved by complementation and using max_length= instead Computing Partitions(...).cardinality() with set min_length= can be significantly improved by complementation and using max_length= instead Oct 31, 2024
@mantepse
Copy link
Collaborator

If I made no mistake, Partitions(n, max_length=m).cardinality() and Partitions(n, min_length=m).cardinality() actually both compute the number by generating the partitions. There is no dedicated class for partitions of given maximal length. The reason for that is, very likely, that there are too many combinations of possible parameters, and nobody cared enough for this one yet. I do think that we should have it, because of $GL_n$.

We do have a dedicated cardinality function for partitions of given length, though.

sage: n=50; m=10
sage: %time sum(Partitions(n, length=i).cardinality() for i in range(m+1))
CPU times: user 114 ms, sys: 88.8 ms, total: 202 ms
Wall time: 221 ms
62740
sage: %time Partitions(n, max_length=m).cardinality()
CPU times: user 3.32 s, sys: 0 ns, total: 3.32 s
Wall time: 3.32 s
62740

@mantepse
Copy link
Collaborator

PS: I appreciate your reports a lot!

@maxale
Copy link
Contributor Author

maxale commented Oct 31, 2024

It's unfortunate that .cardinality() in these cases are computed via an actual generation of partitions. I think there should be an efficient implementation for "base" cases with max_length= or max_part= or both specified.
Then the cardinality for length=k can be computed as the difference of those for max_length=k and max_length=k-1, the cardinality of min_length=k can be computed via complementation, and same for min_part=k. This will cover most useful cases in my view.

On a related note, I think Sage should issue a warning (at verbosity = 0 or 1 level) whenever it knowingly uses a non-efficient algorithm (such as enumeration via generation). I use to believe that whenever a certain functionality exists it implements the best available algorithm. Knowing when this is not the case will greatly help to find bottlenecks and avoid using slow algorithms when performance is an issue.

@mantepse
Copy link
Collaborator

mantepse commented Nov 1, 2024

Any combinations of any subset of length, min_length, max_length, min_part should now be reasonably fast. I did stick to the current scheme (i.e., relating all computations to those of given size and length), that looked easier.

On a related note, I think Sage should issue a warning (at verbosity = 0 or 1 level) whenever it knowingly uses a non-efficient algorithm (such as enumeration via generation). I use to believe that whenever a certain functionality exists it implements the best available algorithm. Knowing when this is not the case will greatly help to find bottlenecks and avoid using slow algorithms when performance is an issue.

I think verbosity is hard to maintain. However, I use %prun and %lprun a lot, do you know about these? In the case at hand:

sage: %prun -s cumulative Partitions(30, min_length=10).cardinality()
         18655 function calls (18645 primitive calls) in 1.355 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      2/1    0.001    0.001    1.354    1.354 {built-in method builtins.exec}
        1    0.000    0.000    1.349    1.349 <string>:1(<module>)
        1    0.003    0.003    1.349    1.349 finite_enumerated_sets.py:97(_cardinality_from_iterator)
     2545    1.321    0.001    1.346    0.001 lists.py:225(_element_iter)
     2544    0.006    0.000    0.025    0.000 lists.py:281(_element_constructor_default)
     2544    0.005    0.000    0.018    0.000 partition.py:518(__init__)
     2544    0.008    0.000    0.012    0.000 combinat.py:1526(__init__)
     2544    0.004    0.000    0.004    0.000 combinat.py:1111(__init__)
        1    0.000    0.000    0.003    0.003 partition.py:6114(__classcall_private__)
...

@maxale
Copy link
Contributor Author

maxale commented Nov 17, 2024

@mantepse: thank you for the improvements! A nice testcase is the Sage code that I gave https://oeis.org/A333148 - how fast can it compute terms for n, say, around 1000?

@mantepse
Copy link
Collaborator

mantepse commented Nov 17, 2024

Apparently, I missed a case, because Partitions(m, max_part=l, length=k).cardinality() is still iterating over all partitions.

@mantepse
Copy link
Collaborator

mantepse commented Nov 19, 2024

@maxale, I have now coded that, too. It is not completely satisfying, because if you call Partitions(args).cardinality(), and awful amount of time is spent in initialising the iterator facility.

With some effort, one can bypass the problem as follows:

from sage.combinat.partition import number_of_partitions_length_max_part
def P(n, a, b, k, m):
    """
    Return the number of partitions of n with parts between a and b and length between k and m.

        sage: def D(n):
        ....:     s1 = number_of_partitions(n)
        ....:     s2 = sum(P(m, 0, l, k, k)
        ....:              * P(n-m-l^2, 0, Infinity, k+2*l, Infinity)
        ....:              for l in range(1, (n+1).isqrt())
        ....:              for m in range((n-l^2-2*l)*l//(l+1)+1)
        ....:              for k in range(ceil(m/l), min(m, n-m-l^2-2*l)+1))
        ....:     return s1 + s2
    """
    n = ZZ(n)
    if a <= 0: a = 1
    if b > n: b = n
    if k < 0: k = 0
    if m > n: m = n
    # print(n,a,b,k,m)
    if not k and m == n and a == 1:
        # unrestricted length, parts smaller b
        return ZZ.sum(number_of_partitions_length(n, i) for i in range(b + 1))
    
    return ZZ.sum(number_of_partitions_length_max_part(n - (a - 1)*l, l, b - a + 1)
                  for l in range(k, m+1))

Doing so, D(300) takes 10 seconds on my computer, D(350) takes 15 seconds.

@maxale
Copy link
Contributor Author

maxale commented Nov 19, 2024

@mantepse: Thanks for checking. My hope is to have an efficient and more or less universal .cardinality() method that will allow to hide all technicalities. As for the slowness with an iterator construction - can it be done only when an actual iteration starts? Always creating an iterator is a bad design decision to my taste.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants