Skip to content

Bit-packed arithmetic over Z/nZ for tiny n #2428

@fredrik-johansson

Description

@fredrik-johansson

Consider n < 256. We get 8x memory reduction using nmod8 instead of nmod, so it would be interesting to optimize the arithmetic in the nmod8 representation. Some low-hanging fruit here is to do batch modular reduction using SIMD operations.

For really small n we can do even better by bit packing. Obviously for n = 2 we want to use a single word to represent a chunk of 64 entries. Polynomial multiplication is easy on modern hardware since there are now dedicated CLMUL instructions

What about other tiny n?

We can store an entry of Z/3Z using 2 bits, so it's possible to pack 32 entries into a word. Packing as tightly as possible is not obviously optimal: using 3 or 4 bits or even 6 or 8 bits has the benefit that one can do some arithmetic in-place before reducing. When packing to 2 bits, the article https://doi.org/10.36045/bbms/1369316548 discusses a nonstandard encoding which makes the arithmetic operations easier to implement using bitwise operations. To figure out what works best, one probably needs some implementation experiments.

It should be possible to do some more small cases like n = 5 and 7 very similarly.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions