Is your feature request related to a problem? Please describe.
I am benchmarking FLINT's modular matrix multiplication routines on the French national supercomputer Jean-Zay, and I get poor performance from nmod_mat_mul_blas when OpenBLAS is the BLAS backend.
I would like to test the library with Intel's own BLAS implementation, Intel MKL (now distributed as part of the Intel oneAPI toolkits).
I found no documentation explaining how to configure FLINT to use such a BLAS implementation.
Describe the solution you'd like
A clear and concise description, in the installation guide, of how to configure a vendor-specific BLAS implementation such as Intel MKL.