Version 3.1.0

Latest
@rossarmstrong rossarmstrong released this 22 Apr 01:48
· 16 commits to main since this release
v3.1.0
5654bad

Enhancements

  • Updated the 'summary' function in 'summary.py' to include a precise return type hint: 'pd.DataFrame | None'.

    • Improved readability and type safety for the function.
  • Updated the return type hint for the 'wer' function in 'wer.py' from 'float' to 'float | np.float64 | None'.

    • Enhanced type accuracy and alignment with the function's behavior.
  • Continued work on type hinting improvements and resolving 'mypy' errors for better code quality and maintainability.

  • Optimized Levenshtein distance matrix initialization in calculations() by replacing a Python list-of-lists with a NumPy array accessed through a Cython typed memoryview (cdef int[:, :]). This reduces memory overhead and speeds up execution, most noticeably for large datasets or repeated function calls.

  • Refactored internal variable typing in calculations() for clarity and consistency:

    • Loop indices and size variables now use Py_ssize_t, matching Python's internal conventions.
    • Intermediate variables such as inserted_words, deleted_words, and substituted_words are now grouped and explicitly typed. This improves readability and static checking, reduces reliance on dynamic typing in performance-critical paths, and prepares the function for future optimizations.
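
For readers unfamiliar with the calculation these notes refer to, here is a minimal plain-Python sketch of a word-level Levenshtein matrix with an optional return type. It is not the project's Cython code: the names edit_counts and wer_sketch are hypothetical, the matrix is an ordinary list-of-lists rather than a typed memoryview, and returning None for an empty reference is an assumption made only to illustrate a 'float | None' hint.

```python
from __future__ import annotations


def edit_counts(reference: str, hypothesis: str) -> tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) at word level."""
    ref, hyp = reference.split(), hypothesis.split()
    m, n = len(ref), len(hyp)

    # Preallocate the full (m + 1) x (n + 1) distance matrix up front.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # deleting every reference word
    for j in range(n + 1):
        d[0][j] = j  # inserting every hypothesis word

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back from the bottom-right corner to classify each edit.
    substituted_words = deleted_words = inserted_words = 0
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] and d[i][j] == d[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            substituted_words += 1
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            deleted_words += 1
            i -= 1
        else:
            inserted_words += 1
            j -= 1
    return substituted_words, deleted_words, inserted_words


def wer_sketch(reference: str, hypothesis: str) -> float | None:
    """Word error rate, or None when the reference is empty (assumed behavior)."""
    n_ref = len(reference.split())
    if n_ref == 0:
        return None
    s, d, i = edit_counts(reference, hypothesis)
    return (s + d + i) / n_ref
```

Calling wer_sketch("the cat sat on the mat", "the cat sit on mat") counts one substitution and one deletion against six reference words. Callers of a function typed this way need to handle the None case before using the result as a number, which is the kind of check mypy enforces.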