Enhancements
-
Updated the 'summary' function in 'summary.py' to include a precise return type hint: 'pd.DataFrame | None'.
- Improved readability and type safety for the function.
-
Updated the return type hint for the 'wer' function in 'wer.py' from 'float' to 'float | np.float64 | None'.
- The hint now matches the values the function can actually return.
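As a rough illustration of why a 'None' member in a return hint matters, here is a minimal sketch. The names 'word_error_rate' and 'format_rate' are hypothetical, the empty-reference rule is an assumption rather than the documented behavior of 'wer', the error count is deliberately crude, and the 'np.float64' member is left out so the sketch needs only the standard library.

```python
from __future__ import annotations


def word_error_rate(reference: str, hypothesis: str) -> float | None:
    """Hypothetical stand-in for a rate function that may have no result."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        # Assumption: the rate is undefined for an empty reference.
        return None
    # Crude error count (position-wise mismatches plus length difference),
    # used only to keep the sketch short; it is not a true edit distance.
    mismatches = sum(r != h for r, h in zip(ref_words, hyp_words))
    errors = mismatches + abs(len(ref_words) - len(hyp_words))
    return errors / len(ref_words)


def format_rate(rate: float | None) -> str:
    """Callers have to narrow the optional value before using it as a float."""
    if rate is None:
        return "n/a"
    return f"{rate:.2%}"
```

With the 'None' member declared, a type checker such as 'mypy' can flag call sites that use the result as a number without first checking for 'None'.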
-
Continued adding type hints and resolving 'mypy' errors for better code quality and maintainability.
-
Optimized Levenshtein distance matrix initialization in calculations() by replacing a Python list-of-lists with a NumPy array accessed through a Cython typed memoryview (cdef int[:, :]). This reduces memory overhead and speeds up execution, most noticeably for large datasets or repeated function calls.
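For readers unfamiliar with the matrix in question, the sketch below shows the same initialization in plain Python, first with a list-of-lists and then with one contiguous typed buffer. It is only an analogy: both function names are hypothetical, the body of calculations() is not reproduced in these notes, and the standard-library 'array' module stands in for the NumPy array and Cython memoryview.

```python
from __future__ import annotations

from array import array


def distance_list_of_lists(ref: list[str], hyp: list[str]) -> int:
    """Word-level Levenshtein distance with a list-of-lists matrix."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # Each row is a separate list and each cell a boxed Python int.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete the first i reference words
    for j in range(cols):
        d[0][j] = j  # insert the first j hypothesis words
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[rows - 1][cols - 1]


def distance_flat_buffer(ref: list[str], hyp: list[str]) -> int:
    """Same recurrence over one contiguous buffer of C ints (row-major)."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    d = array("i", [0]) * (rows * cols)  # single allocation, zero-filled
    for i in range(rows):
        d[i * cols] = i
    for j in range(cols):
        d[j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i * cols + j] = min(d[(i - 1) * cols + j] + 1,
                                  d[i * cols + j - 1] + 1,
                                  d[(i - 1) * cols + j - 1] + cost)
    return d[rows * cols - 1]
```

In plain Python the flat buffer mainly saves memory. The speed benefit described above comes from Cython compiling typed memoryview indexing down to direct C array access.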
-
Refactored internal variable typing in calculations() for clarity and consistency:
- Loop indices and size variables now use Py_ssize_t, the signed size type that CPython itself uses for lengths and indices.
- Intermediate variables such as inserted_words, deleted_words, and substituted_words are now declared together with explicit types, which improves readability and static checking. This also reduces reliance on dynamic typing in performance-critical paths and prepares the function for future optimizations.