-
Notifications
You must be signed in to change notification settings - Fork 26
Description
Summary
The SLICOT routine NF01BS (QR factorization of Jacobian in compressed form) contains a bug where DLACPY is called with overlapping source and destination memory regions. This is undefined behavior per the LAPACK specification.
Bug Location
File: SLICOT-Reference/src/NF01BS.f
Line: 519
CALL DLACPY( 'Full', N, ST, J(LDJ*BSN+1), LDJ, J(IBSN), N )Where IBSN = N*BSN + 1 (set at line 518).
Root Cause
The routine reshapes matrix J from leading dimension LDJ (= M = BNBSM) to leading dimension N (= BNBSN + ST). When LDJ > N, the source and destination regions overlap in memory.
Memory Layout
Source: J(LDJ*BSN+1) with leading dimension LDJ
Destination: J(N*BSN+1) with leading dimension N
When LDJ > N:
Source offset = LDJ * BSN = larger
Dest offset = N * BSN = smaller
These regions overlap!
Trigger Conditions
BN > 1(multiple blocks - general case, not special case)BSN > 0(nonlinear part exists)BSM > BSN(more rows than columns per block,MMN > 0)ST > 0(linear part exists)
LAPACK DLACPY Specification
From LAPACK documentation:
DLACPY copies all or part of a two-dimensional matrix A to another matrix B.
DLACPY does not handle overlapping memory regions. It uses simple nested loops (effectively memcpy semantics), not memmove.
Reproduction
Prerequisites
- gfortran
- LAPACK/BLAS libraries
- Valgrind
Build
cd tests/fortran
gfortran -g -O0 -fcheck=all -Wall -c \
../../SLICOT-Reference/src/NF01BS.f \
../../SLICOT-Reference/src/MD03BX.f
gfortran -g -O0 -fcheck=all -Wall -o test_nf01bs_overlap \
test_nf01bs_overlap.f NF01BS.o MD03BX.o -llapack -lblasRun with Valgrind
valgrind --track-origins=yes ./test_nf01bs_overlapExpected Output
==19687== Source and destination overlap in memcpy(0x..., 0x..., 48)
==19687== at 0x...: __GI_memcpy (...)
==19687== by 0x...: dlacpy_ (liblapack.so)
==19687== by 0x...: nf01bs_ (NF01BS.f:479)
==19687== by 0x...: MAIN__ (test_nf01bs_overlap.f:96)
Test Parameters
The test uses parameters that reliably trigger the overlap:
| Parameter | Value | Description |
|---|---|---|
| BN | 2 | Number of blocks |
| BSM | 4 | Block rows |
| BSN | 2 | Block columns |
| ST | 2 | Linear part size |
| N | 6 | Total columns (BN*BSN + ST) |
| M | 8 | Total rows (BN*BSM) |
| LDJ | 8 | Leading dimension (= M) |
Memory offsets:
- Source:
LDJ*BSN + 1 = 17(1-based) - Destination:
N*BSN + 1 = 13(1-based) - Overlap because source starts at 17, destination at 13, with
LDJ=8 > N=6
Fix in C Translation
The C translation (src/NF/nf01bs.c) uses a local helper function with memmove:
static void local_dlacpy_safe(i32 m, i32 n, const f64 *a, i32 lda, f64 *b, i32 ldb) {
for (i32 j = 0; j < n; j++) {
memmove(&b[j * ldb], &a[j * lda], m * sizeof(f64));
}
}This correctly handles overlapping memory regions.
Impact
- Severity: Medium
- Behavior: Undefined (may produce incorrect results silently)
- Detection: Only visible with memory debuggers (Valgrind, ASAN)
- Affected versions: All SLICOT versions containing NF01BS
Why It May Not Manifest
- LAPACK's column-by-column copy order may accidentally work for some overlap patterns
- Specific parameter combinations may avoid the overlap entirely
- Resulting numerical errors may be small enough to go unnoticed
Recommendation
The original SLICOT Fortran code should be patched to use a custom copy routine that handles overlapping memory, similar to the C translation fix.