Skip to content

NF01BS DLACPY Overlapping Memory Bug #35

@jamestjsp

Description

@jamestjsp

Summary

The SLICOT routine NF01BS (QR factorization of Jacobian in compressed form) contains a bug where DLACPY is called with overlapping source and destination memory regions. This is undefined behavior per the LAPACK specification.

Bug Location

File: SLICOT-Reference/src/NF01BS.f
Line: 519

CALL DLACPY( 'Full', N, ST, J(LDJ*BSN+1), LDJ, J(IBSN), N )

Where IBSN = N*BSN + 1 (set at line 518).

Root Cause

The routine reshapes matrix J from leading dimension LDJ (= M = BNBSM) to leading dimension N (= BNBSN + ST). When LDJ > N, the source and destination regions overlap in memory.

Memory Layout

Source:      J(LDJ*BSN+1) with leading dimension LDJ
Destination: J(N*BSN+1)   with leading dimension N

When LDJ > N:
  Source offset  = LDJ * BSN = larger
  Dest offset    = N * BSN   = smaller

  These regions overlap!

Trigger Conditions

  1. BN > 1 (multiple blocks - general case, not special case)
  2. BSN > 0 (nonlinear part exists)
  3. BSM > BSN (more rows than columns per block, MMN > 0)
  4. ST > 0 (linear part exists)

LAPACK DLACPY Specification

From LAPACK documentation:

DLACPY copies all or part of a two-dimensional matrix A to another matrix B.

DLACPY does not handle overlapping memory regions. It uses simple nested loops (effectively memcpy semantics), not memmove.

Reproduction

Prerequisites

  • gfortran
  • LAPACK/BLAS libraries
  • Valgrind

Build

cd tests/fortran
gfortran -g -O0 -fcheck=all -Wall -c \
    ../../SLICOT-Reference/src/NF01BS.f \
    ../../SLICOT-Reference/src/MD03BX.f
gfortran -g -O0 -fcheck=all -Wall -o test_nf01bs_overlap \
    test_nf01bs_overlap.f NF01BS.o MD03BX.o -llapack -lblas

Run with Valgrind

valgrind --track-origins=yes ./test_nf01bs_overlap

Expected Output

==19687== Source and destination overlap in memcpy(0x..., 0x..., 48)
==19687==    at 0x...: __GI_memcpy (...)
==19687==    by 0x...: dlacpy_ (liblapack.so)
==19687==    by 0x...: nf01bs_ (NF01BS.f:479)
==19687==    by 0x...: MAIN__ (test_nf01bs_overlap.f:96)

Test Parameters

The test uses parameters that reliably trigger the overlap:

Parameter Value Description
BN 2 Number of blocks
BSM 4 Block rows
BSN 2 Block columns
ST 2 Linear part size
N 6 Total columns (BN*BSN + ST)
M 8 Total rows (BN*BSM)
LDJ 8 Leading dimension (= M)

Memory offsets:

  • Source: LDJ*BSN + 1 = 17 (1-based)
  • Destination: N*BSN + 1 = 13 (1-based)
  • Overlap because source starts at 17, destination at 13, with LDJ=8 > N=6

Fix in C Translation

The C translation (src/NF/nf01bs.c) uses a local helper function with memmove:

static void local_dlacpy_safe(i32 m, i32 n, const f64 *a, i32 lda, f64 *b, i32 ldb) {
    for (i32 j = 0; j < n; j++) {
        memmove(&b[j * ldb], &a[j * lda], m * sizeof(f64));
    }
}

This correctly handles overlapping memory regions.

Impact

  • Severity: Medium
  • Behavior: Undefined (may produce incorrect results silently)
  • Detection: Only visible with memory debuggers (Valgrind, ASAN)
  • Affected versions: All SLICOT versions containing NF01BS

Why It May Not Manifest

  1. LAPACK's column-by-column copy order may accidentally work for some overlap patterns
  2. Specific parameter combinations may avoid the overlap entirely
  3. Resulting numerical errors may be small enough to go unnoticed

Recommendation

The original SLICOT Fortran code should be patched to use a custom copy routine that handles overlapping memory, similar to the C translation fix.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions