
Uninitialized stack variable during string initialization in a function that returns a string of the parametrized length #1436

@pawosm-arm

Consider the following code:

program string_init

  print *, fun(128)

contains

  character(n) function fun(n)
    if (n .gt. 0) then
      fun = '?'
    else
      fun = ''
    end if
  end function

end

This will result in the following IR snippet:

define internal void @string_init_fun(ptr %fun, ptr %n, i64 %.U0001.arg, ptr %.S0000) "target-features"="+neon,+v8a" !dbg !29 {
L.entry:
        %.U0001.addr = alloca i64, align 8
        %n_318 = alloca i32, align 4
        %fun$len_323 = alloca i64, align 8

        store i64 %.U0001.arg, ptr %.U0001.addr, align 8
        %0 = load i32, ptr %n_318, align 4
        ...

The n_318 variable is allocated on the stack and never initialized, yet it is loaded into %0 and used in the subsequent logic, with disastrous effect. The result is effectively random, and the problem went unnoticed for a very long time, until the code started to fail regularly on Ubuntu 22, where uninitialized stack variables seem to be populated with 0xff bytes. As a result, a different execution path is now taken in the f90_str_cpy1() function, which implements the fun = '?' assignment: memset() is now called, and that call segfaults.

One could argue about whether this is a genuine compiler error, since a case can be made that the Fortran code itself is at fault. A more elegant version might look like this:

program string_init
  implicit none
  integer, parameter :: n = 128

  print *, fun(n)

contains

  character(n) function fun(n)
    implicit none
    integer :: n

    if (n .gt. 0) then
      fun = '?'
    end if
  end function

end

In this light, it may appear that in the original program the string length was indeed never known. But this raises a question about the visibility of n, because gfortran manages to propagate n back to the type parameter of the result and emits code that does not fail.

EDIT: another possible rewrite of the original code that results in correct IR:

program string_init

  print *, fun(128)

contains

  function fun(n)
    character(n) :: fun

    if (n .gt. 0) then
      fun = '?'
    else
      fun = ''
    end if
  end function

end

This version expresses the original intent of the author more closely. Yet there is no rule in Fortran that forces a programmer to be this specific.
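
For comparing compilers, it may help to inspect the length of the result rather than just printing it. The following is a minimal sketch under that assumption: the program name string_len_check and the two checks are hypothetical additions, while the function itself is copied unchanged from the original reproducer. Whether the extra intrinsic calls change the generated IR for the affected compiler has not been verified.

```fortran
! Hypothetical checker, not part of the original report.
program string_len_check

  ! Implicit typing is left on, exactly as in the original reproducer.
  ! len() may be answered without evaluating the function at all,
  ! while len_trim() needs the actual character value.
  print *, len(fun(128))       ! declared length of the function result
  print *, len_trim(fun(128))  ! length once trailing blanks are dropped

contains

  ! Same function as in the original reproducer.
  character(n) function fun(n)
    if (n .gt. 0) then
      fun = '?'
    else
      fun = ''
    end if
  end function

end
```

With a compiler that resolves n in the function prefix to the dummy argument, the expected values are 128 and 1. A crash or a different length would point to the uninitialized n described above.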
