Add byte-array-to-string conversion to system module #14810

@snej

Summary

The standard library could use a function to convert an openarray[char] or openarray[byte] to a string.

Description

Converting between byte arrays and strings is pretty common, in my experience. It's possible to cast from a seq[char] or seq[byte] to a string using cast[string](...), and the compiler allows you to do the same with an openarray, but at runtime you get a garbage string object. (It appears that the initial bytes of the array get reinterpreted as the string's length and address.) Here's an example.

This seems rather dangerous: an idiom that works in one case fails in another, but subtly enough that it might be overlooked at first. I've observed either a several-megabyte string of garbage or an out-of-memory exception (when Nim tries to allocate a copy of a terabytes-long string). I'm guessing that mutating the string might lead to memory corruption ... and it might be possible to craft a byte array that overwrites specific memory addresses, making this a potential security exploit.

I realize the cast operator is clearly documented as dangerous! The problem here isn't that using it causes undefined behavior, but rather that (a) it's the only simple way to convert an array to a string, and worse, (b) it works for some types of arrays but not others.

Proposal

Add some toString(...) procs to the system module:

proc toString(chars: openarray[char]): string
proc toString(bytes: openarray[byte]): string

($ would be a better name, but that operator is already defined in dollars.nim, and has a different purpose.)

Alternatives

The only way I've found is to allocate a string of the right length and copy the bytes into it:

proc toString(bytes: openarray[byte]): string =
  result = newString(bytes.len)
  # Guard: indexing [0] on an empty string or array raises an IndexDefect.
  if bytes.len > 0:
    copyMem(result[0].addr, bytes[0].unsafeAddr, bytes.len)

It's a fine solution, but we shouldn't make developers use highly unsafe language features just to perform a common operation! Instead, that should be the implementation of the standard library function.
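
For comparison, here is a minimal sketch of how both proposed overloads could be written without copyMem or unsafeAddr, copying one element at a time. The name toString comes from the proposal above; these procs are hypothetical and not part of the standard library, and the loop is likely slower than a bulk copy for large inputs.

```nim
# Hypothetical sketch of the proposed procs; not part of the standard library.
proc toString(chars: openArray[char]): string =
  ## Copies `chars` into a new string, one element at a time.
  result = newString(chars.len)
  for i, c in chars:
    result[i] = c

proc toString(bytes: openArray[byte]): string =
  ## Copies `bytes` into a new string, converting each byte to a char.
  result = newString(bytes.len)
  for i, b in bytes:
    result[i] = char(b)

when isMainModule:
  let bytes = [byte 72, 105]                 # ASCII codes for 'H' and 'i'
  doAssert toString(bytes) == "Hi"
  doAssert toString(['o', 'k']) == "ok"
  doAssert toString(newSeq[byte]()) == ""    # empty input needs no special case
```

Because the loop body never runs for an empty input, this version needs no length guard, and it works the same way for arrays, seqs, and slices passed as openArray.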
