Unsoundness in run_ctags #101

Open
@lwz23

Description

Hello, and thank you for your contributions to this project. I am testing our static analysis tool on Rust projects hosted on GitHub, and I noticed the following code:

fn run_ctags(opt: &Opt, files: &Vec<String>) -> Vec<ClassInfo> {
    let outputs = CmdCtags::call(&opt, &files).unwrap();
    let mut iters = Vec::new();
    for o in &outputs {
        let iter = if opt.validate_utf8 {
            str::from_utf8(&o.stdout).unwrap().lines()
        } else {
            unsafe { str::from_utf8_unchecked(&o.stdout).lines() }
        };
        iters.push(iter);
    }
    let parser = CtagsParser::parse_str(iters);
    let classes = parser.classes();
    classes
}

The issue is in the run_ctags function where it uses str::from_utf8_unchecked on external command output:

let iter = if opt.validate_utf8 {
    str::from_utf8(&o.stdout).unwrap().lines()
} else {
    unsafe { str::from_utf8_unchecked(&o.stdout).lines() }
};

Since o.stdout comes from an external program (ctags), there is no guarantee that the data is valid UTF-8. If opt.validate_utf8 is false (which a caller can set through the configuration, since it appears to be a pub field), the program calls str::from_utf8_unchecked on potentially invalid UTF-8 data, which is undefined behavior.
A valid path to reach this function: pub fn execute -> fn run_ctags
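
One possible direction for a fix, offered only as a sketch: decode the bytes without `unsafe` in both branches. The helper name `decode_stdout` is hypothetical and not part of this project. It assumes a lossy conversion is acceptable when validation is turned off, so invalid bytes become U+FFFD instead of reaching `str::from_utf8_unchecked`.

```rust
use std::borrow::Cow;
use std::str::Utf8Error;

/// Hypothetical helper (not in the project): decode raw ctags output
/// without `unsafe`.
///
/// - `validate_utf8 == true`: strict decoding, invalid bytes yield an error.
/// - `validate_utf8 == false`: lossy decoding, invalid bytes are replaced
///   with U+FFFD; valid input is still borrowed without copying.
fn decode_stdout(stdout: &[u8], validate_utf8: bool) -> Result<Cow<'_, str>, Utf8Error> {
    if validate_utf8 {
        std::str::from_utf8(stdout).map(Cow::Borrowed)
    } else {
        Ok(String::from_utf8_lossy(stdout))
    }
}
```

With something like this, run_ctags could keep the decoded values alive in a Vec and call .lines() on each one. The lossy path still scans the input, so it does not preserve the speed advantage of the unchecked call. If that cost matters, another option would be to keep the unchecked call but make the function that reaches it `unsafe` so the caller takes on the obligation explicitly.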

POC

fn main() {
    // Create a configuration with a repo that points to an existing directory
    let config = CocoConfig {
        repos: vec![RepoConfig {
            url: String::from("/tmp/test_repo"),  // Point to any directory
            languages: Some(vec![String::from("rust")]),
            // Other fields initialized as needed
        }],
        plugins: vec![PluginConfig {
            name: String::from("struct_analysis"),
            configs: vec![
                KeyValue {
                    key: String::from("ctags"),
                    value: String::from("/usr/bin/ctags"),  // Path to ctags binary
                }
            ]
        }]
    };
    
    // This will eventually call run_ctags, which uses str::from_utf8_unchecked
    // on the output of an external program (ctags) when validate_utf8 is false
    execute(config);
}
