Regarding consistent naming and efficient cleanup logic of temporary directories / files #1637

@0xXA

Hi Team,

I noticed that several extractors, detectors, and internal functions create temporary files or directories under /tmp (for example, names like scalibr-*). At the moment, the cleanup logic appears to be either limited to EmbeddedFS extractors or implemented in a scattered and inconsistent manner.

I’ve raised this issue to propose a consistent approach to temporary directory naming, which would allow us to clean up resources more reliably once Scalibr finishes its execution.

Proposed rule 1: If a detector, enricher, or extractor needs to create a temporary directory, it must follow a common naming convention. For example, directory names should start with "osv-scalibr-".

Proposed rule 2 (abstraction of rule 1): We can implement a single function like:

func ScalibrTempDir(prefix string) (string, error) {
	return os.MkdirTemp("/tmp", "osv-scalibr-"+prefix+"-")
}

Then enforce its use across osv-scalibr.

In addition, we can add tracking code to ScalibrTempDir() and use that information at exit time to remove all temporary files and directories. This ensures that cleanup never runs into an infinite loop or the attacks related to one. For example, we can create one temporary root per Scalibr run and put everything under it:

/tmp/osv-scalibr-run-9f23a1/
 ├── extractor-a/
 ├── detector-b/
 └── enricher-c/

This way, we only need to track the single "osv-scalibr-run-" directory for each Scalibr run. Moreover, we can simply erase everything with:

defer os.RemoveAll(runTmp)

With this in place, we could safely and efficiently remove all Scalibr-related temporary files or directories from /tmp at exit.

Benefits:

  • Ensures consistency across extractors, detectors, and enrichers
  • Enables centralized and reliable cleanup logic
  • Reduces duplicated cleanup code (e.g., multiple defer blocks)
  • Improves performance by simplifying teardown
  • Lowers the burden on new plugin developers, who won’t need to implement custom cleanup logic

Overall, this should make the system easier to maintain, safer, and more developer-friendly. Of course, there may be more efficient ways to achieve the same result.

Best regards,
Yuvraj Saxena
