Description
Hi Team,
I noticed that several extractors, detectors, and internal functions create temporary files or directories under /tmp (for example, names like scalibr-*). At the moment, the cleanup logic appears to be either limited to EmbeddedFS extractors or implemented in a scattered and inconsistent manner.
I’ve raised this issue to propose a consistent approach to temporary directory naming, which would allow us to clean up resources more reliably once Scalibr finishes its execution.
Proposed rule 1: If a detector, enricher, or extractor needs to create a temporary directory, it must follow a common naming convention. For example, directory names should start with "osv-scalibr-".
Proposed rule 2 (abstraction of rule 1): We can implement a single function like:
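// ScalibrTempDir creates a uniquely named temporary directory under /tmp
// whose name starts with the common "osv-scalibr-" prefix.
func ScalibrTempDir(prefix string) (string, error) {
	return os.MkdirTemp("/tmp", "osv-scalibr-"+prefix+"-")
}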
Then enforce its use across osv-scalibr.
In addition, we can add tracking code to ScalibrTempDir() and use the recorded paths at exit time to remove all temporary files and directories. This ensures we never run into an infinite loop or attacks related to one. For example, we can create one temp root per Scalibr run and put everything under it:
/tmp/osv-scalibr-run-9f23a1/
├── extractor-a/
├── detector-b/
└── enricher-c/
This way, we only need to track the "osv-scalibr-run-" directory for each Scalibr run. Moreover, we can simply erase everything with:
defer os.RemoveAll(runTmp)
With this in place, we could safely and efficiently remove all Scalibr-related temporary files or directories from /tmp at exit.
Benefits:
- Ensures consistency across extractors, detectors, and enrichers
- Enables centralized and reliable cleanup logic
- Reduces duplicated cleanup code (e.g., multiple defer blocks)
- Improves performance by simplifying teardown
- Lowers the burden on new plugin developers, who won’t need to implement custom cleanup logic
Overall, this should make the system easier to maintain, safer, and more developer-friendly. Of course, there may be more efficient ways to achieve the same result.
Best regards,
Yuvraj Saxena