Dilemma with saving scan results #2228

lamppu · 2025-03-11T09:09:12Z

With the DOS scanner plugin, we have a problem that flew under the radar until the issues connected to scan summaries were added to the issues endpoint, which highlighted it for us.

The problem is that if something goes wrong while the DOS scanner scans a provenance in the first scan, a scan result still gets saved for that provenance. Then, even if the scan succeeds the next time around, the new results won't be saved; instead, the old results will be linked to the new run. If I'm understanding this correctly, this is due to this logic of saving results. It of course makes sense not to save duplicates of the same result, but here it is a bit problematic.
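To make the behavior concrete, here is a minimal sketch of how I understand it. It is a simplified Python model, not the actual code: the `ScanResult` fields, the set of compared properties, and the `save` helper are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    # Hypothetical, simplified model of a stored scan result.
    provenance: str
    scanner_name: str
    scanner_config: str
    issues: tuple = ()


def matches_basic_properties(a: ScanResult, b: ScanResult) -> bool:
    # Assumption: only provenance and scanner details are compared, not issues.
    return (a.provenance, a.scanner_name, a.scanner_config) == (
        b.provenance, b.scanner_name, b.scanner_config
    )


def save(stored: list, new: ScanResult) -> ScanResult:
    # Reuse a matching stored result instead of saving a duplicate.
    for existing in stored:
        if matches_basic_properties(existing, new):
            return existing
    stored.append(new)
    return new


stored = []
failed = ScanResult("prov-a", "DOS", "", issues=("scan failed",))
succeeded = ScanResult("prov-a", "DOS", "", issues=())

linked_first = save(stored, failed)      # first run: the failed result is stored
linked_second = save(stored, succeeded)  # second run: the old result is reused
```

In this model, the second run ends up linked to the failed result from the first run, and the successful result is never stored.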

I don't know what the ideal approach here would be. Could the findExistingScanResult function somehow also compare the issues of the scan or something?

lamppu · 2025-03-11T09:10:29Z

For the time being, if I'm understanding this correctly, I think there's an ugly workaround we could use: add a string representation of the DOS scanner plugin configuration here and, when needed, slightly change the configuration (like a poll interval or something) so that matchesBasicScanResultProperties returns false and the new results are saved. However, this would then save new results for all provenances, which is not optimal either.
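Roughly, the workaround would behave like the sketch below. This is a simplified Python model under assumptions: the `pollInterval` option name, the `config_string` rendering, and the lookup key are all hypothetical and only illustrate the effect.

```python
def config_string(options: dict) -> str:
    # Hypothetical stable string representation of the plugin configuration.
    return ",".join(f"{key}={value}" for key, value in sorted(options.items()))


def needs_new_result(stored_keys: set, provenance: str, scanner: str, config: str) -> bool:
    # Assumption: a stored result only matches if the configuration string is equal.
    return (provenance, scanner, config) not in stored_keys


provenances = ["prov-a", "prov-b"]
old_config = config_string({"pollInterval": 5})
stored_keys = {(p, "DOS", old_config) for p in provenances}

# Same configuration: every provenance still matches its stored result.
unchanged = [p for p in provenances if needs_new_result(stored_keys, p, "DOS", old_config)]

# Slightly changed configuration: no stored result matches any more.
bumped_config = config_string({"pollInterval": 6})
resaved = [p for p in provenances if needs_new_result(stored_keys, p, "DOS", bumped_config)]
```

Here `resaved` contains every provenance, including those whose earlier scan was fine, which is the downside mentioned above.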

sschuberth · 2025-03-11T16:13:25Z

> could the findExistingScanResult function somehow also compare the issues of the scan or something?

That sounds like the obviously correct solution to me, esp. as the function's documentation says "Note that results with additional data are not deduplicated". While this leaves it a bit unclear what such "additional data" is, I guess you could count issues as part of it.
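As a rough sketch of the idea (a simplified Python model, not the actual function; the `ScanResult` fields and the `save` helper are assumptions for illustration), the lookup could treat a result as a duplicate only if its issues match as well:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    # Hypothetical, simplified model of a stored scan result.
    provenance: str
    scanner_name: str
    issues: frozenset = frozenset()


def find_existing_scan_result(stored: list, new: ScanResult):
    # A stored result only counts as a duplicate if the issues are equal, too.
    for existing in stored:
        same_basics = (existing.provenance, existing.scanner_name) == (
            new.provenance, new.scanner_name
        )
        if same_basics and existing.issues == new.issues:
            return existing
    return None


def save(stored: list, new: ScanResult) -> ScanResult:
    existing = find_existing_scan_result(stored, new)
    if existing is not None:
        return existing
    stored.append(new)
    return new


stored = []
failed = ScanResult("prov-a", "DOS", frozenset({"scan failed"}))
succeeded = ScanResult("prov-a", "DOS")

save(stored, failed)
linked = save(stored, succeeded)                # differs in issues, so it is stored
linked_again = save(stored, ScanResult("prov-a", "DOS"))  # true duplicate, reused
```

With this change, the successful result is stored next to the failed one, so two results exist for the same provenance afterwards.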

Any thoughts @oheger-bosch?

oheger-bosch · 2025-03-12T06:12:45Z

Not sure, because I am not so deep into the topic, but isn't this a general problem with scan result storages in ORT?

We can of course extend findExistingScanResult to take more properties into account. This would lead to multiple scan results being stored for a provenance. How is this then handled when reading the results? It probably does not make sense to associate all existing results with the new run.

Or do you ignore already existing scan results for the DOS scanner and always trigger a scan, analogously to what we do with FossID? Then the situation is different.
