
False positive buildings in the data set for Sweden - Gothenburg example with 6913 examples #94

Open
stefankinell opened this issue Feb 21, 2024 · 5 comments

Comments

@stefankinell

In the dataset for Sweden I find so many false positives in the Microsoft data that it becomes difficult to use without manual inspection. As a test, I ran a comparison for the city of Gothenburg, comparing the Microsoft buildings with the city's official building data. I added a 10-meter buffer around the real buildings from the city data and kept the buildings from the MLBuildings footprints that did not touch these buffers. This left me with 6913 "buildings". Some of them are probably buildings, but the vast majority are false positives. I have attached a zipped .gpkg file with my false positives to this post as reference, and some examples are presented with images below.
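For anyone who wants to repeat the check elsewhere, the buffer comparison can be sketched roughly as below. This is a minimal pure-Python illustration, not the exact workflow used for the numbers above: it assumes footprints are simple polygons without holes, given as lists of (x, y) vertices in a projected coordinate system with metre units, and the function names (`polygon_distance`, `unmatched_footprints`) are hypothetical. Keeping a footprint only when its distance to every official building exceeds 10 m is equivalent to keeping footprints that do not touch a 10 m buffer. For real data a GIS tool or library with a spatial index would be the practical choice.

```python
from math import hypot


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1, p2, q1, q2):
    """True if the segments properly cross (touching is caught by distance 0)."""
    return (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
            and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0)


def _segment_distance(p1, p2, q1, q2):
    if _segments_cross(p1, p2, q1, q2):
        return 0.0
    # For non-crossing segments the minimum is reached at an endpoint.
    return min(_point_segment_distance(p1, q1, q2),
               _point_segment_distance(p2, q1, q2),
               _point_segment_distance(q1, p1, p2),
               _point_segment_distance(q2, p1, p2))


def _point_in_ring(p, ring):
    """Ray-casting point-in-polygon test for a simple ring."""
    x, y = p
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def _edges(ring):
    return [(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]


def polygon_distance(a, b):
    """Distance in metres between two simple polygons (0 if they overlap)."""
    if _point_in_ring(a[0], b) or _point_in_ring(b[0], a):
        return 0.0  # one polygon lies inside the other
    return min(_segment_distance(p1, p2, q1, q2)
               for p1, p2 in _edges(a) for q1, q2 in _edges(b))


def unmatched_footprints(candidates, reference, buffer_m=10.0):
    """Keep candidates farther than buffer_m from every reference building."""
    return [c for c in candidates
            if all(polygon_distance(c, r) > buffer_m for r in reference)]
```

With one official building and three detected footprints (5 m away, 20 m away, and inside the official building), only the footprint 20 m away would be kept as a candidate false positive. The brute-force comparison of every pair is quadratic, so it only suits small samples.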

So what are they then?

It is a combination of things one can understand, like containers in the harbour, cars parked on farms, and boats in the harbour.

image
At: 57.692740,11.841055

image
At: 57.6920724,11.8009287

But there are also quite strange things like bare rock by the ocean.

image
At 57.7334580,11.7445517

image
At 57.7428883,11.7393247

Cars on the road?
image
At: 57.8011746,11.9566407

Forest:
image
At: 57.8025395,11.9673369

Running track:
image
At 57.6783985,11.9391418

gbg_false-positives.zip

@stefankinell
Author

And these are just examples from where I live. They were easy to show because official data is readily available here. I see the same in many other places in Sweden, as well as big square chunks of data that are simply missing.

I see that data for other regions has been reprocessed with a confidence attribute added. Would it be possible to do the same for Sweden soon?

@andwoi
Contributor

andwoi commented Feb 21, 2024

@stefankinell thanks for gathering these together. From this sample (and others) it appears our models systematically struggle in ports and other industrial areas. We've seen similar behavior for airports.

@stefankinell
Author

Which is quite understandable. What makes me more puzzled are the many "huts" that are found in the forest and on the oceanside cliffs.

@stefankinell
Author

I checked back to see if anything has been updated since I posted this. It seems that the data for Sweden has not been touched since the first iteration. Is it correct to assume that it will not be processed again? I understand if our country is too small to be of interest, but it would be nice at least to know what the plans are going forward.

@andwoi
Contributor

andwoi commented Sep 16, 2024

Our plan at the moment is to continue updating data as our imagery sources are updated.
