Delete all non-en-US builds? #477

Open
peterbe opened this issue Feb 4, 2019 · 7 comments
Comments

@peterbe
Contributor

peterbe commented Feb 4, 2019

When we built Buildhub2 we not only changed the stack (still Python and still React SearchKit), we also entirely changed the architecture: we now only store and index the created *buildhub.json files instead of trying to figure out what the build should be based on the binaries + metadata. Because of that, we only get en-US builds.

https://bugzilla.mozilla.org/show_bug.cgi?id=1459302 is about adding the non-en-US builds as well, but that bug is stalled (without momentum or owners) and maybe that's OK. The people who have come forth and shown interest in this project are all OK with just en-US. They just need the versions, the dates and the buildIDs.

The reason there are non-en-US builds at all in https://buildhub.moz.tools is the legacy migration from https://mozilla-services.github.io/buildhub/ (aka Buildhub1). If we're not bothering with them going forward, what's the point of keeping the ones from the past?

We're not running low on disk space, but the idea of shrinking the data by a factor of nearly 100 is attractive.
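
For the Elasticsearch side, a minimal sketch of the kind of query body that would select everything that is not en-US is below. It assumes the locale is indexed as a keyword field called `target.locale`, which is a hypothetical name here and should be checked against the real mapping. It only builds the body; it does not send anything.

```python
def non_en_us_query(locale_field="target.locale", keep_locale="en-US"):
    """Build a query body matching every document whose locale is not `keep_locale`.

    `locale_field` is an assumed field name, not confirmed by the index mapping.
    Note that `must_not` also matches documents that lack the field entirely.
    """
    return {
        "query": {
            "bool": {
                "must_not": {"term": {locale_field: keep_locale}},
            }
        }
    }


body = non_en_us_query()
```

Running the same body through a plain count or search first would show how many documents would be affected before anything is deleted.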

@willkg
Contributor

willkg commented Feb 4, 2019

Socorro has its own archive.mozilla.org scraper and captures data for the BetaVersionRule.

The only place where Socorro uses Buildhub is links in the report view to Buildhub so that engineers can get additional information about that build and binaries related to the crash in question. I have no idea if engineers ever need to get non-en-US binaries. That's the only concern I can think of for Socorro.

@wlach

wlach commented Feb 4, 2019

Yeah, Mission Control only uses the en-US data, or really the subset of that which isn't locale-specific. This is ok by me.

@peterbe
Contributor Author

peterbe commented Feb 4, 2019

@willkg Socorro links to the buildID, and that's the same for en-US as it is for sv-SE or fr. The only difference between en-US and, say, de is the download URL (and download size). E.g. https://archive.mozilla.org/pub/firefox/releases/66.0b4/win64/en-US/Firefox%20Setup%2066.0b4.exe vs https://archive.mozilla.org/pub/firefox/releases/66.0b4/win64/de/Firefox%20Setup%2066.0b4.exe
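
To make that concrete, here is a small sketch that compares the two example URLs segment by segment. The helper name `differing_segments` is made up for illustration.

```python
from urllib.parse import urlsplit


def differing_segments(url_a, url_b):
    """Return the pairs of path segments that differ between two URLs.

    Segments are compared position by position, so this is only meaningful
    for URLs whose paths have the same number of segments.
    """
    segments_a = urlsplit(url_a).path.split("/")
    segments_b = urlsplit(url_b).path.split("/")
    return [(a, b) for a, b in zip(segments_a, segments_b) if a != b]


en_us = "https://archive.mozilla.org/pub/firefox/releases/66.0b4/win64/en-US/Firefox%20Setup%2066.0b4.exe"
de = "https://archive.mozilla.org/pub/firefox/releases/66.0b4/win64/de/Firefox%20Setup%2066.0b4.exe"

print(differing_segments(en_us, de))  # [('en-US', 'de')]
```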

@mars-f

mars-f commented Feb 4, 2019

I only need en-US for build systems trend analysis. This is OK by me.

@fbertsch

fbertsch commented Feb 5, 2019

This is completely fine for my use-case - we only need distinct revisions for every release date. I was going to filter on en-US anyways.

@peterbe
Contributor Author

peterbe commented Feb 5, 2019

@fbertsch Without fully comprehending your use case: if you can find a way to use Elasticsearch aggregations to get a distinct list of revisions, you'd never need to do any filtering by locale anyway.
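
A minimal sketch of such an aggregation, assuming the revision is indexed as a keyword field named `source.revision` (an assumed name; check the actual mapping). It only builds the request body and reads a response of the standard terms-aggregation shape, without contacting any server.

```python
def distinct_revisions_query(field="source.revision", size=1000):
    """Build a search body that asks only for a terms aggregation.

    `field` is an assumed field name; `size` caps the number of buckets returned.
    """
    return {
        "size": 0,  # no hits needed, only the aggregation
        "aggs": {"revisions": {"terms": {"field": field, "size": size}}},
    }


def revisions_from_response(response):
    """Extract the distinct values from a terms-aggregation response."""
    buckets = response["aggregations"]["revisions"]["buckets"]
    return [bucket["key"] for bucket in buckets]
```

Each bucket is one distinct revision regardless of how many locales share it. A terms aggregation returns at most `size` buckets, so a very large set of revisions would need paging, for example with a composite aggregation.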

@flodolo

flodolo commented Feb 6, 2019

No issues on the l10n side with removing non-en-US builds. We weren't aware that this data existed, and as far as I can tell there are no real use cases driving the work to add support for l10n builds.

6 participants