tv_imdb: update to use new data source #17
Comments
Updated the base URL in IMDB.pm to use the archived data (December 2017) in f0140f3. This is only a short-term workaround until we migrate tv_imdb to the new data source.
Also, |
Sadly, it looks like tv_imdb, in its current form, is a dead duck. The IMDb dataset is no longer in the public domain. The FTP files haven't been updated since Dec 22 2017 and won't be updated in future; IMDb is no longer releasing updates to them. IMDb expects people to switch to the new TSV (tab-separated values) files available over HTTPS. However, these files contain a much reduced dataset, and many key elements are no longer available (no plot summaries, MPAA ratings, or keywords; only 3 genres; only the top 3 actors; etc.). The ethos as stated by IMDb is:
In other words, the intention of the new datasets is that you are only to use them to identify the key to access the page on their website, and no more. Hence no rich dataset like we've used for the past 20-odd years. The marketing reasons for this should be obvious if you've visited imdb.com lately: it's like the old days with auto-playing videos, clickbait, massive adverts, etc. I think we need to look at alternatives, such as the APIs from TMDb (The Movie Database) or OMDb (The Open Movie Database).
It looks like OMDb is no longer maintained. It uses IMDb data, and reading some of the support tickets suggests it probably uses a database built from the no-longer-maintained .list files (hint: people can't find programmes shown after 2017). And TMDb has 567,000 films compared to 6,500,000 on IMDb :-(
In late December 2017, the IMDB mirror went read-only; the URL of the archived data changed to ftp://ftp.fu-berlin.de/pub/misc/movies/database/frozendata/.
From the README:
"IMDb datasets, providing bulk-access to IMDb title and name data, are now available from us via an HTTPS link.
As a previous ftp user you can just switch to https, however there are some formatting changes within the data.
For details on the new file formats and access guidelines, see www.imdb.com/interfaces."
In addition to being served over https, the data files on IMDB's new service have some formatting changes.
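To give a feel for what a reader of the new tab-separated files might look like, here is a rough Python sketch. It is not tv_imdb code. It assumes the layout documented at www.imdb.com/interfaces uses a header row and marks missing values with `\N`; the column names and the sample rows below are illustrative assumptions, not real IMDb data.

```python
import csv
import io

# Hypothetical sample in the assumed tab-separated layout.
# Column names and rows are made up for illustration only.
SAMPLE = (
    "tconst\ttitleType\tprimaryTitle\tstartYear\tgenres\n"
    "tt0000001\tmovie\tExample Film\t1999\tDrama,Comedy\n"
    "tt0000002\ttvSeries\tExample Show\t\\N\t\\N\n"
)

MISSING = "\\N"  # assumed marker for an absent value


def parse_titles(stream):
    """Read tab-separated title rows into a list of dicts."""
    # QUOTE_NONE: treat quote characters in titles as ordinary text.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    titles = []
    for row in reader:
        # Map the missing-value marker to None.
        record = {key: (None if value == MISSING else value)
                  for key, value in row.items()}
        # Genres are assumed to be a comma-separated list in one field.
        record["genres"] = record["genres"].split(",") if record["genres"] else []
        titles.append(record)
    return titles


titles = parse_titles(io.StringIO(SAMPLE))
```

For a real download, the same function could be handed a text stream opened with `gzip.open(path, "rt", encoding="utf-8", newline="")`, assuming the files are gzip-compressed UTF-8.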