Update guess_format_id with new formats #84

Open
amoeba opened this issue May 8, 2018 · 2 comments

amoeba commented May 8, 2018

We're getting some new formats in DataONE: https://gist.github.com/amoeba/d4771fc01d4f8f66c44202856d078e8e

  • Update guess_format_id's extension <-> formatId map
  • Make sure ipynb -> json is in there, since it's non-obvious
  • Can we add MATLAB version detection to the algorithm? I already did this for NetCDF
  • What can we do about RAW files? Are there a few common RAW extensions?
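
For illustration, an extension-to-formatId lookup along the lines of the first two items might look like the sketch below. This is not the actual guess_format_id implementation: the function name guess_format_from_extension, the map entries, and the application/octet-stream fallback are all assumptions made for the example, and the real formatId strings should be taken from the DataONE format list linked above.

```r
# Hypothetical extension -> formatId map; the entries are illustrative only
# and should be checked against the DataONE format list.
format_map <- c(
  csv   = "text/csv",
  json  = "application/json",
  ipynb = "application/json",  # Jupyter notebooks are JSON documents
  pdf   = "application/pdf"
)

# Look up a formatId for each path by its (case-insensitive) extension.
# Paths with an unknown or missing extension get the assumed default.
guess_format_from_extension <- function(paths,
                                        map = format_map,
                                        default = "application/octet-stream") {
  ext <- tolower(tools::file_ext(paths))
  ids <- unname(map[ext])
  ids[is.na(ids)] <- default
  ids
}

guess_format_from_extension(c("analysis.ipynb", "data.CSV", "README"))
```

Keeping the map as a plain named vector means adding a new format is a one-line change, and a non-obvious mapping such as ipynb can carry a comment right next to it.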
@maier-m
Copy link
Contributor

maier-m commented May 9, 2018

It may be difficult to find the MATLAB version.
We could potentially use foo <- R.matlab::readMat("file_path")
and attr(foo, "header")$version, but in the few tests I've run they all return "5", which doesn't appear to be a MATLAB file type version. Also, v7.3 files won't open with the above R package (we could use a try call and possibly the h5 package to confirm v7.3 files). Other approaches may be more efficient.
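
One possibly cheaper approach is to read only the first bytes of the file rather than parsing it. Level 5 MAT-files (which covers files saved as v5, v6, and v7) begin with a text header starting "MATLAB 5.0 MAT-file", which would explain the "5" above, while v7.3 files are HDF5-based and begin with "MATLAB 7.3 MAT-file". The sketch below assumes those two signatures are enough for our purposes; the function name mat_header_version is hypothetical, and it does not try to tell v6 from v7 or to recognize v4 files, which have no text header.

```r
# Minimal sketch: classify a .mat file by the signature at the start of its
# header. Returns "7.3", "5" (Level 5: v5/v6/v7), or NA if neither matches.
mat_header_version <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = 19L)

  # Compare raw bytes so that binary (non-text) files cannot cause
  # encoding or embedded-NUL errors.
  has_prefix <- function(text) {
    sig <- charToRaw(text)
    length(bytes) >= length(sig) && identical(bytes[seq_along(sig)], sig)
  }

  if (has_prefix("MATLAB 7.3 MAT-file")) {
    "7.3"
  } else if (has_prefix("MATLAB 5.0 MAT-file")) {
    "5"
  } else {
    NA_character_
  }
}
```

Since this only reads 19 bytes, it avoids loading the whole file and also sidesteps the problem of v7.3 files not opening in R.matlab.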


maier-m commented May 9, 2018

A good place to look for RAW file extensions would be https://www.loc.gov/preservation/digital/formats/fdd/fdd000241.shtml. Also potentially just .raw, as seen here: https://arcticdata.io/catalog/#view/doi:10.18739/A2SC4M
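
As a rough sketch of how an extension check for RAW files could work: the list below is an assumption made up of a few widely used camera raw extensions plus the .raw case mentioned above, not a list copied from the Library of Congress page, and the helper name is_raw_extension is hypothetical. Note that .raw is ambiguous, since other instruments also write files with that extension.

```r
# Assumed, non-exhaustive set of camera raw extensions, for illustration only.
raw_extensions <- c(
  "raw",  # generic; ambiguous outside of photography
  "cr2",  # Canon
  "nef",  # Nikon
  "arw",  # Sony
  "dng",  # Adobe Digital Negative
  "orf",  # Olympus
  "raf",  # Fujifilm
  "rw2"   # Panasonic
)

# TRUE for each path whose (case-insensitive) extension is in the set.
is_raw_extension <- function(paths, extensions = raw_extensions) {
  tolower(tools::file_ext(paths)) %in% extensions
}

is_raw_extension(c("IMG_0001.CR2", "site_photo.dng", "notes.txt"))
```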

laijasmine pushed a commit that referenced this issue Oct 2, 2020