CovJSON, CF-JSON and NCO-JSON #86
There has been a discussion on the Climate and Forecast mailing list about different JSON formats for recording NetCDF data. Until this discussion I wasn't aware that there are a couple of other initiatives going on:
- CF-JSON (http://cf-json.org)
- NCO, which provides JSON output (http://nco.sf.net/nco.html#json)
The discussion revealed that these two initiatives are quite similar in aim to each other. They both aim to translate the NetCDF(-3) [edit/correction - NCO also supports NetCDF4] data model into JSON and apply the CF metadata conventions directly.
CovJSON does not have quite the same aim: it operates at a higher level of abstraction and does not mimic any particular existing format. Here are a few comparison points between CovJSON, CF-JSON and NCO, intended to stimulate discussion. I'm going to make a simplifying assumption that CF-JSON and NCO are very similar in respect of the points made here:
- CF-JSON and NCO are likely to be more familiar to users who are already comfortable with NetCDF and the CF conventions.
- Conversely, users who are unfamiliar with CF/NetCDF may find CovJSON easier to understand (at least, that's our intention...). CovJSON does not assume that the data are "born in NetCDF format".
- CovJSON borrows concepts from ISO and OGC standards, and may be conceptually more familiar to folk from those communities. It's intended to provide a "bridge" between what we might loosely call the "NetCDF community" and the "GIS community".
- CovJSON cannot (yet) encode all possibilities afforded by CF/NetCDF. For example, cell methods and climatological time are not yet supported in CovJSON. So if entirely "lossless" encoding of CF-NetCDF in JSON is required, CF-JSON and NCO may be more appropriate choices.
- The NetCDF(-3) data model struggles to accommodate certain types of data structures. It is quite a "flat" structure, and the mechanisms required to link relevant data together in a NetCDF file can be quite hard to understand. (Coordinate reference systems, and their links to dimensions and variables, are one example. Encoding geometries is another.) I assume that JSON formats based directly on NetCDF will suffer from similar issues, forcing clients to implement some of the more complex parts of the CF conventions in order to piece the information back together. By contrast, CovJSON aims to repartition the same information in a way that is (hopefully) easier for clients to deal with, using the possibilities afforded by JSON.
- By virtue of the above, I would argue that non-gridded data (e.g. observations from points or moving platforms), which often require the recording of geometries, trajectories and other "composite" coordinate types, are easier to encode and understand in CovJSON than in NetCDF. (Concretely, CovJSON provides the facility for "tuple" and "polygon" axis types: https://covjson.org/spec/#axis-objects. These require some gymnastics to encode in NetCDF.)
- CovJSON provides mechanisms to partition large datasets among different files (e.g. holding range objects in separate files, the tiling scheme). This is done for "web-friendliness", i.e. avoiding large monolithic files. I'm not aware that CF-JSON and NCO have this facility, although I may be wrong.
- On a more minor point of implementation, CovJSON encodes data values as flat, 1-D arrays (the reason why is explained here). CF-JSON and NCO use nested arrays.
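
To make the last point concrete, here is a minimal Python sketch of how the same values look as a nested array and as a flat 1-D array with a separate shape. It is an illustration only, not code from any of the three projects: the helper names `flatten` and `flat_index` are hypothetical, the axis names `t`, `y`, `x` are made up, and the row-major ordering (last axis varying fastest) is my reading of the CovJSON spec.

```python
def flatten(nested):
    """Flatten a non-empty, rectangular nested list in row-major order.

    Returns (values, shape). Hypothetical helper for illustration only.
    """
    # Derive the shape by walking down the first element of each level.
    shape = []
    level = nested
    while isinstance(level, list):
        shape.append(len(level))
        level = level[0]
    # Remove one level of nesting per axis beyond the first.
    values = nested
    for _ in shape[1:]:
        values = [v for sub in values for v in sub]
    return list(values), shape


def flat_index(indices, shape):
    """Map per-axis indices to an offset into the flat, row-major array."""
    offset = 0
    for i, n in zip(indices, shape):
        offset = offset * n + i
    return offset


# Nested form, roughly how a NetCDF-style JSON encoding lays out a 2x2x3 variable.
nested = [[[1, 2, 3], [4, 5, 6]],
          [[7, 8, 9], [10, 11, 12]]]

values, shape = flatten(nested)

# Flat form, in the style of a CovJSON NdArray (axis names are assumptions).
nd_array = {
    "type": "NdArray",
    "dataType": "integer",
    "axisNames": ["t", "y", "x"],
    "shape": shape,
    "values": values,
}

# The same element is reachable either way.
same = nested[1][0][2] == nd_array["values"][flat_index((1, 0, 2), shape)]
```

Either form carries the same information; the flat form just moves the structure into `shape` and `axisNames`, so a client has to do the index arithmetic itself.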
Discussion of the above points (and addition of new ones!) is most welcome. My intention is not to evangelise for CovJSON, but to highlight where CovJSON, CF-JSON and NCO agree and where they diverge, both philosophically and structurally. If we can understand these points we'll be in a better place to discuss whether we should look at merging these initiatives.