Skip to content

db.columns: Output column metadata, not just names #6072

@wenzeslaus

Description

@wenzeslaus

I suggest to output information about column types to db.columns. Currently, the tool only gives columns names, so in JSON, it is a list. Instead, it should output objects (dictionaries) with information about type. This would allow users to do more with the output. Other tools, most notably v.info, are already outputting that information. db.columns is already a low-level tool, so it makes sense that outputs more, rather than less information.

The JSON output can look like this (same as v.info except not nested under a key "columns"):

[
    {
        "name": "cat",
        "sql_type": "INTEGER",
        "is_number": true
    },
    {
        "name": "onemap_pro",
        "sql_type": "DOUBLE PRECISION",
        "is_number": true
    }
]

Other formats than JSON

The default output needs to stay the same for now, i.e., a list of names, one name per line. This is currently the plain output, so the plan would be to add the type info to plain in v9.

To accommodate the old name-only use-case in the new interface and avoid users parsing the plain text output, we could add format="shell" or format="list" which would be just the list of names. This would be a step away from having all the formats print exactly the same info just in different format - we would basically say, that the "list" format cannot contain all the info JSON can.

To have non-JSON parse-able format with multiple values per record, format=csv could be used. v.info -c -g (v.info -c format=shell) prints basically CSV which is related to the discussion of the need to distinguish Bash-eval-parse-able key-values and CSV in #4492.

Alternatively or additionally, we could introduce a flag, e.g., -n to print names only even in JSON format which would produce the output we have now with #6042.

Other tools

Printing just the names does not make much sense to me and other tools show that's what we decided elsewhere. v.info -c format=json and v.db.select format=json print a list of dictionaries with info about types. While v.db.select columns info is more recent added by me, v.info -c was printing column names and types for really long time.

v.db.select is printing the columns as additional info for convenience and because you can get a different list or even new columns. v.info has it as alternative output for convenience and because there is no other tool now for vector maps. Nevertheless, the outputs should be the same unless there is a reason to make them different.

v.info

$ grass ~/grassdata/nc_spm_full_v2alpha2/PERMANENT/ --exec v.info geology -c format=json
{
    "columns": [
        {
            "name": "cat",
            "sql_type": "INTEGER",
            "is_number": true
        },
        {
            "name": "onemap_pro",
            "sql_type": "DOUBLE PRECISION",
            "is_number": true
        },
        {
            "name": "PERIMETER",
            "sql_type": "DOUBLE PRECISION",
            "is_number": true
        },
        {
            "name": "GEOL250_",
            "sql_type": "INTEGER",
            "is_number": true
        },
        {
            "name": "GEOL250_ID",
            "sql_type": "INTEGER",
            "is_number": true
        },
        {
            "name": "GEO_NAME",
            "sql_type": "CHARACTER",
            "is_number": false
        },
        {
            "name": "SHAPE_area",
            "sql_type": "DOUBLE PRECISION",
            "is_number": true
        },
        {
            "name": "SHAPE_len",
            "sql_type": "DOUBLE PRECISION",
            "is_number": true
        }
    ]
}

v.db.select

$ grass ~/grassdata/nc_spm_full_v2alpha2/PERMANENT/ --exec v.db.select geology columns="GEO_NAME, SHAPE_area * 100 AS area100" where="cat = 1820" format=json 
{"info":
{"columns":[
{"name":"GEO_NAME","sql_type":"CHARACTER","is_number":false},
{"name":"area100","sql_type":"DOUBLE PRECISION","is_number":true}
]},
"records":[
{"GEO_NAME":"Qp","area100":82944614.4862}
]}

Additional info

Create v.db.columns?

There is no v.db.columns, maybe there should be one. If we would have one, the difference from db.columns should be the interface (direct access to database versus vector map name and layer), but not in format. How would we design such a tool from scratch? Likely, it would output all the information we have about the columns rather than just the names.

Prior discussions

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    Status

    Todo

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions