How to: Avoid Pitfalls
TOC
- Keywords
- nan, NaN, inf, Inf, infinite and null
- foo.bar vs .foo.bar
- Cartesian Products
- Generator Expressions in Assignment Right-Hand Sides
- Backtracking (empty) in Assignment RHS Expressions and Reductions
- Multi-arity Functions and Comma/Semi-colon Confusability
- index/1 is byte-oriented but match/1 is codepoint-oriented
- If A and B are arrays, B|.[A] is the same as B|indices(A)
- Overriding Operator Definitions
The fact that jq has keywords such as if and end has various implications, some of which may not be obvious. In particular:
- in jq 1.6 and earlier, keywords cannot be used in the abbreviated syntax for specifying key-value pairs, e.g. {foo} for {"foo": .foo}
- in jq 1.6 and earlier, keywords cannot be used to form $-variable names
The full list of reserved keywords is currently:
and as break catch def elif else end foreach if import include label module or reduce then try
(The list of keywords for any particular version of jq can be derived from the lexer.l file, the “master” version of which is https://github.com/stedolan/jq/blob/master/src/lexer.l)
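One way to sidestep the abbreviated-syntax restriction is to spell out both the key and the path with quoted strings. The following is a minimal sketch; the sample object with a key named "if" is made up for illustration:

```shell
# Explicit key and explicit path instead of the abbreviated {if} form
jq -nc '{"if": 1, "x": 2} | {"if": .["if"]}'
# {"if":1}
```

Because nothing here relies on a keyword being accepted as an identifier, this form should not depend on the jq version.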
nan is a jq value representing IEEE NaN, but it prints as null.
NaN is recognized in JSON text and is also understood to represent IEEE NaN.
Use isnan to test whether a jq value is identical to IEEE NaN.
Here are some illustrative examples:
$ echo NaN | jq .
null
$ echo nan | jq .
parse error: Invalid literal at line 2, column 0
$ echo NaN | jq isnan
true
$ jq -n 'nan | isnan'
true
Similar comments apply to the jq value infinite, and the admissible values inf and Inf:
$ echo Inf | jq isinfinite
true
$ echo inf | jq isinfinite
true
$ jq -n 'infinite | isinfinite'
true
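Because NaN prints as null, it is easy to lose track of it inside larger structures. The sketch below (the sample arrays are made up for illustration) shows that the printed form hides the value, and how isnan can be used to filter it out:

```shell
# NaN is serialized as null ...
jq -nc '[nan]'
# [null]

# ... but the value held in the array is still NaN
jq -nc '[nan] | .[0] | isnan'
# true

# Drop NaN values from an array of numbers
jq -nc '[1, nan, 2] | map(select(isnan | not))'
# [1,2]
```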
foo.bar is short for foo | .bar and means: call foo and then get the value at the "bar" key of the output(s) of foo.
.foo.bar is short for .foo | .bar and means: get the value at the "foo" key of . and then get the value at the "bar" key of that.
One character, big difference.
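To make the difference concrete, here is a small sketch in which a user-defined function named foo (hypothetical, defined only for this example) coexists with an input object that has a "foo" key:

```shell
jq -nc '
  def foo: {"bar": "from the function"};
  {"foo": {"bar": "from the input"}}
  | [foo.bar, .foo.bar]
'
# ["from the function","from the input"]
```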
jq is geared to produce Cartesian products at the drop of a hat. For example, the expression (1,2) | (3,4) produces four results:
3
4
3
4
To see why:
$ jq -n '(1,2) as $i | (3,4) | "\($i),\(.)"'
"1,3"
"1,4"
"2,3"
"2,4"
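The same thing happens in object construction when more than one value is a generator. If the intent is to pair items positionally rather than form every combination, one option is to collect the streams into arrays and use transpose. A minimal sketch:

```shell
# Two generators in one object: one result per combination
jq -nc '[{a: (1,2), b: (3,4)}] | length'
# 4

# Pairing by position instead of taking the product
jq -nc '[[1,2],[3,4]] | transpose'
# [[1,3],[2,4]]
```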
Generator expressions in assignment RHS expressions are likely to surprise users. Compare (.a,.b) = (1,2) to (.a,.b) |= (.+1,.*2).
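The sketch below illustrates the contrast. With =, each value produced by the right-hand side yields a separate result in which every path on the left has been set to that value. With |=, only a single result is produced; which of the generated values is used has varied between jq versions, so only the number of results is shown here:

```shell
# "=" with a generator on the RHS: one output per RHS value
jq -nc 'null | (.a,.b) = (1,2)'
# {"a":1,"b":1}
# {"a":2,"b":2}

# "|=" with a generator on the RHS: a single output
jq -nc '{"a":1,"b":2} | [(.a,.b) |= (.+1, .*2)] | length'
# 1
```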
.a=empty and .a|=empty behave differently:
null | .a = empty #=> the empty stream
null | .a |= empty #=> null
In reductions, care should be exercised when including empty in the body. For example, one might reasonably expect that:
reduce 1 as $x (2; empty)
would produce 2, but in fact it produces null in most versions of jq, including jq 1.5 and earlier, as well as the current “master” version as of 2018.
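A defensive pattern is to have the body return the accumulator unchanged (.) for items that should be skipped, instead of using select or empty. A minimal sketch, summing only the even numbers in range(0;5):

```shell
# Skip odd items by passing the accumulator through untouched
jq -n 'reduce range(0;5) as $x (0; if $x % 2 == 0 then . + $x else . end)'
# 6
```

Since the body always emits exactly one value, the result does not depend on how a given jq version treats empty inside reduce.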
WARNING: Expressions of the form A | .[] |= E where A is an array and E can evaluate to empty should in general be avoided. Their behavior is inconsistent between versions of jq, and jq version 1.6 will often evaluate them incorrectly. For example, using jq 1.6:
jq -n '[0,1,2] | .[] |= if . == 0 then empty else . end'
yields:
[1,2,null]
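When the goal is simply to drop some elements of an array, map(select(...)) expresses the same intent without relying on .[] |= empty. A minimal sketch using the same input:

```shell
jq -nc '[0,1,2] | map(select(. != 0))'
# [1,2]
```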
foo(a,b) is NOT the same as foo(a;b). If foo/1 and foo/2 are both defined and you write foo(a,b) intending to call the two-argument function, you'll silently get the wrong behavior.
For example, foo(1,2) is a call to foo/1 with a single argument consisting of the expression 1,2, while foo(1;2) is a call to foo/2 with two arguments: the expressions 1 and 2.
One character, big difference.
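The following sketch defines a hypothetical foo at both arities (the bodies are made up purely to make the two calls distinguishable) and invokes it both ways:

```shell
jq -nc '
  def foo(f): [f];                # foo/1 collects its argument stream
  def foo(f; g): {a: f, b: g};    # foo/2 builds an object from two arguments
  foo(1,2), foo(1;2)
'
# [1,2]
# {"a":1,"b":2}
```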
Given strings as input, the index family of filters (index, rindex, indices) returns byte-oriented offsets. For codepoint-oriented offsets, one can use the array-oriented versions of these filters, or match/1 or match/2, or the definition of myindex given below.
For example:
$ jq -cn '"aéb" | [., index("b")]'
["aéb",3]
$ jq -cn '"aéb" | [., (explode|index("b"|explode))]'
["aéb",2]
$ jq -cn '"a\u00e9b" | [., index("b")]'
["aéb",3]
$ jq -cn '"a\u00e9b" | match("b").offset'
2
# codepoint-oriented version of `index/1` for strings
# e.g. ("”#a" | myindex("#a")) yields 1
def myindex($string):
($string|length) as $sl
| if $sl > length
then null
else
explode as $x
| ($string|explode) as $s
| first(range(0; 1 + length - $sl) as $i
| select($x[$i: $sl+$i] == $s) | $i) // null
end;
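As a usage sketch, the definition above can be pasted into a jq program and called like any other filter. The test strings are made up for illustration; the second and third calls exercise the "not found" and "needle longer than input" cases:

```shell
jq -nc '
  # Codepoint-oriented version of index/1 for strings (definition as given above)
  def myindex($string):
    ($string|length) as $sl
    | if $sl > length
      then null
      else
        explode as $x
        | ($string|explode) as $s
        | first(range(0; 1 + length - $sl) as $i
                | select($x[$i: $sl+$i] == $s) | $i) // null
      end;
  "a\u00e9b" | [myindex("b"), myindex("z"), myindex("a\u00e9b!")]
'
# [2,null,null]
```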
If A and B are JSON arrays, then B|.[A] asks for the sorted array of ALL the indices, $i, such that .[0:$i] + A is an initial subarray of B. This has implications for B|index(A) as well.
Examples:
jq -nc '[0,1,2,3,4,1,2] | .[[1,2]]'
[1,5]
jq -nc '[0,1,2,3,4,1,2] | index([1,2])'
1
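The sketch below spells out some of these implications. In particular, when the array argument is treated as a subsequence, searching for an element that is itself an array requires wrapping it in one more level of brackets. The second input array is made up for illustration:

```shell
# .[A] and indices(A) agree when both operands are arrays
jq -nc '[0,1,2,3,4,1,2] | .[[1,2]] == indices([1,2])'
# true

# index and rindex pick the first and last of those positions
jq -nc '[0,1,2,3,4,1,2] | [index([1,2]), rindex([1,2])]'
# [1,5]

# Looking for the element [1,2] (not the subsequence 1,2)
jq -nc '[[1,2],3] | [index([1,2]), index([[1,2]])]'
# [null,0]
```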
Overriding operator definitions is possible but probably ill-advised if for no other reason
than that the results can be surprising because of compile-time constant-folding.
Consider, for example, what happens when we override + as follows:
def myplus($a;$b): _plus($a;$b);
def _plus($a;$b): [ myplus($a;$b) ];
We might expect that the expression 1+2 would now evaluate to [3] but, because the constant-folding
occurs before the new definition becomes effective, it will instead evaluate to 3.