You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We should consider being more explicit in explaining the semantics of NULL and MISSING behavior in the context of logical operators, predicate functions, etc.
Quoting from the PartiQL spec:
Quote 1: Boolean Connectives and IS NULL Operator -- PartiQL Spec section 8 Where clause
As far as the boolean connectives and IS NULL are concerned a NULL input and a MISSING input behave identically. For example, MISSING AND TRUE is equivalent to NULL AND TRUE: they both result into NULL.
Quote 2: Handling of MISSING in functions. -- PartiQL Spec section 7.1 Inputs with wrong types
Since no function (other than IS MISSING) has MISSING as an input argument type, it follows that all functions
return MISSING when one of their inputs is MISSING.
Each one of these expressions returns MISSING: 5 + missing, 5 > ’a’, NOT {a:1}.
Quote 3: Distinguish between NULL and MISSING -- PartiQL Spec section 8 Where clause
The predicate IS MISSING allows distinguishing between NULL and MISSING: NULL IS MISSING results to false;
MISSING IS MISSING results to true.
Equality never fails in the type-checking mode and never returns MISSING in the permissive mode. Instead, it can
compare values of any two types, according to the rules of the PartiQL type system. For example, 5 = ’a’ is false.
The eqg, unlike the =, returns true when a NULL is compared to a NULL or a MISSING to a MISSING. When the
arrays x and y do not have the same length, the x = y is false.
I found it different to derive a coherent semantics for handling MISSING in context of boolean function from the above quotes.
SQL Background
“Truth Value” And "Boolean Value"?
SQL-99 6.30
The SQL spec specifies three truth value. TRUE, FALSE, and UNKNOWN.
A boolean value can be comprised of truth value TRUE, FALSE, and UNKNOWN. The UNKNOWN truth value is supported in boolean value as NULL.
The SQL spec does not distinguish between the NULL value of boolean data type and the UNKNOWN truth value.
It is also important to notice that there is a distinguish between the NULl value of a non-boolean data type and the UNKNOWN truth value. In fact, the SQL spec defines different ways to assert on whether the value is NULL (NULL predicate SQL Spec 1999 - Section 8.7) and to assert on whether the value is UNKNOWN(Boolean Test, SQL-1999 Section 6.30).
Example 1:
CREATE TABLE test (
boolean_value BOOL,
non_boolean_value INT8
);
INSERT INTO test (boolean_value, non_boolean_value) VALUES (null, null);
SELECT boolean_value IS UNKNOWN FROM test; -- Returns true
SELECT non_boolean_value IS UNKNOWN FROM test; -- Returns error as in this case the value is null of type INT8
SELECT boolean_value IS NULL FROM test; -- Returns true
SELECT non_boolean_value IS NULL FROM test; -- Returns true
PartiQL
In this section, we attempted to derive the semantics rules from first principle then cross validate the rules with quotes from the PartiQL spec.
PartiQL’s Truth Value:
To distinguish between the Boolean Values, denote the three as T for true, F for false, U for unknown.
The truth table would look like:
A
B
A AND B
A OR B
NOT A
T
T
T
T
F
T
F
F
T
F
T
U
U
T
F
F
T
F
T
T
F
F
F
F
T
F
U
F
U
T
U
T
U
T
U
U
F
F
U
U
U
U
U
U
U
Relationship between Truth Value and PartiQL Value
Since PartiQL is SQL-compatible, we can deduct:
A truth value T is equivalent to a boolean value true.
A truth value F is equivalent to a boolean value false.
A truth value U is equivalent to a null value of Boolean Type in strict mode.
The question to clarify is: the relationship between MISSING and truth value UNKNOWN in permissive mode.
Attempt 1:
Assumption:
The MISSING value, when convert to a truth value, becomes a MISSING.
UNKNOWN truth value, when converting to PartiQL Value, becomes NULL of boolean type.
Define AND, OR, NOT, IS UNKNOWN as logical operators. Logical operators takes truth value as operands.
A predicate function returns a truth value. If a predicate function propagates MISSING, it means the function returns UNKNOWN, when any of its operands is MISSING. This is because we need to convert MISSING to a truth value.
The IS NULL is a predicate function that propagates MISSING, and it returns TRUE if and only if the operand is NULL value of any type.
The IS MISSING is a predicate function that returns TRUE if and only if the operand is MISSING, otherwise it returns FALSE.
The = operator is a predicate function that propagate MISSING. If the operands are not of comparable types, then it returns FALSE.
The > operator is a predicate function that propagate MISSING. If the operands are not of comparable types, then it results in an Data Type Mismatch Error in strict mode and MISSING which get turned to UNKNOWN in permissive mode.
index
Expression
result truth value
result PartiQL value
Semantics
1
MISSING and TRUE
UNKNOWN
NULL
the MISSING turns into a unknown truth value, the expression returns a unknown truth value, when present as a PartiQL Value, the result is NULL.
2
NULL and TRUE
UNKNOWN
NULL
the NULL turns into a unknown truth value, the expression returns a unknown truth value, when present as a PartiQL Value, the result is NULL
3
MISSING IS UNKNOWN
TRUE
TRUE
MISSING is used as a truth value, so internally this expression is UNKNOWN IS UNKNOWN which return true
4
MISSING IS NULL
UNKNOWN
NULL
The IS NULL is a predicate function that propagate MISSING, as predicate function returns a truth value, the MISSING turns into an UNKNOWN truth value. When present as a PartiQL value, the result is NULL.
5
5 = ’a’
FALSE
FALSE
The operands of the equal operator is not comparable, hence it returns false
6
5 > ’a’
Unknown
NULL
The operands of the > operator are not of comparable types, hence the expression returns MISSING which get turned to a UNKNOWN truth value because the > operator is a predicate function.
7
5 >= 'a'
UNKNOWN
NULL
The expression is equivalent to `5 > 'a' OR 5 = 'a' , which is FALSE OR UNKNOWN which results in UNKNOWN. When presents as a PartiQL value, the expression returns unknown.
8
MISSING = MISSING
UNKNOWN
NULL
The equal operator propagates MISSING, hence the expression returns MISSING which get turned to a UNKNOWN truth value because the = operator is a predicate function.
9
NOT {a:1}
UNKNOWN
NULL
The Not operator is a logical operator, it should take truth value as operand. {a:1} can not be converted to a truth value, hence the result should return Data Type Mismatch error and return MISSING, which get turned to a UNKNOWN truth value because logical operator returns truth value.
From the above:
Example 4 violates Quote 1 NULL IS NULL and MISSING IS NULL behaves identically.
Example 5 violates Quote 2 5 + 'a' should return MISSING.
Example 9 violates Quote 2 NOT {'a' : 1} should return MISSING.
Attempt 2:
Seeming the issues above indicates that the data type mismatch issues take precedence then logic value conversion. We can change the semantics rules as following:
The MISSING value, when convert to a truth value, becomes a UNKNOWN.
UNKNOWN truth value, when converting to PartiQL Value, becomes NULL of boolean type.
Define AND, OR, NOT, IS UNKNOWN as logical operators. Logical operators takes truth value as operands. If an operand can not be converted to a truth value, it returns MISSING in permissive mode.
A predicate function returns a truth value. If a predicate function propagates MISSING, it means the function returns MISSING, when any of its operands is MISSING. We DO NOT convert the missing value to UNKNOWN truth value in this case.
The IS NULL is a predicate function that propagates MISSING, and it returns TRUE if and only if the operand is NULL value of any type.
The IS MISSING is a predicate function that returns TRUE if and only if the operand is MISSING, otherwise it returns FALSE.
The = operator is a predicate function that propagate MISSING. If the operands are not of comparable types, then it returns FALSE.
The > operator is a predicate function that propagate MISSING. If the operands are not of comparable types, then it results in an Data Type Mismatch Error in strict mode and MISSING in permissive mode.
Examples:
index
Expression
result truth value
result PartiQL value
Semantics
1
MISSING and TRUE
UNKNOWN
NULL
the MISSING operand turns into an UNKNOWN truth value, the expression returns an UNKNOWN truth value, when present as a PartiQL Value, the result is NULL.
2
NULL and TRUE
UNKNOWN
NULL
the NULL operand turns into an UNKNOWN truth value, the expression returns an UNKNOWN truth value, when present as a PartiQL Value, the result is NULL
3
MISSING IS UNKNOWN
TRUE
TRUE
MISSING is used as a truth value, so internally this expression is UNKNOWN IS UNKNOWN which return TRUE
4
MISSING IS NULL
N/A
MISSING
The IS NULL is a predicate function that propagate MISSING. Hence the function returns MISSING.
5
5 = ’a’
FALSE
FALSE
The operands of the equal operator is not comparable, hence it returns FALSE
6
5 > ’a’
N/A
MISSING
The operands of the > operator are not of comparable types, hence the expression returns MISSING
7
5 >= 'a'
N/A
MISSING
The operands of the >= operator are not of comparable type, hence the expression returns MISSING. Notice if we want to rewrite the expression, it would be CASE WHEN (5 > 'a' IS MISSING) THEN MISSING ELSE 5 > 'a' OR 5 = 'a' END
8
MISSING = MISSING
N/A
MISSING
The equal operator propagates MISSING, hence the expression returns MISSING.
9
NOT {a:1}
N/A
MISSING
The Not operator is a logical operator, it should take truth value as operand. {a:1} can not be converted to a truth value, hence the result should return Data Type Mismatch error and return MISSING
From the above:
Example 4 violates Quote 1 NULL IS NULL and MISSING IS NULL behaves identically.
Example 8 violates Quote 4 Equality never returns MISSING in permissive mode.
Open Question
The PartiQL Spec stated that NULL IS NULL and MISSING IS NULL behaves identically, what is the rational behind this statement?
The PartiQL Spec stated that "Equality never returns MISSING in permissive mode.", but what should MISSING = MISSING returns?
The key difference between Approach 1 and Approach 2 is the conversion from a MISSING value to an UNKNOWN truth value.
- Approach 1 suggests that MISSING value can be converted to UNKNOWN truth value, for both cases when the result of an function should be a truth value (predicate function), and when the operand of an function should be a truth value (logical function).
- Approach 2 suggested that MISSING value can be converted to UNKNOWN truth value only when operand of an function should be a truth value.
- Although this distinguish is acceptable, but in situation where a predicate and a logical operator is chained, an end user would need to be very careful with the semantics of the query.
- logical (predicate) -> NEVER MISSING
- predicate(logical) -> Can be MISSING.
- Maybe it is worth to consider adapting the four value logic to better distinguish the semantic between logical value and NULL/MISSING.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
We should consider being more explicit in explaining the semantics of
NULL
andMISSING
behavior in the context of logical operators, predicate functions, etc.Quoting from the PartiQL spec:
Quote 1: Boolean Connectives and IS NULL Operator -- PartiQL Spec section 8 Where clause
Quote 2: Handling of MISSING in functions. -- PartiQL Spec section 7.1 Inputs with wrong types
Quote 3: Distinguish between NULL and MISSING -- PartiQL Spec section 8 Where clause
Quote 4: Equality -- PartiQL Spec section 7.1.1 Equality
I found it different to derive a coherent semantics for handling
MISSING
in context of boolean function from the above quotes.SQL Background
“Truth Value” And "Boolean Value"?
The SQL spec specifies three truth value.
TRUE
,FALSE
, andUNKNOWN
.A boolean value can be comprised of truth value
TRUE
,FALSE
, andUNKNOWN
. TheUNKNOWN
truth value is supported in boolean value asNULL
.The SQL spec does not distinguish between the
NULL
value of boolean data type and theUNKNOWN
truth value.It is also important to notice that there is a distinguish between the
NULl
value of a non-boolean data type and theUNKNOWN
truth value. In fact, the SQL spec defines different ways to assert on whether the value isNULL
(NULL predicate SQL Spec 1999 - Section 8.7) and to assert on whether the value isUNKNOWN
(Boolean Test, SQL-1999 Section 6.30).Example 1:
PartiQL
In this section, we attempted to derive the semantics rules from first principle then cross validate the rules with quotes from the PartiQL spec.
PartiQL’s Truth Value:
To distinguish between the Boolean Values, denote the three as
T
for true,F
for false,U
for unknown.The truth table would look like:
Relationship between Truth Value and PartiQL Value
Since PartiQL is SQL-compatible, we can deduct:
T
is equivalent to a boolean valuetrue
.F
is equivalent to a boolean valuefalse
.U
is equivalent to anull
value of Boolean Type in strict mode.The question to clarify is: the relationship between
MISSING
and truth valueUNKNOWN
in permissive mode.Attempt 1:
Assumption:
MISSING
value, when convert to a truth value, becomes aMISSING
.UNKNOWN
truth value, when converting to PartiQL Value, becomes NULL of boolean type.AND
,OR
,NOT
,IS UNKNOWN
as logical operators. Logical operators takes truth value as operands.MISSING
, it means the function returnsUNKNOWN
, when any of its operands isMISSING
. This is because we need to convertMISSING
to a truth value.IS NULL
is a predicate function that propagatesMISSING
, and it returnsTRUE
if and only if the operand is NULL value of any type.IS MISSING
is a predicate function that returnsTRUE
if and only if the operand isMISSING
, otherwise it returnsFALSE
.=
operator is a predicate function that propagateMISSING
. If the operands are not of comparable types, then it returnsFALSE
.>
operator is a predicate function that propagateMISSING
. If the operands are not of comparable types, then it results in anData Type Mismatch Error
in strict mode andMISSING
which get turned toUNKNOWN
in permissive mode.NULL
.NULL
UNKNOWN IS UNKNOWN
which return trueIS NULL
is a predicate function that propagate MISSING, as predicate function returns a truth value, theMISSING
turns into anUNKNOWN
truth value. When present as a PartiQL value, the result isNULL
.>
operator are not of comparable types, hence the expression returns MISSING which get turned to aUNKNOWN
truth value because the>
operator is a predicate function.MISSING
, hence the expression returns MISSING which get turned to aUNKNOWN
truth value because the=
operator is a predicate function.Data Type Mismatch
error and return MISSING, which get turned to aUNKNOWN
truth value because logical operator returns truth value.From the above:
NULL IS NULL
andMISSING IS NULL
behaves identically.5 + 'a'
should returnMISSING
.NOT {'a' : 1}
should returnMISSING
.Attempt 2:
Seeming the issues above indicates that the data type mismatch issues take precedence then logic value conversion. We can change the semantics rules as following:
MISSING
value, when convert to a truth value, becomes aUNKNOWN
.UNKNOWN
truth value, when converting to PartiQL Value, becomes NULL of boolean type.AND
,OR
,NOT
,IS UNKNOWN
as logical operators. Logical operators takes truth value as operands. If an operand can not be converted to a truth value, it returnsMISSING
in permissive mode.MISSING
, it means the function returnsMISSING
, when any of its operands isMISSING
. We DO NOT convert the missing value toUNKNOWN
truth value in this case.IS NULL
is a predicate function that propagatesMISSING
, and it returnsTRUE
if and only if the operand is NULL value of any type.IS MISSING
is a predicate function that returnsTRUE
if and only if the operand isMISSING
, otherwise it returnsFALSE
.=
operator is a predicate function that propagateMISSING
. If the operands are not of comparable types, then it returnsFALSE
.>
operator is a predicate function that propagateMISSING
. If the operands are not of comparable types, then it results in anData Type Mismatch Error
in strict mode andMISSING
in permissive mode.Examples:
MISSING
operand turns into anUNKNOWN
truth value, the expression returns anUNKNOWN
truth value, when present as a PartiQL Value, the result isNULL
.NULL
operand turns into anUNKNOWN
truth value, the expression returns anUNKNOWN
truth value, when present as a PartiQL Value, the result isNULL
MISSING
is used as a truth value, so internally this expression isUNKNOWN IS UNKNOWN
which returnTRUE
IS NULL
is a predicate function that propagateMISSING
. Hence the function returnsMISSING
.FALSE
>
operator are not of comparable types, hence the expression returnsMISSING
>=
operator are not of comparable type, hence the expression returnsMISSING
. Notice if we want to rewrite the expression, it would beCASE WHEN (5 > 'a' IS MISSING) THEN MISSING ELSE 5 > 'a' OR 5 = 'a' END
MISSING
, hence the expression returnsMISSING
.Data Type Mismatch
error and returnMISSING
From the above:
NULL IS NULL
andMISSING IS NULL
behaves identically.Open Question
NULL IS NULL
andMISSING IS NULL
behaves identically, what is the rational behind this statement?MISSING = MISSING
returns?MISSING
value to anUNKNOWN
truth value.- Approach 1 suggests that
MISSING
value can be converted toUNKNOWN
truth value, for both cases when the result of an function should be a truth value (predicate function), and when the operand of an function should be a truth value (logical function).- Approach 2 suggested that
MISSING
value can be converted toUNKNOWN
truth value only when operand of an function should be a truth value.- Although this distinguish is acceptable, but in situation where a predicate and a logical operator is chained, an end user would need to be very careful with the semantics of the query.
-
logical (predicate)
-> NEVER MISSING-
predicate(logical)
-> Can be MISSING.- Maybe it is worth to consider adapting the four value logic to better distinguish the semantic between logical value and
NULL
/MISSING
.Beta Was this translation helpful? Give feedback.
All reactions