@simonstey
Contributor

@simonstey simonstey commented Nov 4, 2025

This is a revision of the now closed PR here: #526 (comment)

Closes #484


This pull request addresses Issue #484, which concerns inconsistent naming of sequence operations in SHACL Node Expressions. The changes align the vocabulary and documentation with the sequence-based nature of node expression processing, and keep backward compatibility through deprecation notices.

Problem Statement

SHACL Node Expressions are fundamentally sequence-based but were using set-style operation names, creating confusion:

  • Operations like union and minus suggested set semantics despite working on ordered sequences
  • The generic path property conflicted conceptually with constraint sh:path
  • Missing advanced sequence operations limited processing capabilities

Solution

1. Vocabulary Renaming for Sequence Semantics

Renamed Operations:

Deprecation Strategy:

  • Updated shacl.ttl with deprecation notices for sh:union and sh:minus
  • Maintained backward compatibility while guiding migration to new terms

2. Advanced Sequence Operations

New Operations Added:

  • shnex:flatMap - Applies expression to each input node and flattens results
  • shnex:findFirst - Returns first node conforming to a given shape
  • shnex:matchAll - Returns true if all nodes conform to a given shape
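
As a rough illustration of the intended findFirst and matchAll behaviour, here is a minimal Python sketch over plain lists. It is not the spec's algorithm: the function names are hypothetical, a shape is modelled as a plain predicate, and two points are assumptions that the description above does not state: an empty result when nothing conforms, and a single false (rather than no output) when some node does not conform.

```python
from typing import Callable, Hashable, List, Sequence

Node = Hashable  # stand-in for an RDF node (assumption)


def find_first(nodes: Sequence[Node], conforms: Callable[[Node], bool]) -> List[Node]:
    """Return a one-element list with the first conforming node, else []."""
    for node in nodes:
        if conforms(node):
            return [node]
    return []  # assumption: no match yields an empty sequence


def match_all(nodes: Sequence[Node], conforms: Callable[[Node], bool]) -> List[bool]:
    """Return [True] if every node conforms, else [False] (assumption)."""
    return [all(conforms(node) for node in nodes)]


# Hypothetical data: years of service per employee.
years = {"ex:Ann": 2, "ex:Bob": 12, "ex:Cy": 15}
is_senior = lambda n: years[n] >= 10

print(find_first(list(years), is_senior))  # ['ex:Bob']
print(match_all(list(years), is_senior))   # [False]
```

Because findFirst depends on which node comes first, it is only well defined if the input sequence has a defined order.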

Files Modified

Vocabulary Files

  • shacl12-vocabularies/shnex.ttl

    • Added complete RDF definitions for FlatMap, FindFirst, and MatchAll expressions
    • Updated existing definitions with sequence-appropriate naming
    • Enhanced property comments for clarity
  • shacl12-vocabularies/shacl.ttl

    • Added deprecation notices for sh:union and sh:minus
    • Clear migration guidance to new sequence-based terms

Documentation

  • shacl12-node-expr/index.html
    • Renamed sections: UnionExpression → JoinExpression, MinusExpression → RemoveExpression

New Advanced Operations:

# Find first senior employee
sh:values [
    shnex:findFirst [
        shnex:nodes [ shnex:pathValues ex:employee ] ;
        shnex:findFirst ex:SeniorEmployeeShape ;
    ] ;
] .

# Check if all employees are active
sh:values [
    shnex:matchAll [
        shnex:nodes [ shnex:pathValues ex:employee ] ;
        shnex:matchAll ex:ActiveEmployeeShape ;
    ] ;
] .

@afs
Contributor

afs commented Nov 5, 2025

join has a very strong meaning for databases. concat for lists maybe. Or "combine".

"concatenation" is used in the definition.

Various text still uses "union":

"The node expressions that shall be unioned."

and

EVALUATION OF UNION EXPRESSIONS

and

"Note that a union expression may produce "

Example 10 has sh:union.

may produce duplicate output nodes if the individual output nodes overlap.

The text on duplicates is unclear - it seems to say that duplicates may occur, but also that a system that eliminates them is correct.

Is it order preserving?

Also - shnex:remove - is it order preserving?

The nodes that shall be removed from the shnex:nodes.

That could be read as one-for-one removal or as removing a contained list.

Removing "1 1" from the list "1 1 1 2 2" could give "1 2 2", "1 1 2 2", or "1 2 1" under different readings.
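
To make two of these readings concrete, here is a small Python sketch (function names are hypothetical; plain lists stand in for node sequences). It contrasts "remove every occurrence" with "remove one occurrence per listed node"; both preserve the order of the remaining nodes. It does not claim which reading the spec text intends.

```python
from collections import Counter


def remove_all(nodes, to_remove):
    """Drop every occurrence of any node listed in to_remove."""
    banned = set(to_remove)
    return [n for n in nodes if n not in banned]


def remove_one_for_one(nodes, to_remove):
    """Drop one occurrence (the leftmost remaining) per node in to_remove."""
    budget = Counter(to_remove)
    out = []
    for n in nodes:
        if budget[n] > 0:
            budget[n] -= 1  # consume one pending removal
        else:
            out.append(n)
    return out


print(remove_all([1, 1, 1, 2, 2], [1, 1]))          # [2, 2]
print(remove_one_for_one([1, 1, 1, 2, 2], [1, 1]))  # [1, 2, 2]
```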

@afs
Contributor

afs commented Nov 5, 2025

Suggestion: make the names "list*" (including similar operations). This gives more freedom to what the specific operation name is:

shnex:listJoin, shnex:listConcat, shnex:listUnion,

shnex:listRemoveAll, shnex:listRemoveElements

The handling of order preserving needs to be made clear - or text to say it does not matter (i.e. it's a multiset/bag).

This is in other places as well -- shnex:intersection

(what is the intersection of "1 1 1 2 2 1" and "1 1"? the text can be read several ways including as "subsection" or keep original (LHS) cardinality)
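
The ambiguity can be shown with a short Python sketch of two possible readings (hypothetical function names; lists stand in for node sequences). One keeps every left-hand node that also occurs on the right, so the left-hand cardinality and order survive; the other treats both sides as bags and keeps each node at most as often as it occurs on the right.

```python
from collections import Counter


def intersect_keep_left(left, right):
    """Keep each left node that occurs anywhere in right (LHS cardinality)."""
    members = set(right)
    return [n for n in left if n in members]


def intersect_multiset(left, right):
    """Bag intersection: keep a node at most as often as right contains it."""
    budget = Counter(right)
    out = []
    for n in left:
        if budget[n] > 0:
            budget[n] -= 1
            out.append(n)
    return out


print(intersect_keep_left([1, 1, 1, 2, 2, 1], [1, 1]))  # [1, 1, 1, 1]
print(intersect_multiset([1, 1, 1, 2, 2, 1], [1, 1]))   # [1, 1]
```

The two readings agree whenever neither input contains duplicates, which is why the difference is easy to miss in simple examples.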

Co-authored-by: Ted Thibodeau Jr <[email protected]>
@simonstey
Contributor Author

Here's a summary of how I tried to address @afs's comments:

1. “Join” → “Concat” terminology and semantics

In shacl12-node-expr/index.html:

  • The example for distinct expressions was updated so that the derived property ex:superClassesIncludingRoot is now described as:

    • computed using a concat expression instead of a join expression:
      • Link text: join → concat
      • Anchor: #JoinExpression → #ConcatExpression
    • Turtle example: shnex:join ( → shnex:concat (.
  • The former “Join Expressions” section was replaced with a new “Concat Expressions” section:

    • New terminology:
      • join expression → concat expression
      • shnex:JoinExpression → shnex:ConcatExpression
      • property shnex:join → shnex:concat.
    • Semantics:
      • Defined evaluation rule for concat expressions:
        • Let members be the members of the shnex:concat value.
        • Output nodes are the concatenation of the output nodes of each member expression NE via evalExpr(NE, focusGraph, focusNode, scope).
        • Order is preserved, left‑to‑right, keeping the order within each list.
      • Note that concat may produce duplicates, and shnex:distinct can be used to remove them.
    • New example ex:allRelatives:
      • Declares ex:allRelatives combining values of ex:parent and ex:sibling via shnex:concat.
      • Includes a detailed evaluation trace showing how the two lists are concatenated into one sequence and an abstract numeric example ([1,2] + [3,4,5] → [1,2,3,4,5]).
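
The concat rule above can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, each member expression is represented by its already-evaluated output list, and distinct is assumed to keep the first occurrence of each node.

```python
from itertools import chain


def concat(*member_outputs):
    """Concatenate member outputs left to right, keeping each list's order."""
    return list(chain.from_iterable(member_outputs))


def distinct(nodes):
    """Drop later duplicates, keeping first occurrences (assumption)."""
    return list(dict.fromkeys(nodes))


print(concat([1, 2], [3, 4, 5]))  # [1, 2, 3, 4, 5]

# Hypothetical relatives: ex:Bob is both a parent and a sibling value.
relatives = concat(["ex:Alice", "ex:Bob"], ["ex:Bob", "ex:Carol"])
print(relatives)            # ['ex:Alice', 'ex:Bob', 'ex:Bob', 'ex:Carol']
print(distinct(relatives))  # ['ex:Alice', 'ex:Bob', 'ex:Carol']
```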

2. Expanded examples and evaluation traces

Intersection Expressions section:

  • Added an evaluation trace example for an intersection expression (ex:Australian, ex:German) with:
    • Concrete lists of persons from each expression.
    • The resulting intersection.
    • An abstract numeric example ([1,2,3,4] ∩ [2,4,5] → [2,4]).

Remove Expressions section:

  • Clarified evaluation terminology:
    • Variable name toRemove → remove:
      • “Let remove be the value of shnex:remove…”
      • evalExpr(toRemove, …) → evalExpr(remove, …).
  • Added a full illustrative example for shnex:remove:
    • Derived property ex:availableAuthors:
      • shnex:nodes [ shnex:pathValues ex:author ]
      • shnex:remove [ shnex:pathValues ex:authorOnLeave ]
    • Example data and evaluation trace:
      • Shows authors list (with duplicates), authors on leave, and the result after removing all occurrences of the authors on leave.
      • Includes an abstract numeric example removing [1,1] from [1,1,1,2,2] → [2,2], emphasizing that all instances of the removed nodes are eliminated.

3. FlatMap Expressions: clarified semantics and richer example

The Advanced Sequence Operations / FlatMap Expressions section was significantly reworked:

  • Structural and wording cleanup, but also semantic clarifications:

    • Still defines shnex:FlatMapExpression with parameters shnex:flatMap and shnex:nodes.
    • Evaluation now more explicitly described:
      • Let flatMap be the value of shnex:flatMap.
      • Let nodes be the value of shnex:nodes (or the focus node if omitted).
      • Let N be the output nodes of evalExpr(nodes, focusGraph, focusNode, scope).
      • For each node n in N, let Mₙ be the output nodes of evalExpr(flatMap, focusGraph, n, scope).
      • The flatMap output is the concatenation of all Mₙ in the order of n in N.
    • Additional explanatory paragraphs:
      • Emphasize that the focus node changes for each iteration to n so relative paths work correctly.
      • Emphasize that duplicates are preserved and shnex:distinct can be used afterward if needed.
  • The ex:allSkills example was expanded:

    • Highlighted key portions of Turtle with <b>…</b> markup.
    • Added a data graph describing employees and their skills.
    • Added a detailed evaluation trace:
      • Shows shnex:nodes [shnex:pathValues ex:employee] producing [ex:Employee1, ex:Employee2, ex:Employee3].
      • Applies shnex:flatMap [shnex:pathValues ex:skill] to each employee to produce individual skill lists.
      • Concatenates all the skill lists into a single sequence.
      • Mentions optional follow‑up operations such as shnex:distinct, shnex:filterShape, shnex:limit.
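
The flatMap evaluation steps listed above can be sketched in Python like this. It is a simplification under assumptions: flat_map is a hypothetical name, the inner expression is modelled as a function of the current focus node, and the skill data is invented for illustration (only the employee identifiers come from the trace).

```python
def flat_map(nodes, expr):
    """Evaluate expr with each node n as focus node and concatenate the results."""
    out = []
    for n in nodes:          # order of N is preserved
        out.extend(expr(n))  # M_n is appended in its own order, duplicates kept
    return out


# Hypothetical data graph: employee -> values of ex:skill.
SKILLS = {
    "ex:Employee1": ["ex:Java", "ex:SPARQL"],
    "ex:Employee2": ["ex:SPARQL"],
    "ex:Employee3": [],
}

employees = ["ex:Employee1", "ex:Employee2", "ex:Employee3"]
all_skills = flat_map(employees, lambda focus: SKILLS[focus])
print(all_skills)  # ['ex:Java', 'ex:SPARQL', 'ex:SPARQL']
```

Passing the focus node into the inner expression is what lets a relative path such as ex:skill resolve against each employee in turn.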

4. Minor formatting tweaks

  • Some examples now wrap sh:values [...] with <b>…</b> to highlight the expression:
    • FindFirst example: sh:values [ ... ] → sh:values <b>[ ... ]</b>.
    • MatchAll example: sh:values [ ... ] → sh:values <b>[ ... ]</b>.
  • Removed placeholder JSON-LD blocks in the FindFirst and MatchAll examples (<div class="jsonld"><equivalent jsonld>…) to streamline the HTML.

5. Vocabulary updates for concat vs join

In shacl12-vocabularies/shacl.ttl:

  • Adjusted deprecation comment for sh:union:
    • From: “replaced by shnex:join.”
    • To: “replaced by shnex:concat.”

In shacl12-vocabularies/shnex.ttl:

  • Renamed the Join expression vocabulary to Concat:

    • Class:
      • shnex:JoinExpression → shnex:ConcatExpression
      • Label: “Join expression” → “Concat expression”
      • Parameter: shnex:JoinExpression-join → shnex:ConcatExpression-concat.
    • Property:
      • shnex:join → shnex:concat
      • Label: “join” → “concat”
      • Comment updated:
        • From “In Join Expressions, … joined.”
        • To “In Concat Expressions, … concatenated.”
      • Domain changed:
        • shnex:JoinExpression → shnex:ConcatExpression.

@simonstey simonstey requested a review from afs November 19, 2025 05:13
Contributor

@HolgerKnublauch HolgerKnublauch left a comment

I think this is a good starting point. It will be easier to drill into details once this is merged into main and we have individual sub-issues compared to this mega thread. I have not myself looked into all details but will do so once I do the implementation and test cases.

<p class="syntax">
<span data-syntax-rule="FlatMapExpression-syntax">
A <a>blank node</a> that is the <a>subject</a> of the following properties
is called a <dfn>flatMap expression</dfn>,
Contributor

Suggested change
is called a <dfn>flatMap expression</dfn>,
is called a <dfn>flat map expression</dfn>,

Contributor

Elsewhere we used names like "path values expression" instead of pathValues expression, so we should decide on one or the other, but not mix them.

Contributor

Contributor

Whatever makes us agree to merge this in is OK for me. Naming is secondary. It is easy to get lost in details.

<div class="def-header">EVALUATION OF FLATMAP EXPRESSIONS</div>
<p>
Let <code>flatMap</code> be the <a>value</a> of <code>shnex:flatMap</code>
and <code>nodes</code> be the <a>value</a> of <code>shnex:nodes</code> in a <a>flatMap expression</a>.
Contributor

Suggested change
and <code>nodes</code> be the <a>value</a> of <code>shnex:nodes</code> in a <a>flatMap expression</a>.
and <code>nodes</code> be the <a>value</a> of <code>shnex:nodes</code> in a <a>flat map expression</a>.

<code>evalExpr(flatMap, focusGraph, <var>n</var>, scope)</code>.
</p>
<p>
The <a>output nodes</a> of the <a>flatMap expression</a> are produced by concatenating all sequences
Contributor

Suggested change
The <a>output nodes</a> of the <a>flatMap expression</a> are produced by concatenating all sequences
The <a>output nodes</a> of the <a>flat map expression</a> are produced by concatenating all sequences

<p class="syntax">
<span data-syntax-rule="FindFirstExpression-syntax">
A <a>blank node</a> that is the <a>subject</a> of the following properties
is called a <dfn>findFirst expression</dfn>,
Contributor

Suggested change
is called a <dfn>findFirst expression</dfn>,
is called a <dfn>find first expression</dfn>,

<div class="def-header">EVALUATION OF FINDFIRST EXPRESSIONS</div>
<p>
Let <code>shape</code> be the <a>value</a> of <code>shnex:findFirst</code>
and <code>nodes</code> be the <a>value</a> of <code>shnex:nodes</code> in a <a>findFirst expression</a>.
Contributor

Suggested change
and <code>nodes</code> be the <a>value</a> of <code>shnex:nodes</code> in a <a>findFirst expression</a>.
and <code>nodes</code> be the <a>value</a> of <code>shnex:nodes</code> in a <a>find first expression</a>.

and <code>nodes</code> be the <a>value</a> of <code>shnex:nodes</code> in the <a>matchAll expression</a>.
If <code>shnex:nodes</code> is not specified, let <code>nodes</code> be the focus node.
Let <code>N</code> be the <a>output nodes</a> of <code>evalExpr(nodes, focusGraph, focusNode, scope)</code>.
The <a>output nodes</a> of the <a>matchAll expression</a> contain the boolean <code>true</code> if
Contributor

Suggested change
The <a>output nodes</a> of the <a>matchAll expression</a> contain the boolean <code>true</code> if
The <a>output nodes</a> of the <a>match all expression</a> contain the boolean <code>true</code> if

<p id="MatchAllExpressionExample">
The following example illustrates the use of <code>shnex:matchAll</code> to derive a property
<code>ex:allEmployeesActive</code> that checks whether all employees of a company are currently active.
The matchAll operation tests each employee against a shape that validates their active status.
Contributor

Suggested change
The matchAll operation tests each employee against a shape that validates their active status.
The match all operation tests each employee against a shape that validates their active status.

</span>
</p>
<div class="def" id="FindFirstExpression-evaluation">
<div class="def-header">EVALUATION OF FINDFIRST EXPRESSIONS</div>
Contributor

Suggested change
<div class="def-header">EVALUATION OF FINDFIRST EXPRESSIONS</div>
<div class="def-header">EVALUATION OF FIND FIRST EXPRESSIONS</div>

</span>
</p>
<div class="def" id="FlatMapExpression-evaluation">
<div class="def-header">EVALUATION OF FLATMAP EXPRESSIONS</div>
Contributor

Suggested change
<div class="def-header">EVALUATION OF FLATMAP EXPRESSIONS</div>
<div class="def-header">EVALUATION OF FLAT MAP EXPRESSIONS</div>

Comment on lines +2047 to +2048
The <a>node expression</a> that is applied to each input node.
The <a>node expression</a> that is applied to each input node.
Contributor

Suggested change
The <a>node expression</a> that is applied to each input node.
The <a>node expression</a> that is applied to each input node.
The <a>node expression</a> that is applied to each input node.

Comment on lines +1130 to +1131
shnex:remove <b>( owl:Thing rdfs:Resource )</b> ;
] .</b>
Member

The new </b> in line 1131 has no effect, as it's preceded by the existing </b> in line 1130. I believe the new one should be removed, because the old one follows the closing paren that matches the opening paren which immediately follows the opening <b>.

Suggested change
shnex:remove <b>( owl:Thing rdfs:Resource )</b> ;
] .</b>
shnex:remove <b>( owl:Thing rdfs:Resource )</b> ;
] .

</div>
<p><em>The remainder of this section is informative.</em></p>
<p>
The following example declares a derived property <code>ex:availableAuthors</code>
Member

Suggested change
The following example declares a derived property <code>ex:availableAuthors</code>
The following example declares a derived property, <code>ex:availableAuthors</code>,

Labels

Node Expressions: For SHACL 1.2 Node Expressions

Development

Successfully merging this pull request may close these issues.

Sequence processing

5 participants