Implement non-ascii variables in Java and Rust (#310)
## What is the goal of this PR?

We update to TypeQL with Unicode support in both value and concept variables. This makes the following valid TypeQL:

```
match $人 isa person, has name "Liu"; get $人;
```

```
match $אדם isa person, has name "Solomon"; get $אדם;
```

We now require that all Labels and Variables be valid Unicode identifiers that do not start with `_`. This change is fully backwards compatible.

We also validate that Type Labels and Variables created using the TypeQL language builders in both Rust and Java conform to our Unicode specification.

## What are the changes implemented in this PR?

- Refactor variable names to allow the same range of characters as Labels, that is, a Unicode identifier excluding a leading `_`
- Add language builder validation for the strings provided to Labels or Variable names
- Update typedb-behaviour
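As a rough illustration of what builder-side validation could look like, the sketch below rejects an invalid name when a variable is constructed. The `Variable` type, its `named` constructor, and `looks_like_identifier` are hypothetical names chosen for this example and are not the builder API changed by this PR. The check itself is a simplified stand-in built on `char::is_alphabetic` and `char::is_alphanumeric`, not the exact Unicode ranges the PR uses.

```rust
// Hypothetical builder-side type; the real TypeQL builder API is not shown here.
#[derive(Debug, PartialEq)]
pub struct Variable {
    name: String,
}

// Simplified stand-in for the identifier rule (an assumption, not the PR's spec):
// the first character must be alphabetic, so a leading `_` or digit is rejected;
// later characters may be alphanumeric, `_`, or `-`.
fn looks_like_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(first) if first.is_alphabetic() => {
            chars.all(|c| c.is_alphanumeric() || c == '_' || c == '-')
        }
        // Empty string, or a first character that is not alphabetic.
        _ => false,
    }
}

impl Variable {
    /// Builds a variable, returning an error message instead of an invalid name.
    pub fn named(name: &str) -> Result<Variable, String> {
        if looks_like_identifier(name) {
            Ok(Variable { name: name.to_string() })
        } else {
            Err(format!("invalid variable name: '{}'", name))
        }
    }

    /// Returns the validated name.
    pub fn name(&self) -> &str {
        &self.name
    }
}
```

With this sketch, `Variable::named("人")` and `Variable::named("אדם")` succeed, while `Variable::named("_x")` returns an error. Validating at construction time means a malformed name is caught where the query is built rather than when the query string is later parsed.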
1 parent 3a523f4, commit 025798c. Showing 21 changed files with 174 additions and 41 deletions.
@@ -0,0 +1,55 @@
/*
 * Copyright (C) 2022 Vaticle
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

use std::sync::OnceLock;

use regex::{Regex, RegexBuilder};

pub fn is_valid_identifier(identifier: &str) -> bool {
    static REGEX: OnceLock<Regex> = OnceLock::new();
    let regex = REGEX.get_or_init(|| {
        let identifier_start = "A-Za-z\
            \\u00C0-\\u00D6\
            \\u00D8-\\u00F6\
            \\u00F8-\\u02FF\
            \\u0370-\\u037D\
            \\u037F-\\u1FFF\
            \\u200C-\\u200D\
            \\u2070-\\u218F\
            \\u2C00-\\u2FEF\
            \\u3001-\\uD7FF\
            \\uF900-\\uFDCF\
            \\uFDF0-\\uFFFD";
        let identifier_tail = format!(
            "{}\
            0-9\
            _\
            \\-\
            \\u00B7\
            \\u0300-\\u036F\
            \\u203F-\\u2040",
            identifier_start
        );
        let identifier_pattern = format!("^[{}][{}]*$", identifier_start, identifier_tail);
        RegexBuilder::new(&identifier_pattern).build().unwrap()
    });
    regex.is_match(identifier)
}
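
To make the character classes above easier to inspect, here is a minimal standard-library sketch of the same rule, written as explicit range tables instead of a regex. The names `is_valid_identifier_std`, `IDENTIFIER_START`, `IDENTIFIER_TAIL_EXTRA`, and `in_ranges` are illustrative and do not appear in the commit. The tables are transcribed from the pattern above, so the function is intended to accept the same strings, though it has not been checked against the commit's own tests.

```rust
// Illustrative, std-only restatement of the regex in `is_valid_identifier`.
// All names here are assumptions made for this sketch, not part of the commit.

/// Inclusive code point ranges allowed for the first character.
const IDENTIFIER_START: &[(char, char)] = &[
    ('A', 'Z'),
    ('a', 'z'),
    ('\u{00C0}', '\u{00D6}'),
    ('\u{00D8}', '\u{00F6}'),
    ('\u{00F8}', '\u{02FF}'),
    ('\u{0370}', '\u{037D}'),
    ('\u{037F}', '\u{1FFF}'),
    ('\u{200C}', '\u{200D}'),
    ('\u{2070}', '\u{218F}'),
    ('\u{2C00}', '\u{2FEF}'),
    ('\u{3001}', '\u{D7FF}'),
    ('\u{F900}', '\u{FDCF}'),
    ('\u{FDF0}', '\u{FFFD}'),
];

/// Extra ranges allowed after the first character, in addition to the start ranges.
const IDENTIFIER_TAIL_EXTRA: &[(char, char)] = &[
    ('0', '9'),
    ('_', '_'),
    ('-', '-'),
    ('\u{00B7}', '\u{00B7}'),
    ('\u{0300}', '\u{036F}'),
    ('\u{203F}', '\u{2040}'),
];

/// Returns true if `c` falls inside any of the inclusive ranges.
fn in_ranges(c: char, ranges: &[(char, char)]) -> bool {
    ranges.iter().any(|&(lo, hi)| lo <= c && c <= hi)
}

/// Mirrors `^[start][tail]*$`: one start character, then zero or more tail characters.
pub fn is_valid_identifier_std(identifier: &str) -> bool {
    let mut chars = identifier.chars();
    match chars.next() {
        Some(first) if in_ranges(first, IDENTIFIER_START) => {
            chars.all(|c| in_ranges(c, IDENTIFIER_START) || in_ranges(c, IDENTIFIER_TAIL_EXTRA))
        }
        // Empty input, or a first character outside the start ranges (such as `_` or a digit).
        _ => false,
    }
}
```

Under these tables, `人` (U+4EBA) and `אדם` (U+05D0, U+05D3, U+05DD) pass because they fall in the `3001-D7FF` and `037F-1FFF` start ranges, while `_hidden` and `1st` fail because `_` and digits appear only in the tail set.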