-
Notifications
You must be signed in to change notification settings - Fork 142
SNOW-2443512: Add support for scalar string and binary functions ( part 2 ) #3946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
SNOW-2443512: Add support for scalar string and binary functions ( part 2 ) #3946
Conversation
…ingAndBinaryPart2
…ingAndBinaryPart2 # Conflicts: # CHANGELOG.md
sfc-gh-mvegaalvarez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
…ingAndBinaryPart2
| if position is None: | ||
| position = lit(1) | ||
| if occurrence is None: | ||
| occurrence = lit(1) | ||
| if option is None: | ||
| option = lit(0) | ||
| if regexp_parameters is None: | ||
| regexp_parameters = lit("c") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two questions here:
-
do we need to check the dependency between the optional args? what happen if
occurrenceis set butpositionis not set. I feel this is not allowed in sql as in sql the arguments are positional, users have to specifypositionfirst thenoccurrencethen possibly the rest. -
I see there is a long note about
Note By default, REGEXP_INSTR returns the begin or end character offset for the entire matching part of the subject. However, if the e (for “extract”) parameter is specified, REGEXP_INSTR returns the begin or end character offset for the part of the subject that matches the first sub-expression in the pattern. If e is specified but a group_num is not also specified, then the group_num defaults to 1 (the first group). If there is no sub-expression in the pattern, REGEXP_INSTR behaves as if e was not set. For examples that use e, see [Examples](https://docs.snowflake.com/en/sql-reference/functions/regexp_instr#examples) in this topic.
shall we not set the default value from client at all and let server decide the default rule? I feel this way it's simpler
| subject: ColumnOrName, | ||
| pattern: ColumnOrName, | ||
| position: ColumnOrName = None, | ||
| occurrence: ColumnOrName = None, | ||
| regex_parameters: ColumnOrName = None, | ||
| group_num: ColumnOrName = None, | ||
| _emit_ast: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
|
|
||
|
|
||
| @publicapi | ||
| def regexp_substr_all( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for this func
…yPart2 # Conflicts: # CHANGELOG.md
…yPart2 # Conflicts: # CHANGELOG.md # docs/source/snowpark/functions.rst # src/snowflake/snowpark/_functions/scalar_functions.py
…yPart2 # Conflicts: # CHANGELOG.md # src/snowflake/snowpark/_functions/scalar_functions.py
…ingAndBinaryPart2
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-2443512
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Add support for the scalar string and binary functions:
hex_decode_stringjarowinkler_similarityparse_urlregexp_instrregexp_likeregexp_substrregexp_substr_allrtrimmed_lengthspacesplit_part