Skip to content

MDEV-20022 sql_mode="oracle" does not support TO_NUMBER() function #4128

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Olernov
Copy link
Contributor

@Olernov Olernov commented Jun 19, 2025

  • The Jira issue number for this PR is: MDEV-20022

Description

This patch implements an Oracle style function to_number() with the following signatures:

  • to_number(number_or_string_subject)
  • to_number(string_subject, string_format)

The function is implemented in sql/item_numconvfunc.cc.

The function returns the DOUBLE data type for
all signatures and input data types.

The format parser understands the following components:

  • Digits: 0, 9
  • Hex digits: X
  • Group separators: comma (,) and G
  • Decimal delimiers: period (.) and D
  • Approximate number signature: EEEE
  • Currency/numeric flags: $ and B
  • Currency signatures: C, L, U
  • Sign signatures: S, MI, PR
  • Special format signatures: V, TM, TM9, TME
  • Format flag: FM

Note, the parser was implemented assuming that we'll also have the oppostite function to_char() soon for numeric input. So it parser all known components.
However, the function to_number() does not support:

  • Formats V, TM, TM9, TME. to_number() returns NULL if the format string has these components. These componens are supported only by to_char() in Oracle.

Features not inclided into this patch:

  • The ON CONVERSION ERROR clause
  • The third parameter (nlsparam)
  • Internationalized components: G, D, C, L, U. These features will be implemented later, under terms of MDEV-36978.

Notable changes in the other files:

  • item_func.h: adding Item_handled_func::Handler_double

  • simple_parser.h: adding a number of CONTAINER templates They help to save on duplicate code when creating classes suitable for passing into parsing templates such as OPT, OR2C, OR3C, etc

  • simple_parser.h: Adding parsing templates OR4C and OR5C

  • simple_parser.h: Moving the template "OPT" towars the beginning of the file Rule parsing templates TOKEN, TokenChoice, AND2, OR2C, OR3C, OR4C, OR5C, LIST now provide a sub-class Opt, to parse its optional rule.

Release Notes

TODO: What should the release notes say about this change?
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.

How can this PR be tested?

Tests are added to the suite.

If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Contributor Author

@Olernov Olernov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial part of review comments. They are mostly cosmetic and intended to simplify the code and improve the readability

@@ -56,12 +56,283 @@
storage, like Extended_string_tokenizer.
*/

/*
A future change proposal:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parser looks like a library, and I believe it should be documented more fully. What steps should a developer take to implement their own new parser based on the simple parser? Currently it's not clear.

It's worth mentioning that the policy-based design is employed, otherwise the code looks very complicated for those not familiar with such an approach

*/

template<class PARSER, class AParent, class A>
class CONTAINER1: public A
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class is named "Container" but what does it contain? If it doesn't contain anything, isn't there a better and more specific name for it? If its primary goal is "to derive the storage from another container" (as a comment says) maybe it should be named StorageDeriver or something?

Containers to collect parsed data into the goal structures

CONTAINER1 is needed to derive the storage from another container,
but to make a new distinct type.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the use cases when one would need such a derivation? How should they employ this class, can you please post an example?

GRAMMAR: fraction_pDV: fraction_pDV_signature [ fraction_body ]

GRAMMAR: fraction_pDVCLU: positional_currency [ fraction_body ]
GRAMMAR: | fraction_pDV [ postfix_currency_signature ]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding some examples here and at other places, it's very hard to imagine how such a grammar is used in format strings

@DaveGosselin-MariaDB DaveGosselin-MariaDB self-requested a review July 17, 2025 13:34
@abarkov abarkov force-pushed the bb-12.0-bar-MDEV-20022 branch 2 times, most recently from 045e38a to 6e753a9 Compare July 18, 2025 12:05
Copy link
Member

@DaveGosselin-MariaDB DaveGosselin-MariaDB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. Please mention somewhere (maybe in simple_parser.h?) that the design pattern employed here is "Policy Based Design" and is described by https://en.wikipedia.org/wiki/Modern_C%2B%2B_Design .

@abarkov abarkov force-pushed the bb-12.0-bar-MDEV-20022 branch 8 times, most recently from 4976593 to 5fea1c6 Compare July 23, 2025 07:38
This patch implements an Oracle style function to_number()
with the following signatures:

- to_number(number_or_string_subject)
- to_number(string_subject, string_format)

The function is implemented in sql/item_numconvfunc.cc.

The function returns the DOUBLE data type for
all signatures and input data types.

The format parser understands the following components:

- Digits: 0, 9
- Hex digits: X
- Group separators: comma (,) and G
- Decimal delimiers: period (.) and D
- Approximate number signature: EEEE
- Currency/numeric flags: $ and B
- Currency signatures: C, L, U
- Sign signatures: S, MI, PR
- Special format signatures: V, TM, TM9, TME
- Format flag: FM

Note, the parser was implemented assuming that we'll also have the
oppostite function to_char() soon for numeric input.
So it parser all known components.
However, the function to_number() does not support:
- Formats V, TM, TM9, TME. to_number() returns NULL if the format
  string has these components.
  These componens are supported only by to_char() in Oracle.

Features not inclided into this patch:
- The ON CONVERSION ERROR clause
- The third parameter (nlsparam)
- Internationalized components: G, D, C, L, U.
These features will be implemented later, under terms of MDEV-36978.

Notable changes in the other files:

- item_func.h: adding Item_handled_func::Handler_double

- simple_parser.h: adding a number of *CONTAINER* templates
  They help to save on duplicate code when creating classes
  suitable for passing into parsing templates such as OPT, OR2C, OR3C, etc

- simple_parser.h: Adding parsing templates OR4C and OR5C

- simple_parser.h: Moving the template "OPT" towars the beginning of the file
  Rule parsing templates TOKEN, TokenChoice, AND2, OR2C, OR3C, OR4C, OR5C, LIST
  now provide a sub-class Opt, to parse its optional rule.
@abarkov abarkov force-pushed the bb-12.0-bar-MDEV-20022 branch from 5fea1c6 to 10931c6 Compare July 23, 2025 08:07
@abarkov abarkov closed this Jul 25, 2025
@abarkov
Copy link
Contributor

abarkov commented Jul 25, 2025

Closed the PR to avoid buildbot building the same push two times:

  • one time because it's a bb-* branch
  • another time because it's a PR

New comments will still go into this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants