Parse string arguments in ToString #574

Draft
wants to merge 3 commits into
base: master

rocky
Member

@rocky rocky commented Oct 12, 2022

This is a little closer to the correct behavior when ToString[] is passed a string argument.

We currently can't handle passing an "encoding" parameter to ToString[], for example:

ToString['1->3', OutputForm, CharacterEncoding->"Unicode"]

The main obstacle here is understanding how to get the desired encoding passed from the direct call to MakeBoxes down to the formatting code where it should be used.

WMA does not explicitly list CharacterEncoding as an option to MakeBoxes. Does it exist but is undocumented? Assuming it does not exist, how would WMA pass this information down?

However it is done, our current code base has no such feature, short of changing $CharacterEncoding, which we should avoid doing.

@rocky rocky force-pushed the ToString-for-string-arg branch from 583e8ee to 0417a44 Compare October 12, 2022 22:48
@rocky rocky marked this pull request as draft October 12, 2022 23:00
@mmatera
Contributor

mmatera commented Oct 12, 2022

> This is a little closer to the correct behavior when ToString[] is passed a string argument.
>
> We currently can't handle passing an "encoding" parameter to ToString[], for example:
>
> ToString['1->3', OutputForm, CharacterEncoding->"Unicode"]
>
> The main obstacle here is understanding how to get the desired encoding passed from the direct call to MakeBoxes down to the formatting code where it should be used.
>
> WMA does not explicitly list CharacterEncoding as an option to MakeBoxes. Does it exist but is undocumented? Assuming it does not exist, how would WMA pass this information down?

No, it does not exist (I tested it). And, as I showed before, MakeBoxes does not use that information to build a Boxed expression. That happens afterwards, when Boxes are converted into text strings.

> However it is done, our current code base has no such feature, short of changing $CharacterEncoding, which we should avoid doing.

Also, this does not happen in WMA; otherwise, it would have come up in my tests. In some way, ToString produces Boxed expressions and then applies the encoding when the boxed expression is translated into a String.
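
To make the two-stage idea above concrete, here is a minimal sketch in plain Python. It is not Mathics or WMA code: `RowBox`, `RENDERINGS`, `make_boxes`, and `render_boxes` are hypothetical names invented for illustration, and the assumption is simply that boxes keep named characters symbolic until they are turned into text.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration; none of these names come from Mathics or WMA.


@dataclass(frozen=True)
class RowBox:
    items: tuple


# Named characters stay symbolic inside boxes; how they render is decided later.
RENDERINGS = {
    "\\[Rule]": {"Unicode": "\u2192", "ASCII": "->"},
}


def make_boxes(lhs: str, rhs: str) -> RowBox:
    """Stage 1: build the boxes. No encoding is consulted here."""
    return RowBox((lhs, "\\[Rule]", rhs))


def render_boxes(box: RowBox, encoding: str = "Unicode") -> str:
    """Stage 2: convert boxes to a string, applying the encoding only now."""
    return "".join(
        RENDERINGS.get(item, {}).get(encoding, item) for item in box.items
    )


boxes = make_boxes("1", "3")
print(render_boxes(boxes, encoding="Unicode"))
print(render_boxes(boxes, encoding="ASCII"))
```

In this arrangement the same boxed value can be rendered under different encodings without re-running the boxing step and without touching any global setting.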

# runs ToString[] on the resulting M-expression.
session = MathicsSession()
try:
    expr = parse(session.definitions, MathicsSingleLineFeeder(expr.value))
Contributor

Do you really want to create a session just to convert a string?

Member Author

@rocky rocky Oct 13, 2022

Yes, I suppose it is better to just use Definitions. I am not sure, though, whether we want add_builtin set to True or False. You tell me.

The thought here was to use a clean slate of definitions, without any modifications made to the built-in functions.

But you tell me: does the ToString behavior for an expression change if built-in functions have been altered, or if other functions have been added that appear in the string passed to ToString?

Member Author

@rocky rocky Oct 13, 2022

> MakeBoxes does not use that information to build a Boxed expression.

I wasn't intending to imply that. Rather, it is ToString that accepts an encoding parameter. In the course of doing its work, ToString calls boxes_to_text(), which triggers MakeBoxes.

Right now MakeBoxes performs formatting of operators before it returns. We need to get the information through to the formatter that MakeBoxes is invoking. In a functional approach, which is what would be preferred here, this would be passed via the function call arguments somehow.
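
As a rough illustration of the functional approach mentioned above, the following toy model threads an encoding through keyword arguments while boxing and operator formatting stay interleaved. All names here (`OPERATOR_FORMS`, `format_operator`, `make_boxes_sketch`, `to_string_sketch`) are hypothetical and do not correspond to existing Mathics functions; it is a sketch of the calling pattern only.

```python
# Hypothetical toy model; these are not Mathics functions.
OPERATOR_FORMS = {"Rule": {"Unicode": "\u2192", "ASCII": "->"}}


def format_operator(name, *, encoding):
    """Formatter invoked from inside boxing; it receives the encoding explicitly."""
    return OPERATOR_FORMS[name][encoding]


def make_boxes_sketch(expr, *, encoding):
    """Box an expression of the form (head, lhs, rhs), formatting the operator on the way."""
    if isinstance(expr, tuple):
        head, lhs, rhs = expr
        return (
            make_boxes_sketch(lhs, encoding=encoding)
            + format_operator(head, encoding=encoding)
            + make_boxes_sketch(rhs, encoding=encoding)
        )
    # Atoms are rendered as-is in this toy model.
    return str(expr)


def to_string_sketch(expr, *, encoding="Unicode"):
    """Entry point: the option travels through call arguments; no global state changes."""
    return make_boxes_sketch(expr, encoding=encoding)


print(to_string_sketch(("Rule", 1, 3), encoding="ASCII"))
```

The cost of this pattern is that every function on the path from the entry point to the formatter has to accept and forward the extra argument, which is part of why it implies a wider change than adjusting a global such as $CharacterEncoding.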

> It happens afterwards (when Boxes are converted into text strings)

Currently in Mathics, Boxing and Formatting can be interleaved. So right now there is no absolute "afterwards".

In Wolfram Alpha, are there ever situations where Boxing and Formatting are interleaved?

Removing this interleaving in Mathics would require a major refactoring of our code. If that needs to be done, I think we need to reduce our goals for the time being and set a more easily-achievable bar to reach.

that apparently is all we need.
@rocky
Member Author

rocky commented Oct 13, 2022

To dig deeper into ToString[] and what's up with encoding, it looks like the encode_tex and encode_mathml functions should be replaced, and the do_format_xxx routines in mathics/core/formatter.py need to be rethought. Sigh.

2 participants