Add IGNORECASE ability to Token Remapping. #67

manux81 · 2021-04-10T13:55:42Z

With reference to:
#62
I've found a problem with tokens remapping in the eventuality of:
reflags = re.IGNORECASE
This change allows creating special cases ignoring capitalization in the eventuality of
reflags = re.IGNORECASE is set in Lexer class.

Example:
class MyLexer(sly.Lexer):
reflags = re.IGNORECASE
# Base ID rule
ID = r'[a-zA-Z_][a-zA-Z0-9_]*'

# Special cases
ID['if'] = IF
ID['else'] = ELSE
ID['while'] = WHILE
# Now if,If,IF,iF are handle as the same special case

With reference to: dabeaz#62 I've found a problem with tokens remapping in the eventuality of: reflags = re.IGNORECASE This change allows creating special cases ignoring capitalization in the eventuality of reflags = re.IGNORECASE assignment in Lexer class. Example: class MyLexer(sly.Lexer): reflags = re.IGNORECASE # Base ID rule ID = r'[a-zA-Z_][a-zA-Z0-9_]*' # Special cases ID['if'] = IF ID['else'] = ELSE ID['while'] = WHILE # Now if,If,IF,iF are handle as the same special cases

jpsnyder · 2021-09-17T12:45:04Z

I'm running into this issue too. Any update on getting this merged in?

manux81 · 2021-09-26T19:50:26Z

Hi @jpsnyder, no @dabeaz still has not been reply to me.

dabeaz · 2021-09-26T21:30:53Z

Apologies on the delay. I do NOT see this change making it into SLY because of its special purpose nature and potential impact on performance. However, it is possible to achieve the desired effect using a function:

keywords = { 'if', 'else', 'while', }

@(_r'[a-zA-Z_][a-zA-Z0-9_]*')
def ID(self, t):
    if t.value.lower() in keywords:
        t.type = t.value.upper()
    return t

dabeaz · 2021-09-26T21:34:43Z

I would just note, that I'm going to keep this open for now. Even though I'm leaning towards not doing this, I might reconsider. I just don't have any timeline for it. In the meantime, use the function as a workaround.

jpsnyder · 2021-09-27T12:53:16Z

Fair enough. That workaround works for me. (And actually is better because I can customize the generated token to my liking.)

My only argument for accepting the MR, would be that it would seem to be expected behavior for token remapping to be case-insensitive if the user set that re flag to IGNORECASE. Principle of least surprises.

manux81 · 2021-09-29T02:19:11Z

I agree with @jpsnyder and also using reflags way is sound like flex approch.

jpsnyder · 2021-09-30T12:37:46Z

While I'm not sure on the performance hit for something like this, but would it make sense for the string value within the brackets for token remapping actually be a regex pattern like everything else? In which case, the regex pattern would be applied to the matches of the main token being remapped. (Make use of the .fullmatch() function)

This would solve the case sensitivity issue and provide more flexibility while still technically supporting the old way.
It could also allow us to apply different re flags to the token mapping than what is set globally.

ID["if"] = IF
ID["else"] = ELSE
ID["(?i)while"] = WHILE   # case-insenstive mapping
ID["def[0-9]"] = DEF   # def followed by a number

On the other hand, this is probably feature creep...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add IGNORECASE ability to Token Remapping. #67

Add IGNORECASE ability to Token Remapping. #67

manux81 commented Apr 10, 2021 •

edited

Loading

Uh oh!

jpsnyder commented Sep 17, 2021

Uh oh!

manux81 commented Sep 26, 2021

Uh oh!

dabeaz commented Sep 26, 2021

Uh oh!

dabeaz commented Sep 26, 2021

Uh oh!

jpsnyder commented Sep 27, 2021 •

edited

Loading

Uh oh!

manux81 commented Sep 29, 2021

Uh oh!

jpsnyder commented Sep 30, 2021 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Add IGNORECASE ability to Token Remapping. #67

Are you sure you want to change the base?

Add IGNORECASE ability to Token Remapping. #67

Conversation

manux81 commented Apr 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jpsnyder commented Sep 17, 2021

Uh oh!

manux81 commented Sep 26, 2021

Uh oh!

dabeaz commented Sep 26, 2021

Uh oh!

dabeaz commented Sep 26, 2021

Uh oh!

jpsnyder commented Sep 27, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

manux81 commented Sep 29, 2021

Uh oh!

jpsnyder commented Sep 30, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

manux81 commented Apr 10, 2021 •

edited

Loading

jpsnyder commented Sep 27, 2021 •

edited

Loading

jpsnyder commented Sep 30, 2021 •

edited

Loading