Discussion about parser performance. #2554

Hi,

Currently, the parser is Brython's biggest bottleneck.
Indeed, the `py2ast` process is responsible for ~74% of the total execution time.

I took a quick look at it and noticed several things:
1. `$.Parser` makes an inefficient copy of `tokens` (1,671 tokens in my tests, 150k+ on bigger files).
    Do we really need this copy? If so, maybe using `.filter()` or:
    ```js
    const tokens = new Array(_tokens.length); // preallocate to the maximum possible size
    let offset = 0;
    // add a token:
    tokens[offset++] = _tokens[i];
    // ....
    tokens.length = offset; // truncate to the number of tokens actually kept
    ```
    would be more efficient?
2. I noticed a `tokens.splice()`. If it is called several times, it can be quite troublesome, since each call shifts every element after the removed one... do we really need it?
    One way would be to mark the token, e.g. `token.TO_REMOVE = true`, and then remove them all at once when we are done.
3. A token has a lot of fields (9), some of which seem redundant:
    - 4x position: could the start position be deduced from the previous token?
    - `line`: why do we need to store it when we have the `lineno` property?
    - `type`/`num_type`: do we really need to store `type`? I have the feeling we can deduce it from `num_type`.
    - `string`/`bytes`: I didn't notice a case where they have different values.
4. A lot of:
    ```js
    var EXTRA = {};
    EXTRA.lineno = token.lineno;
    EXTRA.?          = token.?;
    ```
    I think it would help the JS engine to do:
    ```js
    const EXTRA = {
        lineno: token.lineno,
        ?          : token.?
    }
    ```
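
To illustrate point 2, here is a rough sketch of the "mark, then remove all at once" idea. The `removeMarkedTokens` helper and the sample tokens are made up for the example (they are not part of Brython); only the `TO_REMOVE` flag comes from the suggestion above. It compacts the array in place in a single pass instead of calling `splice()` once per removed token:

```javascript
// Hypothetical helper (not part of Brython): removes every token flagged
// with TO_REMOVE in one pass, compacting the array in place.
function removeMarkedTokens(tokens) {
    let offset = 0;
    for (let i = 0; i < tokens.length; i++) {
        const token = tokens[i];
        if (!token.TO_REMOVE) {
            // Kept token: move it down over the slots freed by removed ones.
            tokens[offset++] = token;
        }
    }
    // Drop the leftover tail.
    tokens.length = offset;
    return tokens;
}

// Usage with made-up tokens:
const tokens = [
    {string: "a"},
    {string: "\n", TO_REMOVE: true},
    {string: "b"},
];
removeMarkedTokens(tokens);
```

Each token is visited once, so the cost stays linear in the number of tokens regardless of how many are removed.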

I didn't take a deeper look, but of course, if it is possible to store the tokens in a pre-allocated `Float64Array`, it could also help a lot (this could be a pre-allocated buffer that we copy and then reuse when parsing several scripts).
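
A minimal sketch of what the `Float64Array` idea could look like, under several assumptions: the `TokenBuffer` class, the choice of 5 numeric fields per token, and the field names are all hypothetical, and the token text is not stored here (it would have to be kept elsewhere or recovered from the source):

```javascript
// Hypothetical layout: each token occupies FIELDS consecutive slots.
const FIELDS = 5;
const NUM_TYPE = 0, LINENO = 1, COL_OFFSET = 2, END_LINENO = 3, END_COL_OFFSET = 4;

class TokenBuffer {
    constructor(capacity) {
        this.data = new Float64Array(capacity * FIELDS); // preallocated storage
        this.length = 0; // number of tokens currently stored
    }

    // Reuse the same storage for the next script.
    reset() {
        this.length = 0;
    }

    // Appends a token and returns its index.
    push(numType, lineno, colOffset, endLineno, endColOffset) {
        if ((this.length + 1) * FIELDS > this.data.length) {
            // Grow by doubling when the preallocated space is exhausted.
            const bigger = new Float64Array(Math.max(FIELDS, this.data.length * 2));
            bigger.set(this.data);
            this.data = bigger;
        }
        const base = this.length * FIELDS;
        this.data[base + NUM_TYPE] = numType;
        this.data[base + LINENO] = lineno;
        this.data[base + COL_OFFSET] = colOffset;
        this.data[base + END_LINENO] = endLineno;
        this.data[base + END_COL_OFFSET] = endColOffset;
        return this.length++;
    }

    // Reads one field of the token at the given index.
    get(index, field) {
        return this.data[index * FIELDS + field];
    }
}

// Usage with made-up numeric values:
const buffer = new TokenBuffer(2);
buffer.push(1, 1, 0, 1, 3);
buffer.push(54, 1, 4, 1, 5);
buffer.push(2, 1, 6, 1, 8); // exceeds the initial capacity, so the buffer grows
```

Whether this actually beats plain objects would need to be measured on real scripts.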


Cordially,
