PHPeggy

Migrating from phpegjs
Requirements
Installation
Usage
- Generating a Parser
  - JS API
  - Command Line
- Using the Parser
Grammar Syntax and Semantics
Conversion Guide - Peggy action blocks to PHPeggy

PHPeggy

A PHP code generation plugin for Peggy.

PHPeggy is the successor to phpegjs, which has been abandoned by its maintainer.

Migrating from `phpegjs`

There are a few API changes compared to the most recent phpegjs release:

- Options specific to PHPeggy have to be passed to phpeggy and not to phpegjs.

Follow these steps to upgrade:

1. Follow the migration instructions from Peggy.
2. Uninstall phpegjs.
3. Replace all require("phpegjs") or import ... from "phpegjs" with require("phpeggy") or import ... from "phpeggy" as appropriate.
4. PHPeggy-specific options are now passed to phpeggy:

var parser = peggy.generate("start = ('a' / 'b')+", {
-    plugins: [require("phpegjs")],
+    plugins: [require("phpeggy")],
-    phpegjs: { /* phpegjs-specific options */ }
+    phpeggy: { /* phpeggy-specific options */ }
});

That's it!

Requirements

- Peggy (known compatible with version 5)
- PHP 8 or later to run the generated parser
- The mbstring PHP extension enabled

Installation

Node.js

Install Peggy together with the PHPeggy plugin:

$ npm install peggy@^5.0.0 phpeggy

Usage

Generating a Parser

JS API

In Node.js, require both the Peggy parser generator and the PHPeggy plugin:

const peggy = require("peggy");
const phpeggy = require("phpeggy");

To generate a PHP parser, pass both the PHPeggy plugin and your grammar to peggy.generate:

const parser = peggy.generate("start = ('a' / 'b')+", {
    plugins: [phpeggy]
});

The method returns the source code of the generated parser as a string. Unlike with original Peggy, the generated PHP parser is a class, not a function.

Supported options of peggy.generate:

- allowedStartRules - rules the parser will be allowed to start parsing from (default: the first rule in the grammar).
- cache - if true, makes the parser cache results, avoiding exponential parsing time in pathological cases but making the parser slower (default: false). For PHP, this is strongly recommended for big grammars (such as javascript.pegjs or css.pegjs in the examples folder).
- grammarSource - this object will be passed to any location() objects as the source property (default: undefined). This object will be used even if options.grammarSource is redefined in the grammar. It is useful, for example, for attaching file information to errors.

You can also pass options specific to the PHPeggy plugin as follows:

const parser = peggy.generate("start = ('a' / 'b')+", {
    plugins: [phpeggy],
    phpeggy: { /* phpeggy-specific options */ }
});

Here are the options available to pass this way:

- parserNamespace - namespace of the generated parser (default: PHPeggy). If the value is '' or null, no namespace is used.
- parserClassName - name of the generated parser class (default: Parser).
- header - a custom header added at the top of the generated parser, e.g. /* My custom header */.
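As a rough sketch of how these options fit together, the snippet below assembles an options object of the shape shown above. The helper name `buildOptions`, its parameter names, and the sample namespace and class name are made up for illustration; only the option keys come from this README. A stand-in object is used instead of the real plugin so the snippet runs without either package installed.

```javascript
// Hypothetical helper, not part of PHPeggy: assembles the options object
// described above. `plugin` stands for the value of require("phpeggy").
function buildOptions(plugin, { namespace, className, header, cache = false } = {}) {
    const phpeggy = {};
    // Only set keys that were given, so PHPeggy's defaults apply otherwise.
    if (namespace !== undefined) phpeggy.parserNamespace = namespace;
    if (className !== undefined) phpeggy.parserClassName = className;
    if (header !== undefined) phpeggy.header = header;
    return { plugins: [plugin], cache, phpeggy };
}

// A stand-in object replaces the real plugin so the sketch runs on its own.
const pluginStandIn = {};
const options = buildOptions(pluginStandIn, {
    namespace: "MyNameSpace",
    className: "ArithmeticsParser",
    cache: true,
});

console.log(JSON.stringify(options.phpeggy));

// With both packages installed, the real call would be along the lines of:
//   const source = peggy.generate(grammar, buildOptions(require("phpeggy"), { ... }));
//   require("node:fs").writeFileSync("ArithmeticsParser.php", source);
```

Leaving unset keys out of the `phpeggy` object, rather than passing `undefined`, keeps the documented defaults in effect; passing `null` or `''` as the namespace is still forwarded, since that is how a namespace-free parser is requested.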

Command Line

To generate a parser from your grammar, use the peggy command:

npx peggy --plugin /path/to/phpeggy/src/phpeggy.js arithmetics.pegjs

The following options might be of interest in the context of PHPeggy:

--allowed-start-rules <rules>
--cache
--extra-options <options>
-c, --extra-options-file <file>
-o, --output <file>
-S, --start-rule <rule>

--format is irrelevant as PHPeggy will only provide PHP source code.

Here is a more complex example:

npx peggy -o arithmeticsParser.php --plugin /path/to/phpeggy/src/phpeggy.js arithmetics.pegjs --cache --extra-options '{ "phpeggy" : { "parserNamespace" : "MyNameSpace", "parserClassName" : "ArithmeticsParser", "header" : "/* My custom header */" } }'

A more detailed description of the different options can be found in the peggy documentation.
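Quoting a JSON object inline on the command line is easy to get wrong, so one alternative is to keep the options in a file and pass it with -c / --extra-options-file. The sketch below writes such a file from Node.js. It assumes that Peggy accepts a JSON file for this flag, and the file name `phpeggy-options.json` is purely illustrative.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Same plugin options as in the inline example above.
const extraOptions = {
    phpeggy: {
        parserNamespace: "MyNameSpace",
        parserClassName: "ArithmeticsParser",
        header: "/* My custom header */",
    },
};

// Hypothetical file name; a temp directory keeps the sketch side-effect free.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "phpeggy-"));
const optionsFile = path.join(dir, "phpeggy-options.json");

// JSON.stringify takes care of quoting and escaping.
fs.writeFileSync(optionsFile, JSON.stringify(extraOptions, null, 2) + "\n", "utf8");

console.log(`Options written to ${optionsFile}`);
```

The resulting file could then be passed as `-c <file>` alongside the `--plugin`, `-o` and `--cache` flags used above.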

Using the Parser

1. Save the parser generated by peggy.generate to a file.
2. In your PHP code:

include "your.parser.file.php";

try {
    $parser = new PHPeggy\Parser;
    $result = $parser->parse($input);
} catch (PHPeggy\SyntaxError $ex) {
    // Handle parsing error
    // [...]
}

You can use the following snippet to format parsing errors:

catch (PHPeggy\SyntaxError $e) {
    $message = "Syntax error: " . $e->getMessage() . " at line " . $e->grammarLine . " column " . $e->grammarColumn . " offset " . $e->grammarOffset;
}

Or use SyntaxError->format():

catch (PHPeggy\SyntaxError $e) {
    $errorFormatted = $e->format(array(array("source" => "User input", "text" => $user_input)));
}

Which will look similar to:

SyntaxError: Expected "a" but "b" found.
 --> Input string:1:1
  |
1 | b
  | ^

Note that the generated PHP parser will call preg_match_all( '/./us', ... ) on the input string. This may be undesirable for projects that need to maintain compatibility with PCRE versions that are missing Unicode support (WordPress, for example). To avoid this call, split the input string into an array (one array element per UTF-8 character) and pass this array into $parser->parse() instead of the string input.

Grammar Syntax and Semantics

See the Peggy documentation, with the following differences:

- Action and predicate blocks must be written in PHP.
- The per-parse initializer code block is used to provide additional methods, properties and constants to the Parser class. A special method, function initialize(), can be provided; like the Peggy per-parse initializer, it is called before the generated parser starts parsing (see examples/fizzbuzz.pegjs). All methods have access to the input ($this->input) and the options ($this->options).
- The global initializer code block can be used to add use statements, classes, functions, constants, and so on.
- Importing external rules works only from the command line.

Original Peggy rule:

media_list = head:medium tail:("," S* medium)* {
  let result = [head];
  for (let i = 0; i < tail.length; i++) {
    result.push(tail[i][2]);
  }
  return result;
}

PHPeggy rule:

media_list = head:medium tail:("," S* medium)* {
  $result = [$head];
  for ($i = 0; $i < \count($tail); $i++) {
    $result[] = $tail[$i][2];
  }
  return $result;
}
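Both versions read index 2 of each tail entry because every repetition of ("," S* medium) yields a three-element array, one element per sub-expression. The sketch below wraps the JavaScript action body in a plain function and feeds it hand-written stand-in values. The name `mediaListAction` and the sample data are assumptions for illustration (in particular, that `medium` yields a string); no parser is involved.

```javascript
// The action body from the Peggy rule, wrapped in a plain function.
function mediaListAction(head, tail) {
    const result = [head];
    for (let i = 0; i < tail.length; i++) {
        // Each tail entry is [",", <S* matches>, <medium>]; index 2 is the medium.
        result.push(tail[i][2]);
    }
    return result;
}

// Hand-written stand-ins for what the parser might pass for "screen, print,tv".
const head = "screen";
const tail = [
    [",", [" "], "print"],
    [",", [], "tv"],
];

const media = mediaListAction(head, tail);
console.log(media); // [ 'screen', 'print', 'tv' ]
```

The PHP action performs the same steps, with `$tail[$i][2]` picking out the same element.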

To target both JavaScript and PHP with a single grammar, you can mix the two languages using a special comment syntax:

media_list = head:medium tail:("," S* medium)* {
  /** <?php
  $result = [$head];
  for ($i = 0; $i < \count($tail); $i++) {
    $result[] = $tail[$i][2];
  }
  return $result;
  ?> **/

  let result = [head];
  for (let i = 0; i < tail.length; i++) {
    result.push(tail[i][2]);
  }
  return result;
}

You can also use the following utility functions in PHP action blocks:

- chr_unicode($code) - returns the character for the given Unicode code point (analogous to JavaScript's String.fromCharCode function).
- ord_unicode($code) - returns the Unicode code point of a character (analogous to JavaScript's String.prototype.charCodeAt(0) function).
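For reference, the JavaScript calls these helpers are compared to behave as sketched below. This shows only the JavaScript side; the PHP lines in the comments are the intended counterparts according to the descriptions above, not verified output.

```javascript
// JavaScript calls that the PHP helpers are described as mirroring.
const code = 2323;
const ch = String.fromCharCode(code); // PHP counterpart: chr_unicode(2323)
const back = ch.charCodeAt(0);        // PHP counterpart: ord_unicode($ch)

console.log(back === code); // true

// Caveat on the JavaScript side only: fromCharCode and charCodeAt work on
// UTF-16 code units, so code points above 0xFFFF need fromCodePoint and
// codePointAt instead.
const astral = String.fromCodePoint(0x1F600);
console.log(astral.length, astral.codePointAt(0) === 0x1F600); // 2 true
```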

Guide for converting Peggy action blocks to PHPeggy

| JavaScript code | PHP analogue |
| --- | --- |
| `some_var` | `$some_var` |
| `{f1: "val1", f2: "val2"}` | `["f1" => "val1", "f2" => "val2"]` |
| `["val1", "val2"]` | `["val1", "val2"]` |
| `some_array.push("val")` | `$some_array[] = "val"` |
| `some_array.length` | `count($some_array)` |
| `some_array.join("")` | `implode("", $some_array)` |
| `some_array1.concat(some_array2)` | `array_merge($some_array1, $some_array2)` |
| `parseInt("23")` | `intval("23")` |
| `parseFloat("23.1")` | `floatval("23.1")` |
| `some_str.length` | `mb_strlen($some_str, "UTF-8")` |
| `some_str.replace("b", "\b")` | `str_replace("b", "\b", $some_str)` |
| `String.fromCharCode(2323)` | `chr_unicode(2323)` |
| `input` | `$this->input` |
| `options` | `$this->options` |
| `error(message, where)` | `$this->error($message, $where)` |
| `expected(message, where)` | `$this->expected($message, $where)` |
| `location()` | `$this->location()` |
| `range()` | `$this->range()` |
| `offset()` | `$this->offset()` |
| `text()` | `$this->text()` |
