README.md
Lines changed: 88 additions & 33 deletions
@@ -1,6 +1,6 @@
1
-
# PackCC #
1
+
# PackCC
2
2
3
-
## Overview ##
3
+
## Overview
4
4
5
5
**PackCC** is a parser generator for C.
6
6
Its main features are as follows:
@@ -41,14 +41,14 @@ This feature is irrelevant to common users, but helpful for PackCC developers to
41
41
42
42
PackCC itself is under MIT license, but you can distribute your generated code under any license you like.
43
43
44
-
## Installation ##
44
+
## Installation
45
45
46
46
You can obtain the executable `packcc` by compiling [`src/packcc.c`](src/packcc.c) using your favorite C compiler.
47
47
For convenience, build environments for GCC, Clang, and Microsoft Visual Studio are provided under the [`build`](build) directory.
48
48
49
-
### Using GCC ###
49
+
### Using GCC
50
50
51
-
#### Other than MinGW ####
51
+
#### Other than MinGW
52
52
53
53
`packcc` will be built in both directories `build/gcc/debug/bin` and `build/gcc/release/bin` using `gcc` by executing the following commands:
54
54
@@ -60,7 +60,7 @@ make check # bats-core and uncrustify are required (see tests/README.md)
60
60
61
61
`packcc` in the directory `build/gcc/release/bin` is suitable for practical use.
62
62
63
-
#### MinGW ####
63
+
#### MinGW
64
64
65
65
`packcc` will be built in both directories `build/mingw-gcc/debug/bin` and `build/mingw-gcc/release/bin` using `gcc` by executing the following commands:
66
66
@@ -72,9 +72,9 @@ make check # bats-core and uncrustify are required (see tests/README.md)
72
72
73
73
`packcc` in the directory `build/mingw-gcc/release/bin` is suitable for practical use.
74
74
75
-
### Using Clang ###
75
+
### Using Clang
76
76
77
-
#### Other than MinGW ####
77
+
#### Other than MinGW
78
78
79
79
`packcc` will be built in both directories `build/clang/debug/bin` and `build/clang/release/bin` using `clang` by executing the following commands:
80
80
@@ -86,7 +86,7 @@ make check # bats-core and uncrustify are required (see tests/README.md)
86
86
87
87
`packcc` in the directory `build/clang/release/bin` is suitable for practical use.
88
88
89
-
#### MinGW ####
89
+
#### MinGW
90
90
91
91
`packcc` will be built in both directories `build/mingw-clang/debug/bin` and `build/mingw-clang/release/bin` using `clang` by executing the following commands:
92
92
@@ -98,10 +98,11 @@ make check # bats-core and uncrustify are required (see tests/README.md)
98
98
99
99
`packcc` in the directory `build/mingw-clang/release/bin` is suitable for practical use.
100
100
101
-
### Using Microsoft Visual Studio ###
101
+
### Using Microsoft Visual Studio
102
102
103
103
You have to install Microsoft Visual Studio 2019 in advance.
104
104
After that, you can build `packcc.exe` by following the instructions below:
105
+
105
106
- Open the solution file `build\msvc\msvc.sln`,
106
107
- Select a preferred solution configuration (*Debug* or *Release*) and a preferred solution platform (*x64* or *x86*),
107
108
- Invoke the *Build Solution* menu item.
@@ -110,20 +111,21 @@ After that, you can build `packcc.exe` by the following instructions:
110
111
Here, `XXX` is `x64` or `x86`, and `YYY` is `Debug` or `Release`.
111
112
`packcc.exe` in the directory `build\msvc\XXX\Release` is suitable for practical use.
112
113
113
-
## Usage ##
114
+
## Usage
114
115
115
-
### Command ###
116
+
### Command
116
117
117
-
You must prepare a PEG source file (see the following section).
118
-
Let the file name `example.peg` for example.
118
+
You must prepare a PEG source file in advance.
119
+
For details of the PEG source syntax, see the section "Syntax".
120
+
Here, assume the file name is `example.peg`.
119
121
120
122
```
121
123
packcc example.peg
122
124
```
123
125
124
126
By running this, the parser source `example.h` and `example.c` are generated.
125
127
126
-
If no PEG file name is specified, the PEG source is read from the standard input, and `-.h` and `-.c`are generated.
128
+
If no PEG file name is specified, the PEG source is read from the standard input, and `-.h` and `-.c` will be generated.
127
129
128
130
The base name of the parser source files can be changed with the `-o` option.
129
131
@@ -132,6 +134,19 @@ packcc -o parser example.peg
132
134
```
133
135
134
136
By running this, the parser source `parser.h` and `parser.c` are generated.
137
+
This option can be specified only once.
138
+
139
+
A directory to search for import files can be added with the `-I` option (version 2.0.0 or later).
140
+
This option can be specified as many times as needed.
141
+
The directory specified first is searched first, the directory specified second is searched next, and so on.
142
+
143
+
```
144
+
packcc -I foo -I bar/baz example.peg
145
+
```
146
+
147
+
By running this, the directory `foo` is searched first, and the directory `bar/baz` is searched next.
148
+
The directories specified by this option have higher priority than those specified in the environment variable `PCC_IMPORT_PATH` and the default directories.
149
+
For more details of import, see the explanation of `%import` written in the section "Syntax".
135
150
136
151
If you want to disable UTF-8 support, specify the command line option `-a` or `--ascii` (version 1.4.0 or later).
137
152
@@ -144,7 +159,7 @@ If you want to confirm the version of the `packcc` command, execute the below.
144
159
packcc -v
145
160
```
146
161
147
-
### Syntax ###
162
+
### Syntax
148
163
149
164
A grammar consists of a set of named rules.
150
165
A rule definition can be split into multiple lines.
@@ -317,37 +332,37 @@ All matched actions are guaranteed to be executed only once.
317
332
318
333
In the action, the C source code can use the predefined variables below.
319
334
320
-
-**`$$`**
335
+
-**`$$`** :
321
336
The output variable, to which the result of the rule is stored.
322
337
The data type is the one specified by `%value`.
323
338
The default data type is `int`.
324
-
-**`auxil`**
339
+
-**`auxil`** :
325
340
The user-defined data that has been given via the API function `pcc_create()`.
326
341
The data type is the one specified by `%auxil`.
327
342
The default data type is `void *`.
328
-
-_variable_
343
+
-_variable_ :
329
344
The result of another rule that has already been evaluated.
330
345
If the rule has not been evaluated, it is ensured that the value is zero-cleared (version 1.7.1 or later).
331
346
The data type is the one specified by `%value`.
332
347
The default data type is `int`.
333
-
-**`$`**_n_
348
+
-**`$`**_n_ :
334
349
The string of the captured text.
335
350
The _n_ is the positive integer that corresponds to the order of capturing.
336
351
The variable `$1` holds the string of the first captured text.
337
-
-**`$`**_n_**`s`**
352
+
-**`$`**_n_**`s`** :
338
353
The start position in the input of the captured text, inclusive.
339
354
The _n_ is the positive integer that corresponds to the order of capturing.
340
355
The variable `$1s` holds the start position of the first captured text.
341
-
-**`$`**_n_**`e`**
356
+
-**`$`**_n_**`e`** :
342
357
The end position in the input of the captured text, exclusive.
343
358
The _n_ is the positive integer that corresponds to the order of capturing.
344
359
The variable `$1e` holds the end position of the first captured text.
345
-
-**`$0`**
360
+
-**`$0`** :
346
361
The string of the text between the start position in the input at which the rule pattern begins to match
347
362
and the current position in the input at which the element immediately before the action ends to match.
348
-
-**`$0s`**
363
+
-**`$0s`** :
349
364
The start position in the input at which the rule pattern begins to match.
350
-
-**`$0e`**
365
+
-**`$0e`** :
351
366
The current position in the input at which the element immediately before the action ends to match.
352
367
353
368
An example is shown below.
@@ -390,17 +405,20 @@ rule2 <- (e1 e2 e3) ~{ error("one of e[123] has failed"); }
390
405
The specified C source code is copied verbatim to the C header file before the generated parser API function declarations.
391
406
Any braces in the C source code must be properly nested.
392
407
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.
408
+
When `%header` is used multiple times, the respective C source codes are copied in order of their appearance.
393
409
394
410
**`%source``{`_c source code_`}`**
395
411
396
412
The specified C source code is copied verbatim to the C source file before the generated parser implementation code.
397
413
Any braces in the C source code must be properly nested.
398
414
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.
415
+
When `%source` is used multiple times, the respective C source codes are copied in order of their appearance.
399
416
400
417
**`%common``{`_c source code_`}`**
401
418
402
419
The specified C source code is copied verbatim to both of the C header file and the C source file
403
420
before the generated parser API function declarations and the implementation code respectively.
421
+
This has the same effect as `%header {`_c source code_`} %source {`_c source code_`}`.
404
422
Any braces in the C source code must be properly nested.
405
423
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.
406
424
@@ -419,15 +437,42 @@ This can be useful for example when it is necessary to modify behavior of standa
419
437
420
438
The type of output data, which is output as `$$` in each action and can be retrieved from the parser API function `pcc_parse()`,
421
439
is changed to the specified one from the default `int`.
440
+
This can be used only once and cannot be used in imported files.
422
441
423
442
**`%auxil``"`_user-defined data type_`"`**
424
443
425
444
The type of user-defined data, which is passed to the parser API function `pcc_create()`,
426
445
is changed to the specified one from the default `void *`.
446
+
This can be used only once and cannot be used in imported files.
427
447
428
448
**`%prefix``"`_prefix_`"`**
429
449
430
450
The prefix of the parser API functions is changed to the specified one from the default `pcc`.
451
+
This can be used only once and cannot be used in imported files.
452
+
453
+
**`%import``"`_import file name_`"`**
454
+
455
+
The content of the specified import file is expanded at the text location of `%import` (version 2.0.0 or later).
456
+
This can be used multiple times anywhere, and can also be used in imported files.
457
+
The _import file name_ can be a relative path to the current directory or an absolute path.
458
+
If it is a relative path, the directories listed below are searched for the import file in the listed order.
459
+
460
+
1. the directory where the file that imports the import file is located
461
+
2. the directories specified with `-I` options
462
+
- They are prioritized in order of their appearance in the command line.
463
+
3. the directories specified by the environment variable `PCC_IMPORT_PATH`
464
+
- They are prioritized in order of their appearance in the value of this variable.
465
+
- The character used as a delimiter between directory names is the colon `':'` if PackCC is built for a Unix-like platform such as Linux, macOS, and MinGW.
466
+
The character is the semicolon `';'` if PackCC is built as a native Windows executable.
467
+
(This is exactly the same manner as the environment variable `PATH`.)
468
+
4. the per-user default directory
469
+
- This is the subdirectory `.packcc/import` in the home directory if PackCC is built for a Unix-like platform,
470
+
and in the user profile directory, "`C:\Users\`_username_" for example, if PackCC is built as a native Windows executable.
471
+
5. the system-wide default directory
472
+
- This is the directory `/usr/share/packcc/import` if PackCC is built for a Unix-like platform,
473
+
and is the subdirectory `packcc/import` in the common application data directory, "`C:\ProgramData`" for example.
474
+
475
+
Note that a file that has already been imported is silently ignored if an attempt is made to import it again.
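
The search order above can be modeled roughly as follows. This is only an illustrative sketch, not PackCC's actual implementation: the names `DIR_MAX`, `add_dir`, `split_path_list`, and `build_search_list` are hypothetical, and the delimiter is passed in explicitly (`':'` or `';'`) instead of being chosen by platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIR_MAX 256 /* hypothetical limit on a directory name length */

/* Appends one directory to the list if it is non-empty and fits. */
static size_t add_dir(const char *dir, char out[][DIR_MAX], size_t count, size_t max) {
    if (dir != NULL && dir[0] != '\0' && strlen(dir) < DIR_MAX && count < max) {
        strcpy(out[count], dir);
        count++;
    }
    return count;
}

/* Splits a PATH-like value on `delim` and appends each non-empty
 * directory in order of appearance. Returns the new entry count. */
static size_t split_path_list(const char *value, char delim,
                              char out[][DIR_MAX], size_t count, size_t max) {
    const char *p = value;
    while (p != NULL && *p != '\0') {
        const char *end = strchr(p, delim);
        size_t len = (end != NULL) ? (size_t)(end - p) : strlen(p);
        if (len > 0 && len < DIR_MAX && count < max) {
            memcpy(out[count], p, len);
            out[count][len] = '\0';
            count++;
        }
        p = (end != NULL) ? end + 1 : NULL;
    }
    return count;
}

/* Builds the candidate directories in the documented priority order. */
static size_t build_search_list(const char *importer_dir,
                                const char *const *i_dirs, size_t n_i_dirs,
                                const char *env_value, char delim,
                                const char *user_dir, const char *system_dir,
                                char out[][DIR_MAX], size_t max) {
    size_t n = 0, i;
    assert(out != NULL && max > 0);
    n = add_dir(importer_dir, out, n, max);              /* 1. importing file's directory */
    for (i = 0; i < n_i_dirs; i++)                       /* 2. -I options, command-line order */
        n = add_dir(i_dirs[i], out, n, max);
    n = split_path_list(env_value, delim, out, n, max);  /* 3. PCC_IMPORT_PATH entries */
    n = add_dir(user_dir, out, n, max);                  /* 4. per-user default directory */
    n = add_dir(system_dir, out, n, max);                /* 5. system-wide default directory */
    return n;
}
```

Under these assumptions, calling `build_search_list()` with the `-I` directories `foo` and `bar/baz` and an environment value of `/opt/a:/opt/b` yields the importing file's directory first, then `foo`, `bar/baz`, `/opt/a`, `/opt/b`, and finally the two default directories.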
431
476
432
477
**`#`_comment_**
433
478
@@ -440,7 +485,16 @@ All text following `%%` is copied verbatim to the C source file after the genera
440
485
441
486
<small>(The specification is determined by referring to [peg/leg](http://piumarta.com/software/peg/) developed by Ian Piumarta.)</small>
This contains various rules to match a Unicode character belonging to a specific [general category](https://unicode.org/reports/tr44/#General_Category_Values).
496
+
497
+
### Macros
444
498
445
499
Some macros are prepared to customize the parser.
446
500
The macro definition should be in <u>`%source` section</u> in the PEG source.
@@ -560,9 +614,10 @@ For other events, `buffer` and `length` indicate a part of the currently loaded
560
614
The user-defined data passed to the API function `pcc_create()` can be retrieved from this argument.
561
615
562
616
There are currently three supported events:
563
-
- `PCC_DBG_EVALUATE` (= 0) - called when the parser starts to evaluate `rule`
564
-
- `PCC_DBG_MATCH` (= 1) - called when `rule` is matched, at which point buffer holds entire matched string
565
-
- `PCC_DBG_NOMATCH` (= 2) - called when the parser determines that the input does not match currently evaluated `rule`
617
+
618
+
- `PCC_DBG_EVALUATE` (= 0) - called when the parser starts to evaluate `rule`
619
+
- `PCC_DBG_MATCH` (= 1) - called when `rule` is matched, at which point `buffer` holds the entire matched string
620
+
- `PCC_DBG_NOMATCH` (= 2) - called when the parser determines that the input does not match currently evaluated `rule`
566
621
567
622
A very simple implementation could look like this:
568
623
@@ -590,7 +645,7 @@ The initial size (the number of elements) of the internal arrays other than the
590
645
The arrays are expanded as needed.
591
646
The default is `2`.
592
647
593
-
### API ###
648
+
### API
594
649
595
650
The parser API has only 3 simple functions below.
596
651
@@ -653,9 +708,9 @@ while (pcc_parse(ctx, &ret));
653
708
pcc_destroy(ctx);
654
709
```
655
710
656
-
## Examples ##
711
+
## Examples
657
712
658
-
### Desktop calculator ###
713
+
### Desktop calculator
659
714
660
715
A simple example that interactively performs the four basic arithmetic operations on integers is shown here.
661
716
Note that **left-recursive** grammar rules are defined in this example.
@@ -700,7 +755,7 @@ int main() {
700
755
}
701
756
```
702
757
703
-
### AST builder for Tiny-C ###
758
+
### AST builder for Tiny-C
704
759
705
760
You can find a more practical example in the directory [`examples/ast-tinyc`](examples/ast-tinyc).
706
761
It builds an AST (abstract syntax tree) from an input source file