A generic AST to represent Python2 and Python3's Abstract Syntax Tree(AST).
GAST provides a compatibility layer between the AST of various Python versions,
as produced by ast.parse from the standard ast module.
>>> import ast, gast
>>> code = open('file.py').read()
>>> tree = ast.parse(code)
>>> gtree = gast.ast_to_gast(tree)
>>> ... # process gtree
>>> tree = gast.gast_to_ast(gtree)
>>> ... # do stuff specific to treegast provides the same API as the ast module. All functions and classes
from the ast module are also available in the gast module, and work
accordingly on the gast tree.
Three notable exceptions:
gast.parsedirectly produces a GAST node. It's equivalent to runninggast.ast_to_gaston the output ofast.parse.
gast.dumpdumps thegastcommon representation, not the original- one.
gast.gast_to_astandgast.ast_to_gastcan be used to convert- from one ast to the other, back and forth.
GAST is tested using tox and Travis on the following Python versions:
- 2.7
- 3.4
- 3.5
- 3.6
- 3.7
- 3.8
- 3.9
- 3.10
- 3.11
The AST used by GAST is the same as the one used in Python3.9, with the
notable exception of ast.arg being replaced by an ast.Name with an
ast.Param context.
The name field of ExceptHandler is represented as an ast.Name with
an ast.Store context and not a str.
For minor version before 3.9, please note that ExtSlice and Index are
not used.
For minor version before 3.8, please note that Ellipsis, Num, Str,
Bytes and NamedConstant are represented as Constant.
For minor version before 3.10, the _field_types field exist for each AST
class, but it's always empty.
To cope with Python3 features, several nodes from the Python2 AST are extended with some new attributes/children, or represented by different nodes
ModuleDefnodes have atype_ignoresattribute.FunctionDefnodes have areturnsattribute and atype_commentattribute.ClassDefnodes have akeywordsattribute.With'scontext_exprandoptional_varsfields are hold in awithitemobject.Fornodes have atype_commentattribute.Raise'stype,instandtbackfields are hold in a singleexcfield, using the transformationraise E, V, T => raise E(V).with_traceback(T).TryExceptandTryFinallynodes are merged in theTrynode.argumentsnodes have akwonlyargsandkw_defaultsattributes.Callnodes loose theirstarargsattribute, replaced by an argument wrapped in aStarrednode. They also loose theirkwargsattribute, wrapped in akeywordnode with the identifier set toNone, as done in Python3.comprehensionnodes have anasyncattribute (that is always set to 0).Ellipsis,NumandStrnodes are represented asConstant.Subscriptwhich don't have anySlicenode assliceattribute (andEllipsisare notSlicenodes) no longer hold anExtSlicebut anIndex(Tuple(...))instead.
- In Python3,
None,TrueandFalseare parsed asConstantwhile they are parsed as regularNamein Python2.
This closely matches the one from https://docs.python.org/3/library/ast.html#abstract-grammar, with a few trade-offs to cope with legacy ASTs.
-- ASDL's six builtin types are identifier, int, string, bytes, object, singleton
module Python
{
mod = Module(stmt* body, type_ignore *type_ignores)
| Interactive(stmt* body)
| Expression(expr body)
| FunctionType(expr* argtypes, expr returns)
-- not really an actual node but useful in Jython's typesystem.
| Suite(stmt* body)
stmt = FunctionDef(identifier name, arguments args,
stmt* body, expr* decorator_list, expr? returns,
string? type_comment, type_param* type_params)
| AsyncFunctionDef(identifier name, arguments args,
stmt* body, expr* decorator_list, expr? returns,
string? type_comment, type_param* type_params)
| ClassDef(identifier name,
expr* bases,
keyword* keywords,
stmt* body,
expr* decorator_list,
type_param* type_params)
| Return(expr? value)
| Delete(expr* targets)
| Assign(expr* targets, expr value, string? type_comment)
| TypeAlias(expr name, type_param* type_params, expr value)
| AugAssign(expr target, operator op, expr value)
-- 'simple' indicates that we annotate simple name without parens
| AnnAssign(expr target, expr annotation, expr? value, int simple)
-- not sure if bool is allowed, can always use int
| Print(expr? dest, expr* values, bool nl)
-- use 'orelse' because else is a keyword in target languages
| For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
| AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
| While(expr test, stmt* body, stmt* orelse)
| If(expr test, stmt* body, stmt* orelse)
| With(withitem* items, stmt* body, string? type_comment)
| AsyncWith(withitem* items, stmt* body, string? type_comment)
| Match(expr subject, match_case* cases)
| Raise(expr? exc, expr? cause)
| Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
| TryStar(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
| Assert(expr test, expr? msg)
| Import(alias* names)
| ImportFrom(identifier? module, alias* names, int? level)
-- Doesn't capture requirement that locals must be
-- defined if globals is
-- still supports use as a function!
| Exec(expr body, expr? globals, expr? locals)
| Global(identifier* names)
| Nonlocal(identifier* names)
| Expr(expr value)
| Pass | Break | Continue
-- XXX Jython will be different
-- col_offset is the byte offset in the utf8 string the parser uses
attributes (int lineno, int col_offset)
-- BoolOp() can use left & right?
expr = BoolOp(boolop op, expr* values)
| NamedExpr(expr target, expr value)
| BinOp(expr left, operator op, expr right)
| UnaryOp(unaryop op, expr operand)
| Lambda(arguments args, expr body)
| IfExp(expr test, expr body, expr orelse)
| Dict(expr* keys, expr* values)
| Set(expr* elts)
| ListComp(expr elt, comprehension* generators)
| SetComp(expr elt, comprehension* generators)
| DictComp(expr key, expr value, comprehension* generators)
| GeneratorExp(expr elt, comprehension* generators)
-- the grammar constrains where yield expressions can occur
| Await(expr value)
| Yield(expr? value)
| YieldFrom(expr value)
-- need sequences for compare to distinguish between
-- x < 4 < 3 and (x < 4) < 3
| Compare(expr left, cmpop* ops, expr* comparators)
| Call(expr func, expr* args, keyword* keywords)
| Repr(expr value)
| FormattedValue(expr value, int? conversion, expr? format_spec)
| JoinedStr(expr* values)
| Constant(constant value, string? kind)
-- the following expression can appear in assignment context
| Attribute(expr value, identifier attr, expr_context ctx)
| Subscript(expr value, slice slice, expr_context ctx)
| Starred(expr value, expr_context ctx)
| Name(identifier id, expr_context ctx, expr? annotation,
string? type_comment)
| List(expr* elts, expr_context ctx)
| Tuple(expr* elts, expr_context ctx)
-- col_offset is the byte offset in the utf8 string the parser uses
attributes (int lineno, int col_offset)
expr_context = Load | Store | Del | AugLoad | AugStore | Param
slice = Slice(expr? lower, expr? upper, expr? step)
| ExtSlice(slice* dims)
| Index(expr value)
boolop = And | Or
operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
| RShift | BitOr | BitXor | BitAnd | FloorDiv
unaryop = Invert | Not | UAdd | USub
cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn
comprehension = (expr target, expr iter, expr* ifs, int is_async)
excepthandler = ExceptHandler(expr? type, expr? name, stmt* body)
attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)
arguments = (expr* args, expr* posonlyargs, expr? vararg, expr* kwonlyargs,
expr* kw_defaults, expr? kwarg, expr* defaults)
-- keyword arguments supplied to call (NULL identifier for **kwargs)
keyword = (identifier? arg, expr value)
-- import name with optional 'as' alias.
alias = (identifier name, identifier? asname)
attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)
withitem = (expr context_expr, expr? optional_vars)
match_case = (pattern pattern, expr? guard, stmt* body)
pattern = MatchValue(expr value)
| MatchSingleton(constant value)
| MatchSequence(pattern* patterns)
| MatchMapping(expr* keys, pattern* patterns, identifier? rest)
| MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)
| MatchStar(identifier? name)
-- The optional "rest" MatchMapping parameter handles capturing extra mapping keys
| MatchAs(pattern? pattern, identifier? name)
| MatchOr(pattern* patterns)
attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)
type_ignore = TypeIgnore(int lineno, string tag)
type_param = TypeVar(identifier name, expr? bound)
| ParamSpec(identifier name)
| TypeVarTuple(identifier name)
attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)
}
Bugs can be reported through GitHub issues.
If for some reason, you think your bug is security-related and should be subject to responsible disclosure, don't hesitate to contact the maintainer directly.