This is the readme file for make_stub_files.py (msf)
. This file explains what msf does and how it works.
Are you sure you want to use this program? For python 3 programs, including mypy annotations directly in your sources is a better alternative. Leo's new add-mypy-annotations command handles most of the tedious details of inserting annotations.
Also, several other tools create mypy stubs:
- MonkeyType (Python 3),
- PyAnnotate,
- stubgen.
After a brief overview, a step-by-step section will get you started. Full source code for make_stub_files.py is in its github repository. Everything is in the public domain.
This script makes a stub (.pyi) file in the output directory for each source file listed on the command line (wildcard file names are supported). This script never creates directories automatically, nor does it overwrite stub files unless the --overwrite command-line option is in effect.
GvR says, "We actually do have a stub generator as part of mypy now (it has a few options) but yours has the advantage of providing a way to tune the generated signatures...This allows for a nice iterative way of developing stubs."
The script does no type inference. Instead, the user supplies patterns in a configuration file. The script matches these patterns to:
-
The names of arguments in functions and methods and
-
The text of return expressions. Return expressions are the actual text of whatever follows the "return" keyword. The script removes all comments in return expressions and converts all strings to "str". This preprocessing greatly simplifies pattern matching.
As a first example, given the method:
def foo(self, i, s):
if i:
return "abc" # a comment
else:
return s
and the patterns:
i: int
s: str
the script produces the stub:
def foo(i: int, s: str) --> str: ...
make_stub_files
eliminates much of the drudgery of creating python stub (.pyi) files from python source files. Stub files can be used by people who use Python 2.x code bases.
-
Put
make_stub_files.py
on your path. -
Enter a directory containing .py files:
cd myDirectory
-
Generate stubs for foo.py in foo.pyi:
make_stub_files foo.py
-
Look at foo.pyi to see the generated stubs.
-
Regenerate foo.pyi with more verbose output:
make_stub_files foo.py -o -v
The -o (--overwrite) option allows the script to overwrite foo.pyi.
The -v (--verbose) options generates return comments for all stubs in foo.pyi. -
Update foo.pyi:
make_stub_files -o -u
The -u (--update) options updates foo.pyi as follows:
- adds stubs to foo.pyi for classes and defs that are new in foo.py.
- deletes stubs in foo.pyi for classes and defs that no longer exist in foo.py.
- leaves all other stubs in foo.pyi unchanged.
-
Specify a configuration file containing patterns:
make_stub_files -c myConfigFile.cfg -o
usage: make_stub_files.py [options] file1, file2, ...
make_stub_file: Create stub (.pyi) files from python files
positional arguments:
FILE input files
optional arguments:
-h, --help show this help message and exit
-c FILE, --config FILE
full path to configuration file
-d DIR, --dir DIR full path to the output directory
-f, --force-pyx force the parsing of .pyx files
-o, --overwrite overwrite existing stub (.pyi) files
-s, --silent run without messages
--trace-matches trace Pattern.matches
--trace-patterns trace pattern creation
--trace-reduce trace st.reduce_types
--trace-visitors trace visitor methods
-u, --update update stubs in existing stub file
-v, --verbose verbose output in .pyi file
-w, --warn warn about unannotated args
Note: glob.glob wildcards can be used in file1, file2, ...
The --config command-line option specifies the full path to the optional configuration file. The configuration file uses the .ini format. It has several configuration sections, all optional.
The [Def Name Patterns] and [General Patterns] configuration sections specify patterns. All patterns have the form:
find-string: replacement-string
Colons are not allowed in the find-string. This is a limitation of .ini files.
There are three kinds of patterns: balanced, regex and plain.
Balanced patterns are patterns whose find string that:
A: contain either (*)
, [*]
, or {*}
or
B: ends with *
.
Unlike regular expressions, (*)
, [*]
, or {*}
match only
balanced brackets. A trailing *
matches the rest of the string.
Examples:
str(*): str
StubTraverser.do_*
Balanced patterns such as:
[*]: List[*]
work as expected. The script replaces the *
in replacement-strings with
whatever matched *
in the find-string.
Regex patterns (regular expression patterns) are denoted by a
find-string that ends with $
. The trailing $
does not become part of
the find-string. For example:
ab(.*)de$: de\1\1ab
A pattern is a plain pattern if it is neither a balanced nor a regex pattern.
The script matches patterns to all parts of return expressions.
Important: The script applies patterns separately to each return expression. Comments never appear in return expressions, and all strings in return values appear as str. As a result, there is no context to worry about context in which patterns are matched. Very short patterns suffice.
This configuration section specifies the files list, prefix lines and output directory. For example:
[Global]
files:
# Files to be used *only* if no files are given on the command line.
# glob.glob wildcards are supported.
~/leo-editor/leo/core/*.py
output_directory:
# The output directory to be used if no --dir option is given.
~/stubs
prefix:
# Lines to be inserted at the start of each stub file.
from typing import TypeVar, Iterable, Tuple
T = TypeVar('T', int, float, complex)
The script matches the find-strings in this section against names of functions and methods. For methods, the script matches find-strings against names of the form:
class_name.method_name
When a find-string matches, the replacement-string becomes the return type in the stub, without any further pattern matching. That is, this section overrides [General Patterns].
Example 1:
[Def Name Patterns]
myFunction: List[str]
Any function named myFunction returns List[str].
Example 2:
[Def Name Patterns]
MyClass.myMethod: str
The myMethod method of the MyClass class returns str.
Example 3:
[Def Name Patterns]
MyClass.do_*: str
All methods of the MyClass class whose names start with "do_" return str.
For each function or method, the script matches the patterns in this section against all parts of all return expressions in each function or method.
The intent of the patterns in this section should be to reduce return expressions to known types. A known type is a either a name of a type class, such as int, str, long, etc. or a type hint, as per Pep 484.
The script always produces a syntactically correct stub, even if the patterns do not reduce the return expression to a known type. For unknown types, the script does the following:
-
Uses Any as the type of the function or method.
-
Follows the stub with a list of comments giving all the return expressions in the function or method.
For example, suppose that the patterns are not sufficient to resolve the return type of:
def foo(a):
if a:
return a+frungify(a)
else:
return defrungify(a)
The script will create this stub:
def foo(a) --> Any: ...
# return a+frungify(a)
# return defrungify(a)
The comments preserve maximal information about return types, which should help the user to supply a more specific return type. The user can do this in two ways by altering the stub files by hand or by adding new patterns to the config file.
The script eliminates most of the drudgery from creating stub files. The script produces syntactically and semantically correct stub files without any patterns at all. Patterns make it easy to make stubs more specific.
Once we create stub files, mypy will check them by doing real type inference. This will find errors both in the stub files and in the program under test. There is now an easy way to use mypy!
Stubs express design intentions and intuitions as well as types. Until now, there has been no practical way of expressing and testing these assumptions. Now there is.
Using mypy, we can be as specific as we like about types. We can simply annotate that d is a dict, or we can say that d is a dict whose keys are strings and whose values are executables with a union of possible signatures. Stubs are the easy way to play with type inference.
Finally, stubs can simplify the general type inference problem. Without type hints or annotations, the type of everything depends on the type of everything else. Stubs could allow robust, maybe even complete, type inference to be done locally. Stubs help mypy to work faster.
The make-stub-files script does for type/design analysis what Leo's c2py command did for converting C sources to python. It eliminates much of the drudgery associated with creating stub files, leaving the programmer to make non-trivial inferences.
Stub files allow us to explore type checking using mypy as a guide and helper. Stub files are both a design document and an executable, checkable, type specification. Stub files allow those with a Python 2 code base to use mypy.
One could imagine a similar insert_annotations script that would inject function annotations into source files using stub files as data. The "reverse" script should be more straightforward than this script.
Edward K. Ream
January 25 to February 15, 2016
Revised, August 5, 2021.
Are you sure you want to use this program? For python 3 programs, including mypy annotations directly in your sources is a better alternative. Leo's new add-mypy-annotations command handles most of the tedious details of inserting annotations.
Version 1.0: August 10, 2021.
- make_stub_files (msf) now requires Python 3.6 or above.
- Use traditional unit tests instead of Leo's deprecated @test nodes. These tests provide 100% coverage.
- Add several Leo buttons.
- Use the coverage and unittest buttons to run tests.
- Use the msf button to run make_stub_files.py on itself.
- Fix all known bugs.
- A new unit test that ensures that all required ast visitors exist.
- Add scripts/wax_off.py.