grass.tools: Add API to access tools as functions #2923


Draft
wants to merge 79 commits into
base: main
from

Conversation

wenzeslaus
Member

@wenzeslaus wenzeslaus commented Apr 18, 2023

This adds a Tools class which allows GRASS tools (modules) to be accessed using methods. Once an instance is created, calling a tool is calling a function (method), similarly to grass.jupyter.Map. Unlike grass.script, this does not require a generic function name, and unlike the grass.pygrass module shortcuts, this does not require special objects to mimic the module families.

Outputs are handled through a returned object which is the result of automatic capture of outputs and can do conversions from known formats using properties.

The code is included under the new grass.experimental package, which allows merging the code even when further breaking changes are anticipated.

Features

Function-like calling of tools:

  • The return code, standard output, and error output are all part of one result object, similar to what subprocess.run returns.
  • Access and post-processing of the text result (standard output) is done by the result object (its properties or methods).
  • Additionally, the result object can be directly indexed to access a JSON parsed as a Python list or dictionary.
  • Standard input is passed as parameter_name=io.StringIO which takes care of input="-" and piping the text into the subprocess.
  • Tool failure causes an exception with an error message being part of the exception (traceback).
  • A session or env is accepted when creating an object (Tools(session=session)).
  • Code completion for function names (through __dir__) and help() calls work, even outside of a session.
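
To make the result-object idea from the list above more concrete, here is a minimal sketch of an object that bundles the return code with the captured outputs and can be indexed to reach parsed JSON. The class name ToolResult and its attributes are assumptions made for illustration only; this is not the class from this PR.

```python
from dataclasses import dataclass
from json import loads


@dataclass
class ToolResult:
    """Hypothetical stand-in for the result object (not the PR's class)."""

    returncode: int
    stdout: str
    stderr: str = ""

    @property
    def text(self):
        # Standard output with surrounding whitespace removed.
        return self.stdout.strip()

    @property
    def json(self):
        # Parse standard output as JSON on each access (no caching here).
        return loads(self.stdout)

    def __getitem__(self, key):
        # Indexing delegates to the parsed JSON (dictionary key or list index).
        return self.json[key]


# Sample output written by hand for the sketch, not produced by a real tool.
result = ToolResult(returncode=0, stdout='{"rows": 100, "cols": 100}\n')
rows = result["rows"]
```

With this shape, a caller can use the same object for checking the return code, reading text, and indexing into JSON without picking a different function up front.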

Other functionality:

  • High-level and low-level usage is possible with four different functions: taking Python function parameters as tool parameters or taking a list of strings forming a command, and processing the parameters before the subprocess call or leaving the parameters as is (2x2 matrix).
  • Handling of inputs and outputs can be customized (e.g., not capturing stdout and stderr).
  • The env parameter is also accepted by individual functions (as with run_command).
  • The Tools object is a context manager, so it can be created with other context managers within one with statement and can include cleanup code in the future.
  • Tools-object-level overwrite and verbosity setting including limited support for the reversed (False) setting which is not possible with the standard flags at the tool level (there is no --overwrite=False in CLI).
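
The difference between taking Python function parameters and taking a list of strings can be pictured as one conversion step. The sketch below shows what such a step might look like; build_command is a hypothetical helper written for this illustration, and the exact flag handling in the PR may differ.

```python
def build_command(tool_name, *, flags="", overwrite=False, **parameters):
    """Turn Python keyword arguments into a list of strings forming a command.

    Hypothetical helper for illustration; not a function from this PR.
    """
    command = [tool_name]
    if flags:
        # One-letter flags are passed together after a single dash.
        command.append(f"-{flags}")
    if overwrite:
        command.append("--overwrite")
    for name, value in parameters.items():
        if isinstance(value, (list, tuple)):
            # Multiple values are joined with commas.
            value = ",".join(str(item) for item in value)
        command.append(f"{name}={value}")
    return command


command = build_command("r.random.surface", output="surface", seed=42, overwrite=True)
```

A low-level entry point would accept a list like this directly, while a high-level one would build it from keyword arguments first.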

Examples

Run a tool:

tools = Tools()
tools.r_random_surface(output="surface", seed=42)

Create a project, start an isolated session, and run tools (XY project for the example):

project = tmpdir / "project"
gs.create_project(project)
with (
    gs.setup.init(project, env=os.environ.copy()) as session,
    Tools(session=session) as tools,
):
    tools.g_region(rows=100, cols=100)
    tools.r_random_surface(output="surface", seed=42)

Work with return values (tool with JSON output):

# accessing a single value from the result:
assert tools.r_info(map="surface", format="json")["rows"] == 100
data = tools.r_info(map="surface", format="json")
# accessing more than one value from the result:
assert data["rows"] == 100
assert data["cols"] == 100

Text input as standard input:

tools.v_in_ascii(
    input=io.StringIO("13.45,29.96,200\n"),
    output="point",
    separator=",",
)
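
As a rough sketch of how the io.StringIO handling could work, the following replaces a StringIO parameter value with "-" and pipes its text to the subprocess. The function run_with_stdin is hypothetical, and a small Python child process stands in for a GRASS tool so that the example runs without a GRASS session.

```python
import io
import subprocess
import sys


def run_with_stdin(command_prefix, **parameters):
    """Replace an io.StringIO value by "-" and pipe its text to standard input."""
    stdin_text = None
    arguments = []
    for name, value in parameters.items():
        if isinstance(value, io.StringIO):
            stdin_text = value.getvalue()
            value = "-"
        arguments.append(f"{name}={value}")
    if stdin_text is None:
        # No text to pass: make sure the child does not wait on our stdin.
        stdin_options = {"stdin": subprocess.DEVNULL}
    else:
        stdin_options = {"input": stdin_text}
    return subprocess.run(
        [*command_prefix, *arguments],
        capture_output=True,
        text=True,
        check=False,
        **stdin_options,
    )


# Stand-in for a tool: prints its arguments, then echoes standard input.
echo_tool = [
    sys.executable,
    "-c",
    "import sys; print(sys.argv[1:]); print(sys.stdin.read(), end='')",
]
result = run_with_stdin(
    echo_tool, input=io.StringIO("13.45,29.96,200\n"), output="point"
)
```

The caller never writes input="-" by hand; the parameter that receives the StringIO object decides where the piped text is used.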

Work with RegionManager and MaskManager (test code):

project = tmpdir / "project"
gs.create_project(project)
with (
    gs.setup.init(project, env=os.environ.copy()) as session,
    gs.RegionManager(rows=100, cols=100, env=session.env),
    gs.MaskManager(env=session.env),
    Tools(session=session) as tools,
):
    # The tools here respect the local environment,
    # but it does not need to be passed explicitly.
    tools.r_random_surface(output="surface", seed=42)
    tools.r_mask(raster="surface")
    assert tools.r_mask_status(format="json")["present"]
    assert tools.g_region(flags="p", format="json")["rows"] == 100

@landam
Member

landam commented Apr 20, 2023

It seems to be a useful addition. On the other hand, we already have two APIs to run GRASS modules: grass.script.*_command() and grass.pygrass.modules, which is already confusing for the user. What is the benefit of a third one? It would be useful to merge the existing APIs into a single one instead of introducing another one.

@wenzeslaus
Member Author

wenzeslaus commented Apr 21, 2023

It seems to be a useful addition.

I still need to provide more context for this, but do you see some benefits already?

On the other hand we have already two APIs to run GRASS modules: grass.script.*_command() and grass.pygrass.modules which is already confusing for the user.

The intro to this is obviously xkcd Standards.

I'm not happy with the two competing interfaces. It's almost three, because we have Module and then also shortcuts.

As far as I understand, grass.script.*_command() was written to closely mimic the Bash experience with minimal involvement of Python. The Python layer mostly just avoids the need to pass all parameters as strings.

grass.pygrass.modules was written to mimic the grass.script.*_command() API and to manipulate the module calls themselves.

What is a benefit of the third one?

The design idea is 1) to make the module (tool) calls as close to Python function calls as possible and 2) to access the results conveniently. To access the (text) results, it tries to mimic subprocess.run.

Additionally, it tries to 1) provide consistent access to all modules and 2) allow for extensibility, e.g., associating session parameters or computational region with a Tools object rather than passing it to every method.
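
One way to picture how tool calls can look like Python function calls is attribute lookup that maps a method name to a tool name. The sketch below assumes underscores in the method name map to dots in the tool name; ToolsSketch and its list of known tools are hypothetical and only illustrate the mechanism, not the code in this PR.

```python
class ToolsSketch:
    """Hypothetical illustration of function-like access to tools."""

    def __init__(self, known_tools):
        self._known_tools = set(known_tools)

    def run(self, tool_name, **parameters):
        # A real implementation would start a subprocess here;
        # this sketch only reports what would be called.
        return tool_name, parameters

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        tool_name = name.replace("_", ".")
        if tool_name not in self._known_tools:
            raise AttributeError(f"No tool named {tool_name}")

        def wrapper(**parameters):
            return self.run(tool_name, **parameters)

        return wrapper

    def __dir__(self):
        # Expose tool names to code completion alongside regular attributes.
        tool_functions = {tool.replace(".", "_") for tool in self._known_tools}
        return sorted(set(super().__dir__()) | tool_functions)


tools = ToolsSketch(["r.random.surface", "g.region"])
called = tools.r_random_surface(output="surface", seed=42)
```

Because the object holds state such as the list of tools (or, in the real API, a session), that state does not need to be passed to each call.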

The existing APIs are more general in some ways, especially because they make no assumptions about the output or its size. This API makes the assumption that you want the text output in Python or that it is something small and you can just ignore it. If not, you need to use a more general API. After all, Tools itself is using pipe_command to do the job.

It would be useful to merge the existing APIs into a single one instead of introducing another one.

Given the different goals of the two APIs, I was not able to figure out how these can be merged. For example, the Module class from grass.pygrass was supposed to be a drop-in replacement for run_command, but it was not used that way much (maybe because it forces you to use a class as a function). Any suggestions? What would be the features and aspects of each API worth keeping? For example, the Tools object might be able to create instances of the Module class.

I can also see that some parts of the new API could be part of the old ones like output-parsing related properties for the Module class, but there are some existing issues which the new API is trying to fix such as r.slope_aspect spelling in PyGRASS shortcuts and Python function name plus tools name as a string in grass.script.

Finally, the subprocess module changed too over the years, introducing new functions with run being the latest addition, so reevaluation of our APIs seems prudent even if it involves adding functions as subprocess did.

Anyway, I think some unification would be an ideal scenario.

@wenzeslaus wenzeslaus force-pushed the add-session-tools-object branch from 96b1d0c to 0c21f1a Compare April 22, 2023 18:34
@wenzeslaus
Member Author

wenzeslaus commented Apr 22, 2023

This is what exceptions currently look like in this PR: The error (whole stderr) is part of the exception, i.e., always printed with the traceback, not elsewhere, and it is under the traceback, not above like now (or even somewhere else in case of notebooks and GUI).

Traceback (most recent call last):
  File "experimental/tools.py", line 252, in <module>
    _test()
  File "experimental/tools.py", line 241, in _test
    tools_pro.feed_input_to("13.45,29.96,200").v_in_ascii(
  File "experimental/tools.py", line 185, in wrapper
    return self.run(grass_module, **kwargs)
  File "experimental/tools.py", line 148, in run
    raise gs.CalledModuleError(
grass.exceptions.CalledModuleError: Module run `v.in.ascii input=- output=point format=xstandard` ended with an error.
The subprocess ended with a non-zero return code: 1. See the following errors:

ERROR: Value <xstandard> out of range for parameter <format>
	Legal range: point,standard
Traceback (most recent call last):
  File "experimental/tools.py", line 252, in <module>
    _test()
  File "experimental/tools.py", line 241, in _test
    tools_pro.feed_input_to("13.45,29.96,200").v_in_ascii(
  File "experimental/tools.py", line 185, in wrapper
    return self.run(grass_module, **kwargs)
  File "experimental/tools.py", line 148, in run
    raise gs.CalledModuleError(
grass.exceptions.CalledModuleError: Module run `v.in.ascii input=- output=point format=standard` ended with an error.
The subprocess ended with a non-zero return code: 1. See the following errors:
WARNING: Vector map <point> already exists and will be overwritten
WARNING: Unexpected data in vector header:
         [13.45,29.96,200]
ERROR: Import failed
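
The behavior shown in the tracebacks above, with the captured stderr placed in the exception message, can be sketched as follows. ToolError and run_checked are hypothetical names used only for this illustration (the PR raises CalledModuleError), and a small Python child process stands in for a failing tool.

```python
import subprocess
import sys


class ToolError(RuntimeError):
    """Hypothetical exception type used only in this sketch."""


def run_checked(command):
    """Run a command and raise an error which carries the captured stderr."""
    process = subprocess.run(command, capture_output=True, text=True, check=False)
    if process.returncode != 0:
        raise ToolError(
            f"Tool run `{' '.join(command)}` ended with an error.\n"
            "The subprocess ended with a non-zero return code: "
            f"{process.returncode}. See the following errors:\n{process.stderr}"
        )
    return process


# Stand-in for a failing tool: writes an error message and exits with code 1.
failing = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('ERROR: Import failed'); sys.exit(1)",
]
try:
    run_checked(failing)
except ToolError as error:
    message = str(error)
else:
    message = ""
```

Since the error text travels with the exception, it shows up wherever the traceback is displayed, including notebooks and GUI consoles.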

wenzeslaus added 10 commits June 3, 2023 23:57
This adds a Tools class which allows GRASS tools (modules) to be accessed using methods. Once an instance is created, calling a tool is calling a function (method), similarly to grass.jupyter.Map. Unlike grass.script, this does not require a generic function name, and unlike the grass.pygrass module shortcuts, this does not require special objects to mimic the module families.

Outputs are handled through a returned object which is the result of automatic capture of outputs and can do conversions from known formats using properties.

A usage example is in the _test() function in the file.

The code is included under the new grass.experimental package, which allows merging the code even when further breaking changes are anticipated.
@wenzeslaus wenzeslaus force-pushed the add-session-tools-object branch from 7996926 to 24c27e6 Compare June 3, 2023 22:56
@neteler neteler added this to the 8.4.0 milestone Aug 16, 2023
@landam landam added enhancement New feature or request Python Related code is in Python labels Nov 20, 2023
@wenzeslaus wenzeslaus modified the milestones: 8.4.0, Future Apr 26, 2024
@echoix echoix added the conflicts/needs rebase Rebase to or merge with the latest base branch is needed label Nov 7, 2024
@echoix echoix removed the conflicts/needs rebase Rebase to or merge with the latest base branch is needed label Nov 11, 2024
@echoix
Member

echoix commented Nov 11, 2024

Solved conflicts

@wenzeslaus
Member Author

Comparing to and migrating from run_command family of functions

Here are examples of what the different use cases of run_command and friends look like with the Tools API, organized by the grass.script counterparts to Tools API calls.

You can just thumbs up this if you find that reasonable, but feel free to comment, too. The new API keeps the focus on the tools themselves rather than having the user go through different functions to call the tool with different inputs and outputs (run_command vs parse_command vs read_command vs write_command) or even through dedicated wrappers to get the output of the tool in a form reasonable in a Python context (g.region as region, g.list as list_strings, etc.).

Imports

# original:
import grass.script as gs
# replacement
from grass.experimental.tools import Tools
import io  # only needed when stdin is used

run_command - just run the tool

# original:
gs.run_command(
    "r.random.surface", output="surface", seed=42
)
# replacement using the run function which is syntactically close to run_command:
tools = Tools()  # same for one or multiple calls
tools.run("r.random.surface", output="surface2", seed=42)  # name as a string
# assuming we already have tools and using the function syntax:
tools.r_random_surface(output="surface3", seed=42)  # name as a function

write_command - provide standard input (text)

# original:
gs.write_command(
    "v.in.ascii",
    input="-",
    output="point1",
    separator=",",
    stdin="13.45,29.96,200\n",
)
# replacement:
tools.run(
    "v.in.ascii",
    input=io.StringIO("13.45,29.96,200\n"),
    output="point2",
    separator=",",
)
# or with function name syntax:
tools.v_in_ascii(
    input=io.StringIO("13.45,29.96,200\n"),
    output="point3",
    separator=",",
)

read_command - get standard output (text)

# original:
assert (
    gs.read_command("g.region", flags="c")
    == "center easting:  0.500000\ncenter northing: 0.500000\n"
)
# replacement:
assert (
    tools.run("g.region", flags="c").stdout
    == "center easting:  0.500000\ncenter northing: 0.500000\n"
)
# or with function name syntax:
assert (
    tools.g_region(flags="c").text
    == "center easting:  0.500000\ncenter northing: 0.500000"
)

parse_command - get machine readable standard output

# original (numbers are strings):
assert gs.parse_command(
    "g.region", flags="c", format="shell"
) == {
    "center_easting": "0.500000",
    "center_northing": "0.500000",
}
# numbers are always numbers with JSON:
assert gs.parse_command(
    "g.region", flags="c", format="json"
) == {
    "center_easting": 0.5,
    "center_northing": 0.5,
}
# replacement with format=shell (numbers are not strings, but actual numbers as in JSON
# if they convert to Python int or float):
assert tools.run("g.region", flags="c", format="shell").keyval == {
    "center_easting": 0.5,
    "center_northing": 0.5,
}
# parse_command with JSON and the function call syntax:
assert tools.g_region(flags="c", format="json").json == {
    "center_easting": 0.5,
    "center_northing": 0.5,
}
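
The note above about numbers being actual numbers with format=shell suggests a conversion step after splitting the key-value pairs. A minimal sketch of such a step, assuming values which convert to int or float should do so and everything else stays a string, could look like this; parse_key_value and convert_value are hypothetical helpers, not functions from this PR.

```python
def convert_value(value):
    """Return an int or float if the text converts, otherwise the text itself."""
    for number_type in (int, float):
        try:
            return number_type(value)
        except ValueError:
            continue
    return value


def parse_key_value(text, separator="="):
    """Parse shell-script style key=value lines into a dictionary."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(separator)
        result[key.strip()] = convert_value(value.strip())
    return result


# Sample text written by hand to resemble shell-style tool output.
parsed = parse_key_value("center_easting=0.500000\ncenter_northing=0.500000\n")
```

Trying int before float keeps whole numbers such as row counts as integers rather than turning them into floats.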

parse_command storing JSON output in a variable and accessing individual values

# original:
data = gs.parse_command(
    "g.region", flags="c", format="json"
)
assert data["center_easting"] == 0.5
assert data["center_northing"] == 0.5
# replacement:
data = tools.g_region(flags="c", format="json")
assert data["center_easting"] == 0.5
assert data["center_northing"] == 0.5

Dedicated wrappers: r.mapcalc

# mapcalc wrapper of r.mapcalc
# original:
gs.mapcalc("a = 1")
# replacement for short expressions:
tools.r_mapcalc(expression="b = 1")
# replacement for long expressions:
tools.r_mapcalc(file=io.StringIO("c = 1"))

Dedicated wrappers: g.list

# test data preparation (for comparison of the results):
names = ["a", "b", "c", "surface", "surface2", "surface3"]
# original:
assert gs.list_grouped("raster")["PERMANENT"] == names
# replacement (using the JSON output of g.list):
assert [
    item["name"]
    for item in tools.g_list(type="raster", format="json")
    if item["mapset"] == "PERMANENT"
] == names
# original and replacement (directly comparing the results):
assert gs.list_strings("raster") == [
    item["fullname"] for item in tools.g_list(type="raster", format="json")
]
# original and replacement (directly comparing the results):
assert gs.list_pairs("raster") == [
    (item["name"], item["mapset"])
    for item in tools.g_list(type="raster", format="json")
]

Dedicated wrappers: all other tools

# Wrappers in grass.script usually parse shell-script style key-value pairs,
# and convert values from strings to numbers, e.g. g.region:
assert gs.region()["rows"] == 1
# Conversion is done automatically in Tools and/or with JSON, and the basic tool
# call syntax is more lightweight, so the direct tool call is not that different
# from a wrapper. Direct tool calling also benefits from better defaults (e.g.,
# printing more in JSON) and more consistent tool behavior (e.g., tools accepting
# format="json"). So, direct call of g.region to obtain the number of rows:
assert tools.g_region(flags="p", format="json")["rows"] == 1

run_command with returncode

# original:
assert (
    gs.run_command(
        "r.mask.status", flags="t", errors="status"
    )
    == 1
)
# replacement:
tools = Tools(errors="ignore")
assert tools.run("r.mask.status", flags="t").returncode == 1
assert tools.r_mask_status(flags="t").returncode == 1

run_command with overwrite

# original:
gs.run_command(
    "r.random.surface",
    output="surface",
    seed=42,
    overwrite=True,
)
# replacement:
tools = Tools()
tools.r_random_surface(output="surface", seed=42, overwrite=True)
# or with global overwrite:
tools = Tools(overwrite=True)
tools.r_random_surface(output="surface", seed=42)

@wenzeslaus
Member Author

I updated documentation of the Tools class and documentation of the test functions. I also updated the PR description to reflect the latest state.

Does anyone have any unanswered questions about this PR or the Tools API in general? I'm leaning towards moving it from grass.experimental.tools to grass.tools.

@wenzeslaus wenzeslaus modified the milestones: Future, 8.5.0 Jul 2, 2025
@wenzeslaus wenzeslaus changed the title grass.experimental: Add object to access tools as functions grass.tools: Add object to access tools as functions Jul 2, 2025
@wenzeslaus wenzeslaus changed the title grass.tools: Add object to access tools as functions grass.tools: Add API to access tools as functions Jul 2, 2025
@github-actions github-actions bot added the CMake label Jul 4, 2025
Labels
CMake enhancement New feature or request libraries Python Related code is in Python tests Related to Test Suite

4 participants