feat: add custom interpreter support for pixi tasks (close #1844) #3929

Open
wants to merge 4 commits into
base: main
from

fecet
Contributor

@fecet fecet commented Jun 11, 2025

Fixes #1844

Motivation

The current pixi task execution relies on deno_task_shell, which, while powerful, has several limitations:

  1. Limited conditional statement support: deno_task_shell doesn't support complex conditional statements like if-then-else structures
  2. Parameter expansion limitations: Lacks support for advanced parameter expansion features like ${var:-default} and other bash-specific features
  3. Incomplete shell features: Cannot utilize advanced shell-specific features such as nushell's data processing capabilities or bash's advanced scripting functionality

To address these limitations, we introduce custom interpreter support, allowing users to specify particular shells (such as bash, sh, nushell, etc.) to execute tasks.

Implementation Details

Core Changes

  1. Added interpreter field: Introduced an optional interpreter field to task definitions
  2. Separated template processing: Decoupled minijinja template processing from deno_task_shell parsing
  3. Execution path selection: Choose different execution paths based on whether an interpreter is specified

Technical Implementation

1. Task Definition Extension

[tasks]
my-task = { cmd = "echo 'Hello World'", interpreter = "bash" }

2. Execution Flow

  • With interpreter: {interpreter} <<< {processed_command} (by using deno_task_shell::execute_with_pipes)
  • Without interpreter: Use original deno_task_shell execution

3. Template Processing Pipeline

Raw command → minijinja template processing →
├─ With interpreter → Pass directly to interpreter
└─ Without interpreter → deno_task_shell parsing → Execute
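The two execution paths above can be sketched roughly as follows. This is a minimal Python illustration, not the PR's Rust code: `render_template` is a hypothetical stand-in for minijinja (it only handles `{{ name }}` placeholders), `run_task` is an invented name, the interpreter is given as an argv list for simplicity, and the default branch only reports that deno_task_shell would take over.

```python
import re
import subprocess
import sys


def render_template(cmd, args):
    """Stand-in for minijinja: replace {{ name }} placeholders with values from args."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(args[m.group(1)]), cmd)


def run_task(cmd, args, interpreter=None):
    """Render the template first, then pick an execution path (hypothetical helper)."""
    script = render_template(cmd, args)
    if interpreter is None:
        # Default path: pixi would hand the rendered script to deno_task_shell.
        # This sketch only reports the decision instead of executing anything.
        return ("deno_task_shell", script)
    # Interpreter path: the rendered script is sent to the interpreter's stdin.
    result = subprocess.run(
        interpreter, input=script, capture_output=True, text=True, check=True
    )
    return ("interpreter", result.stdout)
```

For example, `run_task("print('hello {{ name }}')", {"name": "pixi"}, [sys.executable, "-"])` renders the template and pipes the result to a Python child process, while omitting the interpreter leaves the rendered command for the default shell.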

Usage Examples

[tasks]
# Execute complex scripts with bash
bash-task = { cmd = "if [ -f file.txt ]; then echo 'exists'; fi", interpreter = "bash" }

# Data processing with nushell
nu-task = { cmd = "ls | where size > 1MB", interpreter = "nu" }

# Ensure POSIX compatibility with sh
posix-task = { cmd = "echo $HOME", interpreter = "sh" }

CLI Usage

# Add a task with custom interpreter
pixi task add my-bash-task "echo 'Hello from bash'" --interpreter bash

# Run task (interpreter is automatically detected from task definition)
pixi run my-bash-task

@fecet fecet force-pushed the feat/task-interpreter branch 3 times, most recently from a0aef07 to f627642 Compare June 13, 2025 10:00
@fecet fecet force-pushed the feat/task-interpreter branch 4 times, most recently from 2ffd9c8 to a71f4bd Compare June 13, 2025 17:30
@zelosleone
Contributor

Hey @fecet

Can you please add more tests for this feature? Preferably they would cover different shells in combination with different operating systems: in rattler-build we hit a lot of cmd/bash compatibility errors on Windows during interpreter selection, so this coverage would give much more confidence that the PR is bug-free.

Also, rattler-build has dedicated support for several interpreters, but in general we could also go through cmd/bash to reach the system PATH, so installed tools could be used that way too. That gives users the freedom to use any interpreter through the system shell while still having dedicated support for nushell/bash/cmd/rscript etc. I think this approach could be nice.

We could also skip parsing the script and hand it directly to the interpreter, which would reduce overhead and make it faster.

@phreed
Contributor

phreed commented Jun 27, 2025

In general, the interpreter can be run without arguments and interpret what arrives on stdin (and write to stdout) using pipes. There may be certain cases where the interpreter also needs some arguments.
The following is contrived:

[tasks.my-task]
cmd = "ls | where size > 1MB"
interpreter = "nu"
args = [ "--execute", "print 'Big Files'"]

Where are the interpreters acquired?

  • build-dependencies
  • host-dependencies
  • run-dependencies
  • dependencies
  • pypi-dependencies (I hope not)

@fecet fecet force-pushed the feat/task-interpreter branch 2 times, most recently from 3626705 to 6c7f612 Compare July 7, 2025 08:18
@phreed
Contributor

phreed commented Jul 7, 2025

Looking at the tests I see that it preserves the previous deno-shell behavior.

cmd: The command being interpreted.

interpreter: (new) The command that is actually run to interpret cmd (what processes this command?). It must read from stdin and write to stdout.
I was presuming the interpreter would be a program.
It appears that it is itself some kind of (deno?) script.

args: (new) Some named arguments substituted (with minijinja?) into the cmd and interpreter elements.

envs: (new) Set some environment parameters.

The "interpreter" is limited to providing the command to start the interpreter.
The "cmd" is sent to the stdin of the interpreter.
How does it distinguish input supplied to the interpreter on the command line from input piped in on stdin?

It was not immediately obvious to me that the tests effectively write a custom interpreter in python.
That is, python itself is not the interpreter.

@phreed
Contributor

phreed commented Jul 7, 2025

Would it be helpful for pixi tasks themselves to be interpreters?

In the tests you can find:

# Test complex interpreter that removes spaces from input
manifest_content["tasks"] = {
    "remove-spaces-blackbox": {
        "interpreter": '''python -c "import sys; data=sys.stdin.read(); sys.stdout.write(data.replace(' ',''))"''',
    },
    "remove-spaces": {"cmd": "echo 'hello world' | pixi run remove-spaces-blackbox"},
}

I think the "interpreter" would be better as an array as that would make the interpretation of the interpreter unnecessary. (Direct use of https://doc.rust-lang.org/std/process/struct.Command.html rather than spawning a command processor first.)
Also, could pixi tasks call each other without a "pixi run", since we already have a perfectly good pixi environment?

manifest_content["tasks"] = {
    "remove-spaces-blackbox": {
        # The script is left unindented so that "python -c" accepts it.
        "interpreter": ["python", "-c", '''
import sys
data = sys.stdin.read()
sys.stdout.write(data.replace(' ', ''))
'''],
    },
    "remove-spaces": {
        "cmd": "echo 'hello world' | remove-spaces-blackbox",
        "interpreter": ["deno"],
    },
}

If tasks could be interpreters, then could it become the following?

manifest_content["tasks"] = {
    "remove-spaces-interpreter": {
        # The script is left unindented so that python accepts it on stdin.
        "cmd": '''
import sys
data = sys.stdin.read()
sys.stdout.write(data.replace(' ', ''))
''',
        "interpreter": ["python"],
    },
    "remove-spaces-blackbox": {
        "interpreter": ["remove-spaces-interpreter"],
    },
    "remove-spaces": {
        "cmd": "echo 'hello world' | remove-spaces-blackbox",
        "interpreter": ["deno"],
    },
}

@fecet fecet force-pushed the feat/task-interpreter branch from 6c7f612 to b0211da Compare July 7, 2025 16:25
@fecet
Contributor Author

fecet commented Jul 7, 2025

Thanks for driving this discussion and for all the detailed reviews so far. I’ve gone back through the thread, and here are my thoughts:
I’d prefer to maintain a minimal, predictable contract between cmd and the interpreter. Deeply baking in every advanced interpreter feature risks over-engineering the common case and could impose performance penalties that most users won’t expect.

The existing tests were really just proof-of-concepts to show how you could chain multiple pixi run invocations. If someone really needs that level of flexibility—and is comfortable with the extra overhead—it already works today. It’s fine to leave full, first-class support for nested interpreters as an opt-in pattern for power users who understand the trade-offs.

If we decide that these advanced features deserve official, ergonomic support, the best path is for the Pixi core maintainers to define exactly what the API should look like:

  • How interpreter chains appear in the schema
  • How arguments and stdin/stdout interplay should work
  • What cross-platform guarantees we provide

@fecet fecet requested a review from pavelzw July 7, 2025 16:41
@fecet
Contributor Author

fecet commented Jul 7, 2025

I’ve just pushed an expanded test suite but I’m not deeply familiar with macOS or Windows internals, so if you spot any gaps, please let me know @zelosleone

@phreed
Contributor

phreed commented Jul 7, 2025

> Thanks for driving this discussion and for all the detailed reviews so far. I’ve gone back through the thread, and here are my thoughts: I’d prefer to maintain a minimal, predictable contract between cmd and the interpreter. Deeply baking in every advanced interpreter feature risks over-engineering the common case and could impose performance penalties that most users won’t expect.

I agree.

My main concern is that the "interpreter" is unnecessarily complex.
Rather than:

interpreter = '''python -c "import ..."'''

Which necessitates the use of 'deno' or something like it to parse.
I think it would be more reliable to use:

interpreter = ["python", "-c", '''import ...''']

The list would be the command and its arguments to https://doc.rust-lang.org/std/process/struct.Command.html
The 'cmd' element is then provided as stdin to that process.

That seems like the opposite of "deeply baking" to me.

Note: The issue about interpreter chains is definitely a separate topic:

  • related to specific interpreters to handle those chains (like deno currently does with pipes)
  • related to depends-on (which currently do recognize task chains)
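The difference between the string form and the list form can be illustrated with a small Python sketch. It is only an analogy for the Rust `Command` usage discussed here: `shlex.split` stands in for whatever parser would have to split a string-form interpreter, and the helper names `argv_from_string` and `run_interpreter` are invented for this example.

```python
import shlex
import subprocess
import sys

# One-line script used as the "-c" argument of the example interpreter.
REMOVE_SPACES = (
    "import sys; data = sys.stdin.read(); sys.stdout.write(data.replace(' ', ''))"
)


def argv_from_string(interpreter):
    """String form: something has to split the command line; shlex stands in for that parser."""
    return shlex.split(interpreter)


def run_interpreter(argv, cmd):
    """List form: argv is passed to the OS unchanged, and cmd arrives on the child's stdin."""
    result = subprocess.run(argv, input=cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

With the list form, `run_interpreter([sys.executable, "-c", REMOVE_SPACES], "hello world")` needs no quoting rules at all, whereas the string form depends on the parser agreeing with the author about where each argument ends.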

@fecet
Contributor Author

fecet commented Jul 8, 2025

@phreed, you’re right to flag that concern, but for now the interpreter field will primarily hold single strings like "bash", "python", or "nu". In this case, treating it as a simple string is far more straightforward. Converting it to a list would add even more complexity to an already substantial PR and make it harder to review. If the team agrees it’s necessary, we can tackle that in a follow-up PR.

@phreed
Contributor

phreed commented Jul 8, 2025

> @phreed, you’re right to flag that concern, but for now the interpreter field will primarily hold single strings like "bash", "python", or "nu". In this case, treating it as a simple string is far more straightforward. Converting it to a list would add even more complexity to an already substantial PR and make it harder to review. If the team agrees it’s necessary, we can tackle that in a follow-up PR.

I made a PR to show what might be entailed in making the changes mentioned.
fecet#1
I also created a bunch of tests (and fixed some minor issues).
I tried to merge in some of your later changes but I may have done it wrong.

@phreed
Contributor

phreed commented Jul 8, 2025

Here are some reasons why this change might be wanted:

Python Buffered/Unbuffered stdout

If the interpreter is ["python","-u"] rather than just "python" (or equivalently ["python"]) the progress of the process can be monitored.
Of course, using ["python","-u"] would incur some overhead and may not be appropriate for quick scripts.
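As a rough illustration of the monitoring use case, the sketch below starts a child Python process with `-u`, sends it a script on stdin, and reads its output line by line. The helper name `stream_lines` and the `CHILD` script are made up for this example, and the Python executable is taken from `sys.executable` rather than from a pixi environment.

```python
import subprocess
import sys

# Child script sent on stdin; it prints one line per step with a short pause.
CHILD = (
    "import time\n"
    "for i in range(3):\n"
    "    print('step', i)\n"
    "    time.sleep(0.05)\n"
)


def stream_lines(argv, script):
    """Send script to the interpreter's stdin and collect stdout line by line."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    proc.stdin.write(script)
    proc.stdin.close()  # EOF tells the child that the script is complete
    lines = []
    for line in proc.stdout:
        # With "-u" each line becomes available as soon as the child prints it;
        # without it, output to a pipe is typically held back in a buffer.
        lines.append(line.rstrip("\n"))
    proc.wait()
    return lines
```

Calling `stream_lines([sys.executable, "-u", "-"], CHILD)` collects the three progress lines; dropping `-u` yields the same lines, but usually only once the child's buffer is flushed at exit.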

The project may be an interpreter

It is pretty common for an application to be an interpreter.
For example, I am working on a nushell plugin.
I want to pass different inputs to my nushell plugin to test it.

@fecet fecet force-pushed the feat/task-interpreter branch 4 times, most recently from e769f42 to b2eeb5e Compare July 10, 2025 17:11
@fecet
Contributor Author

fecet commented Jul 11, 2025

@phreed @pavelzw The current interpreter accepts input in the form of template strings and uses a tempfile approach similar to the one used by GitHub workflow shells. I think this should be general enough. Feel free to ask me to add more tests.
https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#defaultsrunshell
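A tempfile approach of this kind might look roughly like the sketch below. The `{0}` placeholder is borrowed from the GitHub Actions custom shell syntax; whether this PR uses the same placeholder is an assumption, and `run_with_tempfile` is a hypothetical helper written in Python rather than the PR's Rust.

```python
import os
import subprocess
import sys
import tempfile


def run_with_tempfile(interpreter_template, script, suffix=""):
    """Write script to a temporary file, substitute its path for {0}, and run the result."""
    fd, path = tempfile.mkstemp(suffix=suffix, text=True)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(script)
        # Replace the placeholder in every argv element with the script path.
        argv = [part.replace("{0}", path) for part in interpreter_template]
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        return result.stdout
    finally:
        # Remove the temporary script whether or not the interpreter succeeded.
        os.unlink(path)
```

For instance, `run_with_tempfile([sys.executable, "{0}"], "print('from tempfile')", suffix=".py")` runs the script from a file, which leaves the child's stdin free for other input.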

@fecet fecet force-pushed the feat/task-interpreter branch from 774a209 to 93981a6 Compare July 11, 2025 13:14
@fecet fecet force-pushed the feat/task-interpreter branch 3 times, most recently from 94d0fed to d6b8001 Compare July 12, 2025 09:17
@fecet fecet requested a review from phreed July 12, 2025 09:20
Contributor

@phreed phreed left a comment

Those changes look good.
The valid test set looks sufficient.
The invalid test set could be expanded but I think that should be discovered based on usage.

As mentioned before:

  • I think the complexity of using deno_task_shell could be confusing in some cases.
  • deno_task_shell behavior needs to be preserved for backwards compatibility.

@fecet fecet force-pushed the feat/task-interpreter branch 4 times, most recently from 2240833 to 5105e9a Compare July 12, 2025 17:50
@phreed
Contributor

phreed commented Jul 12, 2025

I did a quick check to see how rattler-build handles the cmd (stdin vs. tempfile).
It appears they use a tempfile: https://github.com/prefix-dev/rattler-build/blob/781746d5554ab70b4ef106e6002c4b460485c189/src/script/interpreter/nushell.rs#L121

@nichmor
Contributor

nichmor commented Jul 18, 2025

thanks for your work! I've reviewed it and left some small comments!

@fecet fecet force-pushed the feat/task-interpreter branch from 5105e9a to 3954fd8 Compare July 18, 2025 13:45
@fecet fecet requested a review from nichmor July 18, 2025 13:46
@fecet fecet force-pushed the feat/task-interpreter branch from 3954fd8 to 78062c8 Compare July 18, 2025 13:54
Development

Successfully merging this pull request may close these issues.

Allow alternative cross platform shells to be used instead of deno
5 participants