Conversation

FranjoMindek (Contributor) commented on Dec 1, 2025

Description

What are "ephemeral" tests?
Before this PR, waspc/e2e-tests only had the snapshot tests. Those work in a way where we want to keep their artifacts (snapshots) as git-tracked files. Ephemeral tests, on the other hand, are just normal tests where we don't want to keep any artifacts after running them. So think of them as normal tests; we just use this name to keep them distinct from the snapshot tests.

image

Before going too deep into the solution, be aware that we are aiming to refactor the whole of waspc/e2e-tests by moving it away from the current unix shell commands approach. Instead, we will write all of our logic in pure Haskell, to make the tests cross-platform compatible. That is why the tests which would require complex unix shell command implementations were only partially implemented: since the unix shell command logic will be dropped, they are not worth the time investment.
More on cross platform compatible e2e tests here: #3404

In short, we want these tests to be a net positive gain, not to cover everything. Standard good code quality rules still apply.

The CLI commands table that the implementation is based on can be found here: https://www.notion.so/wasp-lang/CLI-Testing-24518a74854c8051b0fef88533592504?source=copy_link
It lists which commands we are testing, which we are not, and why.

Note: since we now have an additional variant of tests besides the snapshot tests, I decided to refactor the structure so that the snapshot test logic is moved under the waspc/e2e-tests/SnapshotTest folder.
Changes related to moving snapshot tests folder are in a separate PR: #3468

Ephemeral tests consist of a name (which creates a same-named directory under EphemeralTest/temp) and ephemeral test cases. Each ephemeral test case is a unix shell command, and the cases are executed sequentially. This is different from snapshot tests, which only ever have a single unix shell command. This approach makes it much easier to build the unix shell commands (since we can split them into steps, i.e. test cases), and it lets us validate the individual test cases separately. A test case's validity is checked via its exit status.
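Roughly, the shape of an ephemeral test is along these lines (an illustrative sketch; the exact record and field names in the code may differ):

import ShellCommands (ShellCommand)

-- Illustrative sketch only; exact names in the PR may differ.
-- An ephemeral test: a name (used for its directory under EphemeralTest/temp)
-- plus an ordered list of test cases, each being a single unix shell command.
data EphemeralTest = EphemeralTest
  { _ephemeralTestName :: String,
    _ephemeralTestCases :: [EphemeralTestCase]
  }

data EphemeralTestCase = EphemeralTestCase
  { _ephemeralTestCaseName :: String,
    _ephemeralTestCaseCommand :: ShellCommand
  }

-- A case passes when its shell command exits with status 0; cases run in order.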

image

Sequentiality is enforced by executing the unix shell commands while creating Hspec's Spec; the captured exit statuses are only translated into SUCCESS/FAIL after all of them have run. I went this route because I couldn't force Tasty to run them in sequence, even when using the sequential function.
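To illustrate the mechanism (a simplified sketch, not the exact code from this PR):

import System.Exit (ExitCode (..))
import System.Process (system)
import Test.Hspec

-- Simplified sketch: building the Spec itself runs the shell commands (and
-- Spec construction is not parallelized, hence the sequential execution);
-- the resulting Spec only asserts the exit codes captured along the way.
buildEphemeralSpec :: String -> [(String, String)] -> IO Spec
buildEphemeralSpec testName cases = do
  results <- mapM (\(caseName, cmd) -> (,) caseName <$> system cmd) cases
  return $
    describe testName $
      mapM_
        (\(caseName, exitCode) -> it caseName (exitCode `shouldBe` ExitSuccess))
        results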

Blocked by #3480, which enables us to use the wasp db reset command non-interactively. This PR assumes that PR is merged.

Fixes: #1963

Type of change

  • 🔧 Just code/docs improvement
  • 🐞 Bug fix
  • 🚀 New/improved feature
  • 💥 Breaking change

FranjoMindek marked this pull request as ready for review on December 2, 2025, 19:00
Martinsos (Member) commented:

Cool! I won't get into proper reviewing if others have time, because I don't have enough time to do it properly, but I would recommend @cprecioso as a reviewer: he is new to this piece of code and will both benefit from learning about it and offer a fresh perspective.

My only quick feedback is: do we have to call them "ephemeral"? Can't they just be named "tests", with "snapshot" tests then being a kind of specialization? Ephemeral sounds like there is something unusually ephemeral happening, but then one realizes "oh ok, they are just normal tests", so it would be nice to avoid that misdirection. I guess you maybe did so because you have Tests and then you have SnapshotTests and you needed to somehow name these guys too, but maybe that is then a design issue (or maybe not).

Sequential part -> I hope that when you said they have to execute sequentially, you meant a single test runs its content sequentially, not that all of them have to wait for each other? If all tests actually have to execute sequentially, that kind of sucks, doesn't it? In that case it would be good to understand whether we can avoid that limitation.

Ok, this is my quick take, I hope somebody else can take this one over.

cprecioso (Member) commented on Dec 3, 2025:

Hey, didn't take a look yet, just a heads-up that I won't be able to fit this review in before I go on vacation. I can do it when I come back.

FranjoMindek (Contributor, Author) commented on Dec 3, 2025:

> My only quick feedback is: do we have to call them "ephemeral"? Can't they just be named "tests", with "snapshot" tests then being a kind of specialization? Ephemeral sounds like there is something unusually ephemeral happening, but then one realizes "oh ok, they are just normal tests", so it would be nice to avoid that misdirection. I guess you maybe did so because you have Tests and then you have SnapshotTests and you needed to somehow name these guys too, but maybe that is then a design issue (or maybe not).

Yes. It's simply giving them a name. I could name them just Tests instead, though that makes it feel like something which should also affect snapshot tests, but it does not; it's a separate variant.

> Sequential part -> I hope that when you said they have to execute sequentially, you meant a single test runs its content sequentially, not that all of them have to wait for each other? If all tests actually have to execute sequentially, that kind of sucks, doesn't it? In that case it would be good to understand whether we can avoid that limitation.

Sadly, they are fully sequential precisely because "sequentiality is enforced by executing the unix shell commands while creating Hspec's Spec". Since spec creation is not done in parallel, their execution ends up sequential. I only ever managed to run everything in parallel or everything sequentially; I didn't manage to run the ephemeral tests in parallel while each test's cases run in sequence. Even when I use Hspec's sequential execution, the next shell command starts before the previous one has finished.

Do note that currently even the snapshot tests are executed sequentially (the project generation part, which is also the most expensive part time-wise, is done sequentially before running the TestTree).

So we do have a lot to gain by optimizing both of them.

> Hey, didn't take a look yet, just a heads-up that I won't be able to fit this review in before I go on vacation. I can do it when I come back.

That is fine, I still have to fix the issue that interactive tests (e.g. wasp db reset or wasp new) don't work on GitHub runners.

cprecioso (Member) commented:

Maybe "Regression tests" is what we want?

Martinsos (Member) commented:

I don't think "regression tests" term is used in this way. I would just call them Tests then and that is ok, they are just normal, general tests.

FranjoMindek self-assigned this on Dec 11, 2025
FranjoMindek linked an issue on Dec 11, 2025 that may be closed by this pull request
# Conflicts:
#	waspc/e2e-tests/SnapshotTest/snapshots/kitchen-sink-golden/wasp-app/.wasp/out/sdk/wasp/client/hooks.ts
#	waspc/e2e-tests/SnapshotTest/snapshots/wasp-build-golden/wasp-app/.wasp/build/sdk/wasp/client/hooks.ts
#	waspc/e2e-tests/SnapshotTest/snapshots/wasp-build-golden/wasp-app/.wasp/out/sdk/wasp/client/hooks.ts
#	waspc/e2e-tests/SnapshotTest/snapshots/wasp-compile-golden/wasp-app/.wasp/out/sdk/wasp/client/hooks.ts
#	waspc/e2e-tests/SnapshotTest/snapshots/wasp-migrate-golden/wasp-app/.wasp/out/sdk/wasp/client/hooks.ts

cprecioso (Member) left a comment:

Okay, so I think this is a super good effort, but IMO it requires a bit of restructuring in this part, otherwise we'd be overcomplicating testing a lot (and we want to make tests one of the easiest things to author in our codebase!)

So, this is how *I* would go about it (I'm very open to discussion and to changing my mind):

First, I would split this effort into multiple, smaller-scoped PRs:

flowchart BT
  main@{ shape: junction }
  fin([ Done! ])

  main --> discover[Integrate tasty-discover] --> add-simple
  main --> refactor[Refactor DSL] --> add-simple[Add simple tests] --> add-interactive[Add interactive tests] --> fin

The main thing would be in the "Refactor DSL" PR, where I would mainly migrate off the complex shell builder DSL and just define tests as IO () actions, i.e. regular Hspec tests. You can use shouldBe and other functions inside helpers to achieve a terse, descriptive syntax. The IO monad is the straightforward way to describe these kinds of actions, letting us do the logic on the Haskell side instead of makeshift shell scripting.

I would implement this refactor for the existing tests and then implement the new tests on top, but you could decide to do it just for the new tests; we can discuss. The idea is that this way of defining tests could work for both snapshot and non-snapshot tests, the only difference being that snapshot tests would call an assertSnapshot function (well, however it's called in tasty-golden).

With that, the sequential/parallel problem could be solved, and the interactive commands would be easier to test (with access to stdin and stderr, and regular text manipulation on the Haskell side). The way we author the new tests, in this PR and in the future, would also be greatly simplified.
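As a rough sketch of what I mean (the exact wasp CLI flags and the asserted file path here are illustrative, I haven't double-checked them):

import System.Directory (doesFileExist, withCurrentDirectory)
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)
import Test.Hspec

spec :: Spec
spec =
  describe "wasp new" $
    it "scaffolds a minimal project" $
      -- Hypothetical working dir; in practice this would be a fresh temp dir.
      withCurrentDirectory "/tmp/wasp-e2e-playground" $ do
        (exitCode, _out, _err) <-
          readProcessWithExitCode "wasp" ["new", "my-app", "-t", "minimal"] ""
        exitCode `shouldBe` ExitSuccess
        doesFileExist "my-app/main.wasp" `shouldReturn` True

A snapshot test would look the same, just ending with a golden-file assertion instead of the plain shouldBes.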

What do you think?

Review comment (Member):

Is this file no longer needed?

Comment on lines +46 to +47
waspNewMinimalEphemeralTest,
waspNewMinimalInteractiveEphemeralTest,
Review comment (Member):

If not doing tasty-discover: Instead of manually enumerating the tests here, can we make this file export an array of its tests?
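For example, something along these lines (just a sketch; the list and names would follow whatever this module actually defines):

-- Sketch only: export one list per test module instead of each test individually
-- (import of the EphemeralTest type and the individual test definitions omitted).
module EphemeralTest.WaspNewEphemeralTest (waspNewEphemeralTests) where

waspNewEphemeralTests :: [EphemeralTest]
waspNewEphemeralTests =
  [ waspNewMinimalEphemeralTest,
    waspNewMinimalInteractiveEphemeralTest
    -- ..., plus the rest of this module's wasp new tests
  ]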

import EphemeralTest.WaspDepsEphemeralTest (waspDepsEphemeralTest)
import EphemeralTest.WaspDockerfileEphemeralTest (waspDockerfileEphemeralTest)
import EphemeralTest.WaspInfoEphemeralTest (waspInfoEphemeralTest)
import EphemeralTest.WaspNewEphemeralTest (waspNewBasicEphemeralTest, waspNewBasicInteractiveEphemeralTest, waspNewMinimalEphemeralTest, waspNewMinimalInteractiveEphemeralTest, waspNewSaasEphemeralTest, waspNewSaasInteractiveEphemeralTest)
Review comment (Member):

If not doing tasty-discover: I'd split this import into multiple lines.

import ShellCommands (ShellCommand, ShellCommandBuilder, (~&&))

waspCompletionEphemeralTest :: EphemeralTest
waspCompletionEphemeralTest =
Review comment (Member):

Oh, nice job adding tests for this!

then putStrLn "Skipping end-to-end tests on Windows due to tests using *nix-only commands"
else tests >>= defaultMain

-- TODO: Investigate automatically discovering the tests.
Review comment (Member):

I think it might be worth it to investigate tasty-discover in this PR, as we're adding a way longer list of tests.
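For reference, if I remember tasty-discover's setup correctly (worth double-checking against its docs), the wiring is roughly a driver file that is just a preprocessor pragma, with tests picked up by name prefix:

-- Driver.hs: if I recall correctly, the whole file is just this pragma, and
-- tasty-discover generates the main that collects the discovered tests.
{-# OPTIONS_GHC -F -pgmF tasty-discover #-}

-- In the test modules, functions are then discovered by prefix, e.g. (again,
-- from memory): test_... :: TestTree, spec_... :: Spec, prop_... for QuickCheck.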

Review comment (Member):

Okay, so I thought a bit about this, and I think I'd scrap the "EphemeralTest" name; it's really cumbersome and not a convention.

What I see here is that we're doing E2E tests, and some of those happen to use snapshots. That's it; I wouldn't explicitly differentiate between the two. In any case, I'd add a "Snapshot" suffix to the other ones and leave these ones without any specific name, if you want to mark those as a special case of the others.


Development: successfully merging this pull request may close the issue "Find a way to test the CLI and implement it".
