Review CI platforms and build tools #235

@lukpueh

Description

We use a mix of platforms (Google Cloud Build and GitHub Actions) and tools (Docker, Bazel, shell and Python scripts) to manage the complex lind-wasm build process -- an overview is provided in #231.

This issue reviews some of the decisions behind these tools and is meant as a discussion starter for improvements.

Why use GitHub Actions (GHA)?
An obvious choice when using GitHub repos: large (3rd-party) action ecosystem, tightly integrated reporting, re-used access control... The main reason I recently set up a GHA workflow was that I could not see the build results/logs on GCB, even though I have full access to the GitHub repo, whereas on GHA I can. (#233, #234)

Why use Google Cloud Build (GCB) instead?
The main reason would be if you are otherwise bound/subscribed to Google Cloud infrastructure, which is not the case for us. We use GCB because we were hitting build-time quotas on GHA for our time-consuming toolchain build, most notably installing clang and building glibc, which takes ~45 minutes.

Why use GHA and GCB?
Assuming we need GCB because of quotas, does using GHA in addition add significant overhead? I would argue it doesn't: we need to manage the GitHub org anyway, and at least one GHA workflow too, namely the one that publishes docs to GitHub Pages (using GCB there doesn't make sense). Using GHA for other lightweight builds/tests in addition to GCB seems reasonable.

What about other tools (Docker, Bazel, shell and Python scripts)?
This is actually more relevant than whether we use both GHA and GCB or only one of them. Local build tools should provide a single source of truth for build commands, regardless of where they are invoked (GCB, GHA, local dev env). The choice and usage of local tools should also be streamlined with their caching abilities in mind.
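As a rough illustration of the "single source of truth" idea, both CI systems could invoke the same wrapper script instead of duplicating build commands in their configs. This is only a sketch; the script name (scripts/build_toolchain.sh) and file paths are hypothetical, not our actual layout:

```yaml
# GitHub Actions workflow (e.g. .github/workflows/build.yml) -- hypothetical sketch
on: [push]
jobs:
  toolchain-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Same entry point as GCB and local dev: one script owns the build commands
      - run: ./scripts/build_toolchain.sh
---
# Google Cloud Build config (e.g. cloudbuild.yaml) -- hypothetical sketch
steps:
  - name: ubuntu
    entrypoint: bash
    # Calls the identical script, so CI configs stay thin
    args: ['./scripts/build_toolchain.sh']
```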

What about caching?
Heavyweight build steps (installing clang and building glibc) sit early in the pipeline, and the related sources rarely change. An effective caching strategy should significantly speed up builds, improve the developer experience, and might even allow us to stay within the free GHA quota (removing the need for GCB).

For caching we have several options, as each of the tools we currently use has built-in caching:

  • Bazel: Actually chosen for its caching support, but caching is currently disabled. Setting up meaningful caching on ephemeral runners (GCB or GHA) doesn't seem trivial.
  • Docker: The Dockerfile could be optimized to cache individual layers. We could also enable remote caching for use in CI.
  • Custom: The main caching we currently use is baked into GCB. This works nicely, but doesn't scale, and I also want to cache my local builds.
  • GHA: On a smaller scale, we can also enable caching of e.g. Rust dependencies inside GHA (see #234, and the sketch below).
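For the GHA option, Rust-dependency caching could look roughly like the following, using the stock actions/cache action. Cache paths, keys, and the job name are illustrative, not our actual config:

```yaml
# Hypothetical GHA job caching cargo downloads and build artifacts between runs
on: [pull_request]
jobs:
  rust-checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          # Reuse downloaded crates and compiled dependencies across runs
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: cargo-${{ runner.os }}-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: cargo-${{ runner.os }}-
      # Lightweight checks that benefit from the warm cache
      - run: cargo fmt -- --check
      - run: cargo clippy --all-targets -- -D warnings
      - run: cargo test
```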

Recommendations

  • Use GHA for lightweight CI jobs (docs, cargo {fmt, clippy, test}) with GHA caching where available
  • Continue using GCB for end-to-end testing (install clang, build glibc and wasmtime, run "unit tests")
    • Bonus: consider streamlining reporting. If we dump test results to the build log and make it accessible to contributors, we might not need all the ci-response-bot infrastructure.
  • Review "local" build tooling (docker, bazel) and caching options
    • Is bazel a good choice given the engineering resources on the project? (heavy-weight, steep learning curve)
    • Can we use docker with multistage builds to get similar results? Would it fit our dev/build workflow? (See the caching sketch after this list.)
    • Other options?
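On the Docker question, one possible shape of remote layer caching in CI is Docker's buildx cache backend for GHA, sketched below. The image tag and Dockerfile path are placeholders, and this assumes the heavy steps (clang install, glibc build) live in early layers or stages so they can be reused across runs:

```yaml
# Hypothetical workflow: build the toolchain image with layer caching in CI
on: [pull_request]
jobs:
  docker-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile              # placeholder path
          push: false
          tags: lind-wasm/toolchain:ci    # placeholder tag
          # Reuse previously built layers (e.g. clang/glibc stages) across runs
          cache-from: type=gha
          cache-to: type=gha,mode=max
```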
