Skip to content

Latest commit

 

History

History
334 lines (270 loc) · 20.3 KB

README.md

File metadata and controls

334 lines (270 loc) · 20.3 KB

SLSA: Supply-chain Levels for Software Artifacts

NOTE: This site is best viewed at https://slsa.dev.

Supply-chain Levels for Software Artifacts (SLSA, pronounced salsa) is an end-to-end framework for ensuring the integrity of software artifacts throughout the software supply chain. The requirements are inspired by Google’s internal "Binary Authorization for Borg" that has been in use for the past 8+ years and that is mandatory for all of Google's production workloads.

The goal of SLSA is to improve the state of the industry, particularly open source, to defend against the most pressing integrity threats. With SLSA, consumers can make informed choices about the security posture of the software they consume.

IMPORTANT: SLSA is an evolving specification and we are looking for wide-ranging feedback via GitHub issues, email, or feedback form. SLSA is being developed as part of the OpenSSF Digital Identity WG.

Why SLSA?

Supply chain integrity attacks—unauthorized modifications to software packages—have been on the rise in the past two years, and are proving to be common and reliable attack vectors that affect all consumers of software. The software development and deployment supply chain is quite complicated, with numerous threats along the source ➞ build ➞ publish workflow. While point solutions do exist for some specific vulnerabilities, there is no comprehensive end-to-end framework that both defines how to mitigate threats across the software supply chain, and provides reasonable security guarantees. There is an urgent need for a solution in the face of the eye-opening, multi-billion dollar attacks in recent months (e.g. SolarWinds, CodeCov), some of which could have been prevented or made more difficult had such a framework been adopted by software developers and consumers.

SLSA addresses three issues:

  • Software producers want to secure their supply chains but don't know exactly how;
  • Software consumers want to understand and limit their exposure to supply chain attacks but have no means of doing so;
  • Artifact signatures alone prevent only a subset of the attacks we care about.

Supply Chain Threats

SLSA helps to protect against common supply chain attacks. The following image illustrates a typical software supply chain and includes examples of attacks that can occur at every link in the chain. Each type of attack has occured over the past several years and, unfortunately, is increasing as time goes on.

Supply Chain Threats

Many recent high-profile attacks were consequences of supply-chain integrity vulnerabilities, and could have been prevented by SLSA's framework. For example:

Threat Known example How SLSA could have helped
A Submit bad code to the source repository Linux hypocrite commits: Researcher attempted to intentionally introduce vulnerabilities into the Linux kernel via patches on the mailing list. Two-person review caught most, but not all, of the vulnerabilities.
B Compromise source control platform PHP: Attacker compromised PHP's self-hosted git server and injected two malicious commits. A better-protected source code platform would have been a much harder target for the attackers.
C Build with official process but from code not matching source control Webmin: Attacker modified the build infrastructure to use source files not matching source control. A SLSA-compliant build server would have produced provenance identifying the actual sources used, allowing consumers to detect such tampering.
D Compromise build platform SolarWinds: Attacker compromised the build platform and installed an implant that injected malicious behavior during each build. Higher SLSA levels require stronger security controls for the build platform, making it more difficult to compromise and gain persistence.
E Use bad dependency (i.e. A-H, recursively) event-stream: Attacker added an innocuous dependency and then later updated the dependency to add malicious behavior. The update did not match the code submitted to GitHub (i.e. attack F). Applying SLSA recursively to all dependencies would have prevented this particular vector, because the provenance would have indicated that it either wasn't built from a proper builder or that the source did not come from GitHub.
F Upload an artifact that was not built by the CI/CD system CodeCov: Attacker used leaked credentials to upload a malicious artifact to a GCS bucket, from which users download directly. Provenance of the artifact in the GCS bucket would have shown that the artifact was not built in the expected manner from the expected source repo.
G Compromise package repository Attacks on Package Mirrors: Researcher ran mirrors for several popular package repositories, which could have been used to serve malicious packages. Similar to above (F), provenance of the malicious artifacts would have shown that they were not built as expected or from the expected source repo.
H Trick consumer into using bad package Browserify typosquatting: Attacker uploaded a malicious package with a similar name as the original. SLSA does not directly address this threat, but provenance linking back to source control can enable and enhance other solutions.

A SLSA level gives consumers confidence that software has not been tampered with and can be securely traced back to source—something that is difficult, if not impossible, to do with most software today.

What is SLSA?

SLSA is a set of incrementally adoptable security guidelines, established by industry consensus. The standards set by SLSA are guiding principles for both software producers and consumers: producers can follow the guidelines to make their software more secure, and consumers can make decisions based on a software package's security posture. SLSA's four levels are designed to be incremental and actionable, and to protect against specific integrity attacks. SLSA 4 represents the ideal end state, and the lower levels represent milestones with corresponding integrity guarantees.

Terminology

SLSA's framework addresses every step of the software supply chain—the sequence of steps resulting in the creation of an artifact. We represent a supply chain as a directed acyclic graph of sources, builds, dependencies, and packages. One artifact's supply chain is a combination of its dependencies' supply chains plus its own sources and builds.

Software Supply Chain Model

Term Description Example
Artifact An immutable blob of data; primarily refers to software, but SLSA can be used for any artifact. A file, a git commit, a directory of files (serialized in some way), a container image, a firmware image.
Source Artifact that was directly authored or reviewed by persons, without modification. It is the beginning of the supply chain; we do not trace the provenance back any further. Git commit (source) hosted on GitHub (platform).
Build Process that transforms a set of input artifacts into a set of output artifacts. The inputs may be sources, dependencies, or ephemeral build outputs. .travis.yml (process) run by Travis CI (platform).
Package Artifact that is "published" for use by others. In the model, it is always the output of a build process, though that build process can be a no-op. Docker image (package) distributed on DockerHub (platform).
Dependency Artifact that is an input to a build process but that is not a source. In the model, it is always a package. Alpine package (package) distributed on Alpine Linux (platform).

Special cases:

  • A ZIP file is containing source code is a package, not a source, because it is built from some other source, such as a git commit.

SLSA Levels

There are four SLSA levels. SLSA 4 is the current highest level and represents the ideal end state. SLSA 1–3 offer lower security guarantees but are easier to meet. In our experience, achieving SLSA 4 can take many years and significant effort, so intermediate milestones are important. We also use the term SLSA 0 to mean the lack of a SLSA level.

SLSA 1 requires that the build process be fully scripted/automated and generate provenance. Provenance is metadata about how an artifact was built, including the build process, top-level source, and dependencies. Knowing the provenance allows software consumers to make risk-based security decisions. Though provenance at SLSA 1 does not protect against tampering, it offers a basic level of code source identification and may aid in vulnerability management.

SLSA 2 requires using version control and a hosted build service that generates authenticated provenance. These additional requirements give the consumer greater confidence in the origin of the software. At this level, the provenance prevents tampering to the extent that the build service is trusted. SLSA 2 also provides an easy upgrade path to SLSA 3.

SLSA 3 further requires that the source and build platforms meet specific standards to guarantee the auditability of the source and the integrity of the provenance, respectively. We envision an accreditation process whereby auditors certify that platforms meet the requirements, which consumers can then rely on. SLSA 3 provides much stronger protections against tampering than earlier levels by preventing specific classes of threats, such as cross-build contamination.

SLSA 4 is currently the highest level, requiring two-person review of all changes and a hermetic, reproducible build process. Two-person review is an industry best practice for catching mistakes and deterring bad behavior. Hermetic builds guarantee that the provenance's list of dependencies is complete. Reproducible builds, though not strictly required, provide many auditability and reliability benefits. Overall, SLSA 4 gives the consumer a high degree of confidence that the software has not been tampered with.

The SLSA level is not transitive (see explanation). It describes the integrity protections of an artifact's build process and top-level source, but nothing about the artifact's dependencies. Dependencies have their own SLSA ratings, and it is possible for a SLSA 4 artifact to be built from SLSA 0 dependencies.

Level requirements

The following is a summary. For details, click the links in the table for corresponding requirements.

Requirement SLSA 1 SLSA 2 SLSA 3 SLSA 4
Source - Version Controlled
Source - Verified History
Source - Retained Indefinitely 18 mo.
Source - Two-Person Reviewed
Build - Scripted Build
Build - Build Service
Build - Ephemeral Environment
Build - Isolated
Build - Parameterless
Build - Hermetic
Build - Reproducible
Provenance - Available
Provenance - Authenticated
Provenance - Service Generated
Provenance - Non-Falsifiable
Provenance - Dependencies Complete
Common - Security
Common - Access
Common - Superusers

○ = required unless there is a justification

Frequently Asked Questions

Q: Why is SLSA not transitive?

SLSA is not transitive in order to make the problem tractable. If SLSA 4 required dependencies to be SLSA 4, then reaching SLSA 4 would require starting at the very beginning of the supply chain and working forward. This is backwards, forcing us to work on the least risky component first and blocking any progress further downstream. By making each artifact's SLSA rating independent from one another, it allows parallel progress and prioritization based on risk. (This is a lesson we learned when deploying other security controls at scale throughout Google.) We expect SLSA ratings to be composed to describe a supply chain's overall security stance, as described in the case study vision.

Q: What about reproducible builds?

When talking about reproducible builds, there are two related but distinct concepts: "reproducible" and "verified reproducible."

"Reproducible" means that repeating the build with the same inputs results in bit-for-bit identical output. This property provides many benefits, including easier debugging, more confident cherry-pick releases, better build caching and storage efficiency, and accurate dependency tracking.

For these reasons, SLSA 4 requires reproducible builds unless there is a justification why the build cannot be made reproducible. Example justifications include profile-guided optimizations or code signing that invalidates hashes. Note that there is no actual reproduction, just a claim that reproduction is possible.

"Verified reproducible" means using two or more independent build systems to corroborate the provenance of a build. In this way, one can create an overall system that is more trustworthy than any of the individual components. This is often suggested as a solution to supply chain integrity. Indeed, this is one option to secure build steps of a supply chain. When designed correctly, such a system can satisfy all of the SLSA build requirements.

That said, verified reproducible builds are not a complete solution to supply chain integrity, nor are they practical in all cases:

  • Reproducible builds do not address source, dependency, or distribution threats.
  • Reproducers must truly be independent, lest they all be susceptible to the same attack. For example, if all rebuilders run the same pipeline software, and that software has a vulnerability that can be triggered by sending a build request, then an attacker can compromise all rebuilders, violating the assumption above.
  • Some builds cannot easily be made reproducible, as noted above.
  • Closed-source reproducible builds require the code owner to either grant source access to multiple independent rebuilders, which is unacceptable in many cases, or develop multiple, independent in-house rebuilders, which is likely prohibitively expensive.

Therefore, SLSA does not require verified reproducible builds directly. Instead, verified reproducible builds are one option for implementing the requirements.

For more on reproducibility, see Hermetic, Reproducible, or Verifiable?

Detailed example

For a motivating example and vision, see detailed example.

Related work

In parallel to the SLSA specification, there is work to develop core formats and data models. Currently this is joint work between Binary Authorization and in-toto but we invite wider participation.

  • Standard attestation format to express provenance and other attributes. This will allow sources and builders to express properties in a standard way that can be consumed by anyone. Also includes reference implementations for generating these attestations.
  • Policy data model and reference implementation.

For a broader view of the software supply chain problem:

Prior iterations of the ideas presented here:

Other related work:

Other takes on provenance and CI/CD: