Skip to content

Conversation

escapefreeg
Copy link

@escapefreeg escapefreeg commented Sep 4, 2025

The current code uses Go’s built-in gzip library to decompress gzip-formatted images. However, it was found that decompression using Go’s gzip library is slow (especially for large images), while using command-line decompression tools can improve the decompression speed. In the below 10 examples, using decompression commands achieved at least 32% speedup.
image
The containerd community has also discussed and implemented related optimizations, see: containerd/containerd#2640.

Although containerd uses the igzip command for decompression by default (see: https://github.com/containerd/containerd/blob/02eb68472ea11518c419d8866a3a309826bbb3ab/pkg/archive/compression/compression.go#L270 ), above results show that pigz delivers the best performance across most test cases. Therefore, the current priority for invoking decompression commands is pigz > igzip > gzip.

@escapefreeg escapefreeg force-pushed the main branch 4 times, most recently from 8eeada7 to c820467 Compare September 5, 2025 02:22
Copy link
Member

@ChengyuZhu6 ChengyuZhu6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments.

Comment on lines 91 to 117
func findCmdPath(cmd string) string {
return findCmdPathWithLookuper(cmd, exec.LookPath)
}

func findCmdPathWithLookuper(cmd string, lookuper lookuper) string {
path, err := lookuper(cmd)
if err != nil {
return ""
}

// Check if cmd disabled via env variable
value := os.Getenv(disablePrefix + strings.ToUpper(cmd))
if value == "" {
return path
}

disable, err := strconv.ParseBool(value)
if err != nil {
return path
}

if disable {
return ""
}

return path
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend to merge findCmdPathWithLookuper and findCmdPath into one function named findCmdPath.

Copy link
Author

@escapefreeg escapefreeg Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. Since I want to test the findCmdPath function in estargz/decompressutil_test.go without being affected by the local environment, I introduced the findCmdPathWithLookuper function. By providing a custom mockLookuperFunc, we can simulate locating a command’s path, making the tests deterministic and environment-independent.

Comment on lines 115 to 117
if gzipPath == "" && pigzPath == "" && igzipPath == "" {
t.Skip("Skipping test: gzip, pigz or igzip commands not available in environment")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the test always skipped? I think we should preinstall pigz or igzip in the CI test image.

RUN apt-get update -y && apt-get --no-install-recommends install -y fuse3

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

estargz/build.go Outdated
if bytes.Equal([]byte{0x1F, 0x8B, 0x08}, src[:3]) {
// gzip
// Attempt decompression using commands in priority order: pigz > igzip > gzip
cmdPath := getDecompressCmdPath()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This library doesn't intend to implicitly execute commands. This should be an optional behaviour and the option should be just a callback function with a type smtg like func(io.Reader) (io.Reader, error) with an implementation outside of the estargz module . Also, you can add a flag for ctr-remote like --decompression-helper for easy testing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestion. I have made changes according to your advice. Could you please check if there are any other issues?

@escapefreeg escapefreeg force-pushed the main branch 2 times, most recently from 291d69d to 204d6a5 Compare September 9, 2025 06:39
Usage: "convert to esgz without changing diffID (cannot be used in conjunction with '--estargz-record-in'. must be specified with '--estargz-external-toc')",
},
&cli.StringFlag{
Name: "estargz-gzip-helper",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New options have been added, please update the ctr-remote.md

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"bytes"
"compress/gzip"
"fmt"
"github.com/containerd/stargz-snapshotter/estargz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you fix the linter error?

File is not properly formatted (goimports)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


readCloser, writer := io.Pipe()
cmd.Stdin = in
cmd.Stdout = writer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use StdoutPipe? https://pkg.go.dev/os/exec#Cmd.StdoutPipe

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried this approach, but when an error occurs in the goroutine that runs cmd.Wait, the io.ReadCloser returned by cmd.StdoutPipe cannot notify the decompression reader side that the data has errored, which causes the program to behave abnormally. If don’t use a pipe, I might need to implement a custom ReadCloser wrapper or use context to handle the write and error logic, which could increase code complexity.

In addition, I found that cmd.StdoutPipe internally also uses a pipe, as shown here: https://cs.opensource.google/go/go/+/refs/tags/go1.25.1:src/os/exec/exec.go;drc=ebee011a54f9310099d02a7e7731330539db16cf;l=1074.

Therefore, I have not made changes to the related code for now. If there is a better implementation, please let me know.

@escapefreeg
Copy link
Author

I've reconsidered the functionality of our --estargz-gzip-helper. Since we provide an option for users to choose the command to use, there's actually no need for us to set environment variables to disable related commands. Keeping that function would cause code redundancy, so I've removed the environment variable–related logic in the new commits.

In addition, to make code review easier, I've split the changes made during this PR into separate commits. Once the review is complete, I will rebase to tidy up the commits.

The current pipeline test failures seem unrelated to the code, which appears to be fine. Please let me know if you have any other suggestions.

}
if !clicontext.Bool("estargz-external-toc") {
// copy esgzBaseOpts to esgzOpts to avoid data race
esgzOpts := make([]estargz.Option, len(esgzBaseOpts)+1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coud you elaborate this? Is this accessed from multiple goroutines?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callback function returned by estargzconvert.LayerConvertWithLayerAndCommonOptsFunc captures the slice esgzOpts. When different layers in an image invoke the conversion function f concurrently, the operation inside fLayerConvertFunc(append(commonOpts, opts[desc.Digest]...)...)(ctx, cs, desc) will append to the captured esgzOpts simultaneously, which can lead to a data race.

If we don’t make a copy, the check for the estargz-gzip-helper flag may appends option to esgzBaseOpts, causing its len and cap  properties to become unequal. whenLayerConvertFunc(append(commonOpts, opts[desc.Digest]...)...)(ctx, cs, desc)executes, concurrent layers may append to the same underlying slice, resulting in a data race.

If we perform a copy, the new esgzOpts always has its len property equal to its cap property. In that case, the append in LayerConvertFunc(append(commonOpts, opts[desc.Digest]...)...)(ctx, cs, desc) will always allocate a new slice due to insufficient capacity, so each layer gets its own private slice and no data race occurs.

The use of copy in esgzexternaltocconvert.LayerConvertWithLayerAndCommonOptsFunc follows the same rationale.

The original code has a similar phenomenon. For example, estargzconvert.LayerConvertWithLayerAndCommonOptsFunc constructs a temporary estargz.Option slice on the fly, ensuring that its len property is equal to its cap property. However, this is a subtle issue that could cause problems in future development. When I submitted the commits, I didn’t fully consider this and went with the current approach. A better solution would be to copy the estargz.Option slice inside both estargzconvert.LayerConvertWithLayerAndCommonOptsFunc and esgzexternaltocconvert.LayerConvertWithLayerAndCommonOptsFunc to completely eliminate the data race. I’ll submit a PR to address this.

@ktock
Copy link
Member

ktock commented Sep 18, 2025

Please squash commits.

@escapefreeg
Copy link
Author

escapefreeg commented Sep 18, 2025

Please squash commits.

Okay, Would it be better if I rebase my commits after this PR #2133 is merged?

@escapefreeg
Copy link
Author

Please squash commits.

I’ve tidied up the commits. Could you take a look and see if everything’s okay? Also, could you help review PR #2133?

Copy link
Member

@ktock ktock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@ktock ktock merged commit 9f5e859 into containerd:main Sep 29, 2025
44 checks passed
When you run `ctr-remote image optimize`, this runs the source image (`ghcr.io/stargz-containers/golang:1.15.3-buster-org`) as a container and profiles all file accesses during the execution.
Then these accessed files are marked as "prioritized" files and will be prefetched on runtime.

You can specify the `--estargz-gzip-helper` flag to specify a command-line helper tool for gzip decompression during optimization. Using command-line tools can speed up the decompression of the original image, thereby accelerating the overall optimization process. Even using the gzip command corresponding to the Go gzip library can achieve approximately 32% speed improvement. For more details, see: [Using decompression commands to improve the layer decompression speed of gzip-formatted images](https://github.com/containerd/stargz-snapshotter/pull/2117). Currently, `--estargz-gzip-helper` only supports `pigz`, `igzip`, and `gzip`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still work with lazy pulling?

Copy link
Author

@escapefreeg escapefreeg Oct 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the --estargz-gzip-helper is only used to decompress tar.gz-encoded traditional overlayfs images. It does not change the subsequent image conversion/optimization process, and therefore does not affect the lazy pulling of estargz format images. I also converted/optimized the same image using the code before and after the modification. The manifest and overall hash of the converted/optimized image remained identical in both cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants