@vinit-chauhan

elastic-package is used extensively for integration-related tasks. However, most of its commands currently operate on one package at a time, so there is no way to repeat an operation across multiple integrations.

This pull request adds two subcommands to elastic-package to support bulk operations:

  1. filter
  2. foreach

Note: Both commands are expected to be run from within the integrations repository.

Filter subcommand

The filter command returns a list of integrations that match the specified criteria.

Available Filters:

  • Category
  • Codeowner
  • Input
  • Integration Type (e.g., integration, input)
  • Package Name (supports glob patterns)
  • Spec Version

You can chain multiple filters, and each filter accepts multiple comma-separated values.
Matching:
All filters must match (AND across filters).
Within a filter, at least one of the values must match (OR across values).
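
The matching rule (every filter must match, and within a filter at least one value must match) can be sketched roughly as follows. The `pkg` and `filter` types are hypothetical stand-ins for illustration and are not the types used in this PR.

```go
package main

import (
	"fmt"
	"slices"
)

// pkg is a hypothetical, trimmed-down stand-in for a package manifest.
type pkg struct {
	Name       string
	Inputs     []string
	Categories []string
}

// filter holds the values passed to one flag and a function that reads the
// corresponding field from a package.
type filter struct {
	values []string
	field  func(pkg) []string
}

// matches reports whether at least one flag value appears in the field (OR).
func (f filter) matches(p pkg) bool {
	for _, v := range f.values {
		if slices.Contains(f.field(p), v) {
			return true
		}
	}
	return false
}

// apply keeps the packages for which every filter matches (AND).
func apply(pkgs []pkg, filters []filter) []pkg {
	var out []pkg
	for _, p := range pkgs {
		keep := true
		for _, f := range filters {
			if !f.matches(p) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pkgs := []pkg{
		{Name: "a", Inputs: []string{"tcp"}, Categories: []string{"network"}},
		{Name: "b", Inputs: []string{"udp"}, Categories: []string{"security"}},
		{Name: "c", Inputs: []string{"httpjson"}, Categories: []string{"network"}},
	}
	filters := []filter{
		{values: []string{"tcp", "udp"}, field: func(p pkg) []string { return p.Inputs }},
		{values: []string{"network"}, field: func(p pkg) []string { return p.Categories }},
	}
	for _, p := range apply(pkgs, filters) {
		fmt.Println(p.Name)
	}
}
```

Here only package "a" is kept: "b" fails the category filter and "c" fails the input filter.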

Currently, the filter runs sequentially: it reads all the manifest files at once and keeps them in memory. We could update the code to use a buffered channel with a producer and a consumer to reduce the memory footprint. Additionally, we could use a worker pool to spread out the filtering.
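
A rough sketch of the producer/consumer idea, assuming a hypothetical `manifest` type and `load` function (neither is part of this PR): a producer goroutine loads manifests one at a time and sends them through a buffered channel, while the consumer filters them as they arrive, so only a bounded number of manifests is queued at any moment.

```go
package main

import "fmt"

// manifest is a hypothetical, trimmed-down manifest used for illustration.
type manifest struct {
	Dir   string
	Input string
}

// filterStream loads manifests in a producer goroutine and filters them in the
// caller. The channel buffer bounds how many loaded manifests wait in the queue.
func filterStream(dirs []string, buffer int, load func(dir string) manifest, keep func(manifest) bool) []string {
	ch := make(chan manifest, buffer)
	go func() {
		defer close(ch) // signals the consumer that no more manifests will arrive
		for _, dir := range dirs {
			ch <- load(dir)
		}
	}()

	var matched []string
	for m := range ch {
		if keep(m) {
			matched = append(matched, m.Dir)
		}
	}
	return matched
}

func main() {
	// Fake loader standing in for reading a manifest file from disk.
	load := func(dir string) manifest {
		if dir == "pkg_b" {
			return manifest{Dir: dir, Input: "udp"}
		}
		return manifest{Dir: dir, Input: "httpjson"}
	}
	keep := func(m manifest) bool { return m.Input == "udp" }
	fmt.Println(filterStream([]string{"pkg_a", "pkg_b", "pkg_c"}, 2, load, keep))
}
```

Only the matching directory names are retained, which is where the memory saving would come from compared with holding every manifest in a slice.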

Neither elastic-package nor the package spec currently requires the package name and the directory name in the repo to be the same, so some integrations have a package name that differs from their directory name.

By default, the filter command returns the name of the directory in the integrations repo. The --output-package-name / -p flag makes it return the package name instead.

If no filter flag is provided, the command will return a list of all the integrations.

elastic-package filter --input tcp,udp --code-owner elastic/integration-experience --package-name cisco_* 

Foreach subcommand

The foreach command leverages the filter registry, so all the flags available in filter are directly available to foreach without any code changes.

Additionally, foreach has one flag of its own, --parallel, which runs the command on several packages concurrently using a worker pool.

The default is 1 (runs sequentially).
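
As an illustration of the worker-pool behaviour behind --parallel, here is a minimal sketch. The names `runAll` and `errSimulated` are hypothetical, and the per-package work is passed in as a function rather than being a real elastic-package invocation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errSimulated is a placeholder failure used by the example below.
var errSimulated = errors.New("simulated failure")

// runAll calls run once per package directory using at most `parallel`
// workers and returns the errors keyed by directory.
func runAll(dirs []string, parallel int, run func(dir string) error) map[string]error {
	if parallel < 1 {
		parallel = 1 // same as the default: sequential execution
	}
	jobs := make(chan string)
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		failed = make(map[string]error)
	)
	for i := 0; i < parallel; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for dir := range jobs {
				if err := run(dir); err != nil {
					mu.Lock()
					failed[dir] = err
					mu.Unlock()
				}
			}
		}()
	}
	for _, dir := range dirs {
		jobs <- dir
	}
	close(jobs)
	wg.Wait()
	return failed
}

func main() {
	// Directory names are made up for the example.
	dirs := []string{"pkg_a", "pkg_b", "pkg_c"}
	failed := runAll(dirs, 2, func(dir string) error {
		if dir == "pkg_b" {
			return errSimulated
		}
		return nil
	})
	fmt.Println(len(failed), "package(s) failed")
}
```

With a pool size of 1 the single worker processes the directories in order, which matches the sequential default.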

The elastic-package command you want to run goes after -- with all of its flags.

Note: Only the following elastic-package subcommands are allowed: ["build","check","clean","format","install","lint","test","uninstall"] (cmd/foreach.go)

elastic-package foreach --input tcp,udp --code-owner elastic/integration-experience --parallel 5 -- test pipeline --generate
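
The allowlist check on whatever follows `--` could look roughly like this. The list is the one quoted in the note above; the function name `validateSubcommand` and the error messages are assumptions for illustration, not the code in cmd/foreach.go.

```go
package main

import (
	"fmt"
	"slices"
)

// allowedSubcommands mirrors the list quoted in the PR description.
var allowedSubcommands = []string{
	"build", "check", "clean", "format", "install", "lint", "test", "uninstall",
}

// validateSubcommand checks that the first argument after "--" is allowed.
func validateSubcommand(args []string) error {
	if len(args) == 0 {
		return fmt.Errorf("no subcommand given after --")
	}
	if !slices.Contains(allowedSubcommands, args[0]) {
		return fmt.Errorf("subcommand %q is not allowed in foreach", args[0])
	}
	return nil
}

func main() {
	fmt.Println(validateSubcommand([]string{"test", "pipeline", "--generate"}))
	fmt.Println(validateSubcommand([]string{"stack", "up"}))
}
```

Only the first argument is checked here; nested subcommands and their flags are passed through untouched.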

File changes:

internal/packages/packages.go: Added function to find the integrations repo root dir and read all manifests.
cmd/filter.go: Filter command implementation
cmd/foreach.go: Foreach command implementation
internal/filter/*: Filter interface and implementation for each filter flag.

Related Issues

AI Tools used

Cursor With Claude-4.5-Sonnet

@vinit-chauhan marked this pull request as ready for review October 27, 2025 15:40
@vinit-chauhan requested a review from a team as a code owner October 27, 2025 15:40

func executeCommand(args []string, path string) error {
	// Look up the elastic-package binary in PATH
	execPath, err := exec.LookPath("elastic-package")
Contributor

Since this command runs on top of other commands of this same binary, could we call those commands directly through their functions instead of looking up the binary in PATH?
What if the installed binary is outdated, or a specific version needs to be run? Just sharing some thoughts on a better approach that doesn't depend on the binary being installed as its own dependency.

Member

I think it should be possible to do something like this:

cmd := cmd.RootCmd()
cmd.SetArgs(args)
return cmd.Execute()

We could also create our own command that is like cmd.RootCmd(), but where we only add the commands with AddCommand that we allow.

Dir: path,
Stdout: io.Discard,
Stderr: os.Stderr,
Env: os.Environ(),
Contributor

When running the command, could we pass only the environment variables that are strictly needed instead of forwarding all the system ones?

Author

This crossed my mind at first, but I did not go through with it.
The reason I chose to pass all the env variables is that elastic-package already has access to all of them. Additionally, it saves us from future bugs where we add a capability that relies on a specific env var and forget to add it to a list in another part of the codebase.

Let me know if we want to have a list of env variables to limit it.
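
For reference, if a restricted environment were preferred, a sketch could look like the following. The function name `filterEnv`, the variable names, and the `ELASTIC_PACKAGE_` prefix used in `main` are illustrative assumptions, not a vetted list.

```go
package main

import (
	"fmt"
	"os"
	"slices"
	"strings"
)

// filterEnv returns the KEY=VALUE entries of environ whose key is listed in
// names or starts with one of prefixes. Entries without "=" are dropped.
func filterEnv(environ, names, prefixes []string) []string {
	var out []string
	for _, kv := range environ {
		key, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue
		}
		keep := slices.Contains(names, key)
		for _, p := range prefixes {
			keep = keep || strings.HasPrefix(key, p)
		}
		if keep {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	// The names and prefix below are assumptions made for this example.
	env := filterEnv(os.Environ(), []string{"PATH", "HOME"}, []string{"ELASTIC_PACKAGE_"})
	fmt.Println(len(env), "variables would be passed to the child process")
}
```

The result could be assigned to the Env field of the exec.Cmd. The trade-off is the one described above: any new variable a subcommand depends on would have to be added to the list.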


// MustFindIntegrationRoot finds and returns the path to the root folder of a package from the working directory.
// It fails with an error if the package root can't be found.
func MustFindIntegrationRoot() (string, error) {
Contributor

I think all this code related to detect the repository root can be reused from https://github.com/vinit-chauhan/elastic-package/blob/caaff9abcc9abb60ec525e40eab441ffd99b7828/internal/files/repository.go#L16 wdyt?

There the "integrations root" is considered the git repository of the integrations repository.
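
A minimal sketch of treating the enclosing git repository as the root: walk up from a starting directory until a `.git` entry is found. The names `findRepoRoot` and `errNoRepo` are hypothetical, and this is not a copy of the helper linked above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// errNoRepo is returned when no parent directory contains a .git entry.
var errNoRepo = errors.New("not inside a git repository")

// findRepoRoot walks up from dir until it finds a directory containing ".git".
// os.Stat succeeds for both a .git directory and a .git file (as used by
// worktrees and submodules), so either one marks the root in this sketch.
func findRepoRoot(dir string) (string, error) {
	dir, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	for {
		if _, err := os.Stat(filepath.Join(dir, ".git")); err == nil {
			return dir, nil
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			// Reached the filesystem root without finding a repository.
			return "", errNoRepo
		}
		dir = parent
	}
}

func main() {
	root, err := findRepoRoot(".")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("repository root:", root)
}
```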

(m.Type == dataStreamTypeLogs || m.Type == dataStreamTypeMetrics || m.Type == dataStreamTypeSynthetics || m.Type == dataStreamTypeTraces),
nil
}
func isIntegrationRepo(path string) (bool, error) {
Contributor

Same as above, I think we can use the git repository as the reference instead of the go.mod file. @jsoriano wdyt?

Member

Agree.

Member

@jsoriano left a comment

Thanks for working on this! This will be helpful. Added some questions and suggestions.

Comment on lines 145 to 146
FilterIntegrationTypeFlagName = "integration-type"
FilterIntegrationTypeFlagDescription = "integration types to filter by (comma-separated values)"
Member

An integration is one type of package, so this should probably be called package type.

Suggested change
-FilterIntegrationTypeFlagName = "integration-type"
-FilterIntegrationTypeFlagDescription = "integration types to filter by (comma-separated values)"
+FilterPackageTypeFlagName = "package-type"
+FilterPackageTypeFlagDescription = "package types to filter by (comma-separated values)"

Comment on lines 158 to 159
FilterPackagesFlagName = "package-name"
FilterPackagesFlagDescription = "package names to filter by (comma-separated values)"
Member

Maybe we can call this flag directly packages, or package-path to avoid confusion between package paths and their names.

FilterSpecVersionFlagDescription = "Package spec version to filter by (semver)"

ForeachPoolSizeFlagName = "parallel"
ForeachPoolSizeFlagDescription = "number of packages to execute in parallel (defaults to serial execution)"
Member

Suggested change
-ForeachPoolSizeFlagDescription = "number of packages to execute in parallel (defaults to serial execution)"
+ForeachPoolSizeFlagDescription = "Number of subcommands to execute in parallel (defaults to serial execution)"


func (f *PackageNameFlag) Matches(dirName string, manifest *packages.PackageManifest) bool {
	for _, pattern := range f.patterns {
		if pattern.Match(dirName) {
Member

Yeah, here it is confusing to have a filter package name, that is actually matching paths.


// ReadAllPackageManifests reads all the package manifests in the given root directory.
func ReadAllPackageManifests(root string) ([]PackageDirNameAndManifest, error) {
	files, err := filepath.Glob(filepath.Join(root, "packages", "*", PackageManifestFile))
Member

@jsoriano Oct 28, 2025

The packages directory is a convention in the integrations repository, but is not formalized. Actually it is intended that developers can place their packages wherever they want. For example in package-spec and elastic-package repositories there are packages under ./test/packages, and it would be nice to be able to use this tool there too 🙂

The idea of the --packages flag proposed in #2327 was to allow to customize this, even if the default is still ./packages.

In #2327 I also proposed an --auto flag, that was intended to look for packages in a repository, by looking for manifest files and so on. There is no need to implement it, just in case.
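
To make the discovery idea concrete, here is a hedged sketch that looks for packages anywhere in a tree by searching for manifest files. It assumes the manifest is named `manifest.yml` and uses an in-memory file system so the example is self-contained. The names `discoverPackages` and `sampleTree` are hypothetical and this is not the implementation proposed in #2327.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// manifestFile is the manifest name assumed for this sketch.
const manifestFile = "manifest.yml"

// discoverPackages returns every directory in fsys that contains a manifest
// file. It does not descend into a package once found, so nested manifests
// (for example those of data streams) are not reported as packages.
func discoverPackages(fsys fs.FS) ([]string, error) {
	var roots []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			return nil
		}
		if d.Name() == ".git" {
			return fs.SkipDir
		}
		if _, statErr := fs.Stat(fsys, path.Join(p, manifestFile)); statErr == nil {
			roots = append(roots, p)
			return fs.SkipDir
		}
		return nil
	})
	return roots, err
}

// sampleTree is an in-memory layout used only to exercise the sketch.
var sampleTree = fstest.MapFS{
	"README.md":                                  {},
	"packages/foo/manifest.yml":                  {},
	"packages/foo/data_stream/log/manifest.yml":  {},
	"test/packages/bar/manifest.yml":             {},
}

func main() {
	roots, err := discoverPackages(sampleTree)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range roots {
		fmt.Println(r)
	}
}
```

For a real checkout, `os.DirFS(root)` could be passed instead of the in-memory tree. A depth limit or a list of excluded directories would fit naturally into the same walk callback.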

Author

Hmm, that makes sense. I overlooked the fact that we allow integrations in places other than the integrations repo.

I'll update the code to allow different paths in --packages.

As for --auto: if no filter is provided, it will return all packages, and foreach would perform the action on all of them.

Member

Nit. Underscores in Go file or package names should be avoided when possible, especially in file names, where some suffixes have meaning, such as _test.go or _linux.go.

Author

Yeah, when I added _flag it felt weird to me as well, but I did it to make the flag files distinct. I'll remove that.

Path: execPath,
Args: append([]string{execPath}, args...),
Dir: path,
Stdout: io.Discard,
Member

Why discard stdout?

Comment on lines 47 to 65
func (f *FilterFlagBase) Name() string {
	return f.name
}

func (f *FilterFlagBase) Description() string {
	return f.description
}

func (f *FilterFlagBase) Shorthand() string {
	return f.shorthand
}

func (f *FilterFlagBase) DefaultValue() string {
	return f.defaultValue
}

func (f *FilterFlagBase) Register(cmd *cobra.Command) {
	cmd.Flags().StringP(f.Name(), f.Shorthand(), f.DefaultValue(), f.Description())
}
Member

Nit. No need to define all these functions as this is only used here, right?

Suggested change
-func (f *FilterFlagBase) Name() string {
-	return f.name
-}
-func (f *FilterFlagBase) Description() string {
-	return f.description
-}
-func (f *FilterFlagBase) Shorthand() string {
-	return f.shorthand
-}
-func (f *FilterFlagBase) DefaultValue() string {
-	return f.defaultValue
-}
-func (f *FilterFlagBase) Register(cmd *cobra.Command) {
-	cmd.Flags().StringP(f.Name(), f.Shorthand(), f.DefaultValue(), f.Description())
-}
+func (f *FilterFlagBase) Register(cmd *cobra.Command) {
+	cmd.Flags().StringP(f.name, f.shorthand, f.defaultValue, f.description)
+}

Comment on lines 13 to 35
// FilterFlag defines the basic interface for filter flags.
type FilterFlag interface {
	Name() string
	Description() string
	Shorthand() string
	DefaultValue() string

	Register(cmd *cobra.Command)
	IsApplied() bool
}

// Filter extends FilterFlag with filtering capabilities.
// It defines the interface for filtering packages based on specific criteria.
type Filter interface {
	FilterFlag

	Parse(cmd *cobra.Command) error
	Validate() error
	ApplyTo(pkgs []packages.PackageDirNameAndManifest) ([]packages.PackageDirNameAndManifest, error)
	// Matches checks if a package matches the filter criteria.
	// dirName is the directory name of the package in package root.
	Matches(dirName string, manifest *packages.PackageManifest) bool
}
Member

Nit. Is the FilterFlag interface ever used? I think these interfaces can be reduced to one, especially if we remove the methods to access attributes.

Suggested change
-// FilterFlag defines the basic interface for filter flags.
-type FilterFlag interface {
-	Name() string
-	Description() string
-	Shorthand() string
-	DefaultValue() string
-	Register(cmd *cobra.Command)
-	IsApplied() bool
-}
-// Filter extends FilterFlag with filtering capabilities.
-// It defines the interface for filtering packages based on specific criteria.
-type Filter interface {
-	FilterFlag
-	Parse(cmd *cobra.Command) error
-	Validate() error
-	ApplyTo(pkgs []packages.PackageDirNameAndManifest) ([]packages.PackageDirNameAndManifest, error)
-	// Matches checks if a package matches the filter criteria.
-	// dirName is the directory name of the package in package root.
-	Matches(dirName string, manifest *packages.PackageManifest) bool
-}
+// Filter defines the interface for filtering packages based on specific criteria.
+type Filter interface {
+	Register(cmd *cobra.Command)
+	IsApplied() bool
+	Parse(cmd *cobra.Command) error
+	Validate() error
+	ApplyTo(pkgs []packages.PackageDirNameAndManifest) ([]packages.PackageDirNameAndManifest, error)
+	// Matches checks if a package matches the filter criteria.
+	// dirName is the directory name of the package in package root.
+	Matches(dirName string, manifest *packages.PackageManifest) bool
+}

Comment on lines 13 to 27
// splitAndTrim splits a string by delimiter and trims whitespace from each element
func splitAndTrim(s, delimiter string) map[string]struct{} {
	if s == "" {
		return nil
	}
	parts := strings.Split(s, delimiter)
	result := make(map[string]struct{}, len(parts))
	for _, part := range parts {
		trimmed := strings.TrimSpace(part)
		if trimmed != "" {
			result[trimmed] = struct{}{}
		}
	}
	return result
}
Member

Nit. No need to use a map here; looking at this function and its description, one would expect a list of strings as the result.

Suggested change
-// splitAndTrim splits a string by delimiter and trims whitespace from each element
-func splitAndTrim(s, delimiter string) map[string]struct{} {
-	if s == "" {
-		return nil
-	}
-	parts := strings.Split(s, delimiter)
-	result := make(map[string]struct{}, len(parts))
-	for _, part := range parts {
-		trimmed := strings.TrimSpace(part)
-		if trimmed != "" {
-			result[trimmed] = struct{}{}
-		}
-	}
-	return result
-}
+// splitAndTrim splits a string by delimiter and trims whitespace from each element
+func splitAndTrim(s, delimiter string) []string {
+	if s == "" {
+		return nil
+	}
+	parts := strings.Split(s, delimiter)
+	var result []string
+	for _, part := range parts {
+		trimmed := strings.TrimSpace(part)
+		if trimmed != "" && !slices.Contains(result, trimmed) {
+			result = append(result, trimmed)
+		}
+	}
+	return result
+}

@vinit-chauhan self-assigned this Oct 30, 2025
@vinit-chauhan marked this pull request as draft October 31, 2025 04:31
@elasticmachine
Collaborator

💚 Build Succeeded

History

cc @vinit-chauhan

- Updated --packages to use only the package name; added another filter to filter by package dirs.
- Added auto-discovery of packages with a configurable depth.
- Added a flag to exclude dirs from the filter process.
Development

Successfully merging this pull request may close these issues.

Add subcommand for bulk actions

4 participants