Skip to content

Upgrade mops setpackage #459

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 45 commits into
base: master
Choose a base branch
from

Conversation

hdwhdw
Copy link
Contributor

@hdwhdw hdwhdw commented Jul 24, 2025

Why I did it

How I did it

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

hdwhdw and others added 30 commits July 8, 2025 12:58
This commit establishes the foundation for the SONiC upgrade service with:

- Core gRPC server with TLS support and reflection
- SystemInfo service with platform detection and disk space monitoring
- Protocol buffer definitions for all services
- Build system with comprehensive Makefile and tool management
- Security hardening with golangci-lint and TLS configuration
- Debian packaging support for production deployment
- Testing framework with e2e test structure
- Container compatibility with path resolution
- CI pipeline setup and coverage reporting
- Development tooling and Docker support

The infrastructure provides a solid foundation for adding feature-specific
functionality in subsequent branches while maintaining security and
code quality standards.
This commit adds comprehensive project documentation:

- README.md: Complete getting started guide and API overview
- ARCHITECTURE.md: System design and component architecture
- TLS.md: TLS configuration and security setup
- cmd/README.md: Command-line tools documentation
- internal/README.md: Internal packages overview
- pkg/README.md: Public server packages documentation
- cmd/test/diskspace/: Disk space analysis test utility

This documentation provides the foundation for understanding and
contributing to the SONiC upgrade service project.
- Complete gRPC API documentation for SystemInfo and FirmwareManagement services
- Request/response message specifications with protobuf definitions
- Usage examples with grpcurl commands
- Error handling and status code reference
- Configuration and deployment guidelines
- Development and testing instructions
- Change validate-coverage to test-coverage in ci target
- Coverage is still reported but no longer blocks CI
- Allows all branches to pass CI regardless of coverage percentage
- Remove vendor and model fields from GetPlatformTypeResponse
- Update GetPlatformIdentifierString to return only platform identifier
- Update all tests to reflect simplified API
- Regenerate protobuf files
- All tests pass and CI is green
- Completely rewrite hostinfo package to be minimal
- Remove all vendor/model parsing and complex logic
- Extract platform from machine.conf using simple field priority
- PlatformInfo now only contains ConfigMap and Platform fields
- GetPlatformIdentifierString simply returns the platform string
- Simplified all tests to match the minimal implementation
- Returns raw platform strings like 'x86_64-mlnx_msn4600c-r0'
- Removed ~400 lines of complex vendor/model extraction code
- All tests pass, CI is green, coverage maintained
- Default TLS to disabled for easier development/testing
- Add --enable-tls flag to optionally enable TLS
- Pass DISABLE_TLS environment variable to container
- Show TLS status in deployment completion message
- Pass -rootfs=/host to opsd-server command
- This fixes 'machine.conf not found' error in SONiC containers
- The /host mount point contains the actual host filesystem
- Go flags use single dash, not double dash
- Changed --addr to -addr and --shutdown-timeout to -shutdown-timeout
- This fixes the rootfs parameter not being parsed correctly
- Changed from /host/machine.conf to /machine.conf
- The rootfs path is applied by paths.ToHost function
- This fixes platform detection when rootfs is set to /host
- Remove redundant binary path from docker run command
- Only pass flags after the image name
- Fix entrypoint to use single-dash Go flags
- This ensures -rootfs=/host is properly parsed by the server
Use adduser --system --group instead of separate groupadd/useradd commands.
This approach automatically sets secure defaults:
- Shell to /usr/sbin/nologin
- Home directory to /nonexistent
- Proper system user restrictions
Changed all references from sonic-ops-server to opsd-server to match
the actual binary name used in the project.
The script duplicated functionality already available in tools.mk.
Use 'make install-protoc && make proto' instead.
Removed specific reference to .azure-pipelines/api-service-ci.yml
which does not exist in the repository.
The getMachineConfPath() function was incorrectly looking for machine.conf
at /machine.conf instead of /host/machine.conf, which is the standard
location in SONiC systems.

This fix resolves:
- hostinfo unit test failures (TestGetMachineConfPath)
- e2e test failures (TestGetPlatformType_E2E)

All tests now pass and CI is clean.
- Mount host root filesystem (/) to /host in container instead of /host to /host
- Add --privileged flag for system operations and hardware access
- Enables proper access to SONiC filesystem and tools from containerized service
…to opsd-server

- Fix debian/rules to copy correct binary name (bin/opsd-server)
- Remove dh_auto_clean to prevent double cleaning that was causing build failures
- This fixes 'make deb' command to build packages successfully

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove debian/files from git tracking (build artifact, not source)
- Add debian build artifacts to .gitignore:
  - debian/files (package file list)
  - debian/*.substvars (substitution variables)
  - debian/*.debhelper.log (build logs)

These files are generated during package builds and should not be committed.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Remove all gRPC service implementations to create a clean baseline
with only reflection services. This provides a minimal foundation
for implementing specific gRPC services in subsequent branches.

Changes:
- Remove SystemInfo and FirmwareManagement services completely
- Remove all proto definitions and generated files
- Remove supporting internal packages (hostinfo, diskspace, paths)
- Remove service-specific tests and utilities
- Update server.go to register only reflection services
- Update Makefile to remove proto dependencies
- Rewrite documentation for minimal foundation
- Add CLAUDE.md files to .gitignore

The server now provides only:
- gRPC reflection services (grpc.reflection.v1.ServerReflection)
- TLS configuration support
- Basic configuration management
- Graceful shutdown handling

This creates the cleanest possible baseline for adding services
while maintaining all essential gRPC server infrastructure.
Remove debian/opsd-server.substvars as it is a build artifact generated
by dpkg-buildpackage and should not be committed to version control.

The .gitignore already correctly excludes *.substvars files.
Complete renaming of all project components to better reflect the
project's purpose as a standalone gNMI/gNOI/gNSI server foundation.

Changes:
- Binary renamed from opsd-server to sonic-gnmi-standalone
- Debian package renamed from sonic-ops-server to sonic-gnmi-standalone
- Docker image renamed to gnmi-standalone-test (for development/testing)
- Docker user changed from opsd to gnmi
- Environment variables changed from OPSD_* to GNMI_*
- Updated all documentation to use new names
- Added sonic-gnmi-standalone to .gitignore for debian builds
- Simplified ARCHITECTURE.md to be concise and factual

The server functionality remains unchanged - this is purely a naming
update to better communicate the project's broader scope beyond just
operations/upgrades.
Improve configuration consistency by using command-line flags for all
settings instead of mixing flags and environment variables.

Changes:
- Replace DISABLE_TLS env var with --no-tls flag in config.go
- Update all documentation to use new flag syntax
- Update TLS.md with consistent flag-based examples
- Update scripts to use --no-tls instead of DISABLE_TLS=true
- Update docker build script variable names
- Clean up unused imports
- Update code comments to reference new flag

Benefits:
- Consistent CLI interface using only flags
- Standard Unix CLI patterns (--no-tls)
- No mixed configuration sources
- Clearer help output and documentation

Example usage:
  ./bin/sonic-gnmi-standalone           # TLS enabled (default)
  ./bin/sonic-gnmi-standalone --no-tls  # TLS disabled
Move configuration logging from config.Initialize() to main() for better
visibility. Configuration is now logged at default level instead of requiring
-v=1, making it easier for users to verify startup settings.
The deployment script was looking for image 'gnmi:latest' but the
Makefile builds 'gnmi-standalone-test:latest'. Updated the script
to use the correct image name.
hdwhdw added 5 commits July 22, 2025 11:29
Update the deployment script to use current command-line flags instead
of stale environment variables:
- Use --addr flag instead of OPSD_ADDR environment variable
- Use --no-tls flag instead of NO_TLS environment variable
- Use --rootfs flag instead of -rootfs= argument
- Rename OPSD_ADDR variable to SERVER_ADDR for clarity

This aligns the deployment script with the current server configuration.
Update import paths from 'upgrade-service' to 'sonic-gnmi-standalone'
to match the project structure and fix binary compilation.
The build_deploy.sh script was incorrectly passing the binary name as an
argument to the container, causing the entrypoint to execute:
/usr/local/bin/sonic-gnmi-standalone sonic-gnmi-standalone --args...

This caused flag parsing to fail since "sonic-gnmi-standalone" was treated
as an invalid flag. Fixed by:

- Removing binary name from container arguments in build_deploy.sh
- Simplifying docker-entrypoint.sh to pass through all arguments directly
- Adding debug output to show actual container command being executed

The container now properly starts with --no-tls flag and listens on the
configured port without TLS certificate errors.
- Renamed build_deploy.sh to build_deploy_testonly.sh to indicate test-only purpose
- Changed container name from 'gnmi' to 'gnmi-standalone-testonly' to avoid conflicts
- Updated README.md to reference the new script name

This makes it clear that this deployment method is intended for testing only
and prevents accidental conflicts with production containers.
- Remove gnmi user creation and USER directive
- Run as root to fix permission issues with --rootfs flag
- Allows SetPackage to write files to host filesystem locations like /tmp
- Fixes permission denied errors when using sonic-gnoi CLI tool
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@hdwhdw hdwhdw force-pushed the upgrade-mops-setpackage branch from eddc2e1 to 1a46623 Compare July 25, 2025 20:40
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

hdwhdw and others added 8 commits July 25, 2025 15:48
This improves the package structure by placing server configuration
alongside the server implementation, making the API more discoverable
and the package structure more intuitive.

Co-Authored-By: Claude <[email protected]>
- Create ServerBuilder with fluent API for service configuration
- Add support for dynamic service enabling/disabling via builder pattern
- Disable gochecknoinits linter (init() functions are legitimate Go feature)
- Improve server structure with better method ordering (funcorder lint)
- Add infrastructure foundation for service registration

The builder provides a clean foundation for service-specific branches
to add their own service registrations without modifying core server logic.

Co-Authored-By: Claude <[email protected]>
Implemented a skeleton for the gNOI System service to prepare for
SetPackage implementation:

- Added github.com/openconfig/gnoi dependency for proto definitions
- Created pkg/gnoi/system package with System service implementation
- Implemented SetPackage method returning UNIMPLEMENTED status
- Added GRPCServer() accessor to pkg/server for service registration
- Registered System service in main server startup

The skeleton is fully functional and can be tested with grpcurl:
  grpcurl -plaintext localhost:50051 gnoi.system.System/SetPackage

This provides the foundation for implementing package installation
functionality using the previously added download package.
Implemented the SetPackage RPC with the following features and constraints:

- Only supports remote download via HTTP protocol
- Only supports MD5 hash verification
- Rejects direct package transfer (contents streaming not yet supported)
- Rejects non-HTTP protocols (SFTP, SCP, etc. not yet supported)
- Rejects SHA256/SHA512 hash methods (only MD5 currently supported)

The implementation:
- Downloads packages from HTTP URLs using the existing download package
- Verifies MD5 checksums using the existing checksum package
- Handles rootFS configuration for containerized deployments
- Uses temporary files with proper cleanup on errors
- Provides clear error messages for unsupported scenarios

Tested with grpcurl using streaming JSON messages:
- Successful download and installation with rootFS prefix
- Proper rejection of unsupported hash methods
- Proper rejection of direct transfer attempts
- Proper rejection of missing remote_download

This provides a foundation for package installation while clearly
communicating current limitations to clients.
- Create pkg/client/config package with shared Config struct for gRPC clients
- Create pkg/client/gnoi package with SystemClient and SetPackage method
- Implement sonic-gnoi CLI tool with cobra for system set-package command
- Support HTTP downloads with MD5 checksum validation
- Add proper help system and example usage
- Test successful with httpbin.org download

Note: Container permission issue with /tmp access requires running as root user
(fix should be implemented in infra branch)
Move gNOI System service implementation from pkg/gnoi/system/ to
pkg/server/gnoi/system/ to better organize server-side implementations
under the server package hierarchy.

Update main.go to use ServerBuilder pattern for clean service registration.

Co-Authored-By: Claude <[email protected]>
After rebasing on infrastructure branch, extend the ServerBuilder pattern
to register the gNOI System service. This demonstrates how service-specific
branches can cleanly extend the infrastructure foundation.

Changes:
- Add gNOI System service registration to ServerBuilder
- Update go.mod with gNOI dependencies
- Fix linting issues (godot, funcorder)
- Add newlines to files for proper formatting

The builder pattern now provides a clean foundation that can be extended
for additional services like gNMI, gNOI File, etc.

Co-Authored-By: Claude <[email protected]>
@hdwhdw hdwhdw force-pushed the upgrade-mops-setpackage branch from 1a46623 to 7b26dde Compare July 25, 2025 21:20
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Implement complete client-server integration testing with:

- TestGNOISystemSetPackageLoopback: Happy path with HTTP download and MD5 validation
- TestGNOISystemSetPackageLoopback_InvalidMD5: Error handling for checksum mismatches
- TestGNOISystemSetPackageLoopback_DownloadFailure: Network error handling
- TestGNOISystemSetPackageLoopback_AbsolutePath: RootFS path prefixing validation

The tests validate the complete workflow from client request through
server processing and back to client response, ensuring robust
integration between client and server components.

Changes:
- Add tests/loopback/gnoi_system_test.go with 4 comprehensive test cases
- Add make test-loopback target to Makefile
- Update .PHONY declarations
- All tests pass and integrate with CI pipeline

Co-Authored-By: Claude <[email protected]>
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants