Skip to content

Add foundational infrastructure for Standalone gNMI server #444

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 38 commits into
base: master
Choose a base branch
from

Conversation

hdwhdw
Copy link
Contributor

@hdwhdw hdwhdw commented Jul 9, 2025

Dependencies

Parent PR: None (merges to master)


Why I did it

This PR establishes a minimal gRPC server foundation as the baseline for implementing SONiC gNMI/gNOI/gNSI services in a standalone, image-independent manner.

SONiC needs a clean, modular approach to implementing gRPC services that:

  • Works independently of SONiC image versions and host components
  • Can be delivered as a standalone container to any SONiC device
  • Provides a minimal foundation without premature service implementations
  • Supports gradual, reviewable addition of specific services

Current gNMI implementations are tightly coupled to specific SONiC versions. This standalone approach eliminates version dependencies by providing a self-contained, containerized solution.

How I did it

This PR provides a minimal gRPC server foundation:

Minimal Server Implementation:

  • Basic gRPC server with configurable TLS support
  • Only gRPC reflection services enabled (no business logic)
  • Clean baseline ready for service additions
  • Modular architecture supporting incremental development

Core Infrastructure:

  • cmd/server: Main server application with graceful shutdown
  • internal/config: Configuration management with CLI flags and environment variables
  • pkg/server: Minimal gRPC server implementation with reflection

Development Infrastructure:

  • Comprehensive Makefile with CI pipeline support
  • Build, test, and code quality tooling
  • Docker support with security-focused design
  • Debian packaging for distribution

Documentation:

  • README with getting started guide
  • API reference for adding services
  • Architecture documentation
  • Development workflow guides

How to verify it

Build and Test:

cd sonic-gnmi-standalone
make ci                 # Run full CI validation
make build             # Build the server
make test              # Run tests

Run the Server:

# Development mode
DISABLE_TLS=true ./bin/sonic-gnmi-standalone --addr=:50051

# Production mode with TLS
./bin/sonic-gnmi-standalone --tls-cert=server.crt --tls-key=server.key

# Container mode (with host filesystem mounted)
./bin/sonic-gnmi-standalone --rootfs=/mnt/host --addr=:50051

Test gRPC Server:

# List available services (should only show reflection)
grpcurl -plaintext localhost:50051 list

# Expected output:
# grpc.reflection.v1.ServerReflection
# grpc.reflection.v1alpha.ServerReflection

Which release branch to backport (provide reason below if selected)

Not applicable - this is a new standalone component. The service is designed to be delivered as an independent container that can work with any SONiC version without requiring backporting.

Description for the changelog

Add minimal gRPC server foundation (sonic-gnmi-standalone) for implementing standalone gNMI/gNOI/gNSI services

Link to config_db schema for YANG module changes

Not applicable - this service operates independently of config_db and does not modify existing YANG schemas.

A picture of a cute animal (not mandatory but encouraged)

     /\_/\
    ( o.o )
     > ^ <   "Starting fresh with a clean foundation\! 🎯"

Additional Context:

This PR has been restructured to provide a minimal gRPC server foundation without any service implementations. The approach has changed from the original multi-PR series to:

  1. This PR: Minimal gRPC server with only reflection services
  2. Future PRs: Individual service implementations can be added as needed

Key changes from the original approach:

  • Renamed from upgrade-service to sonic-gnmi-standalone to reflect broader scope
  • Removed all service implementations to provide cleanest possible baseline
  • Binary renamed to sonic-gnmi-standalone for clarity
  • Focused on being a foundation for any gRPC service, not just upgrades

This minimal approach allows for:

  • Clean review of just the infrastructure
  • Flexible addition of services based on actual needs
  • Clear separation between foundation and business logic
  • Easy testing and validation of the base server

hdwhdw added 4 commits July 8, 2025 12:58
This commit establishes the foundation for the SONiC upgrade service with:

- Core gRPC server with TLS support and reflection
- SystemInfo service with platform detection and disk space monitoring
- Protocol buffer definitions for all services
- Build system with comprehensive Makefile and tool management
- Security hardening with golangci-lint and TLS configuration
- Debian packaging support for production deployment
- Testing framework with e2e test structure
- Container compatibility with path resolution
- CI pipeline setup and coverage reporting
- Development tooling and Docker support

The infrastructure provides a solid foundation for adding feature-specific
functionality in subsequent branches while maintaining security and
code quality standards.
This commit adds comprehensive project documentation:

- README.md: Complete getting started guide and API overview
- ARCHITECTURE.md: System design and component architecture
- TLS.md: TLS configuration and security setup
- cmd/README.md: Command-line tools documentation
- internal/README.md: Internal packages overview
- pkg/README.md: Public server packages documentation
- cmd/test/diskspace/: Disk space analysis test utility

This documentation provides the foundation for understanding and
contributing to the SONiC upgrade service project.
- Complete gRPC API documentation for SystemInfo and FirmwareManagement services
- Request/response message specifications with protobuf definitions
- Usage examples with grpcurl commands
- Error handling and status code reference
- Configuration and deployment guidelines
- Development and testing instructions
- Change validate-coverage to test-coverage in ci target
- Coverage is still reported but no longer blocks CI
- Allows all branches to pass CI regardless of coverage percentage
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

hdwhdw added 10 commits July 9, 2025 12:55
- Remove vendor and model fields from GetPlatformTypeResponse
- Update GetPlatformIdentifierString to return only platform identifier
- Update all tests to reflect simplified API
- Regenerate protobuf files
- All tests pass and CI is green
- Completely rewrite hostinfo package to be minimal
- Remove all vendor/model parsing and complex logic
- Extract platform from machine.conf using simple field priority
- PlatformInfo now only contains ConfigMap and Platform fields
- GetPlatformIdentifierString simply returns the platform string
- Simplified all tests to match the minimal implementation
- Returns raw platform strings like 'x86_64-mlnx_msn4600c-r0'
- Removed ~400 lines of complex vendor/model extraction code
- All tests pass, CI is green, coverage maintained
- Default TLS to disabled for easier development/testing
- Add --enable-tls flag to optionally enable TLS
- Pass DISABLE_TLS environment variable to container
- Show TLS status in deployment completion message
- Pass -rootfs=/host to opsd-server command
- This fixes 'machine.conf not found' error in SONiC containers
- The /host mount point contains the actual host filesystem
- Go flags use single dash, not double dash
- Changed --addr to -addr and --shutdown-timeout to -shutdown-timeout
- This fixes the rootfs parameter not being parsed correctly
- Changed from /host/machine.conf to /machine.conf
- The rootfs path is applied by paths.ToHost function
- This fixes platform detection when rootfs is set to /host
- Remove redundant binary path from docker run command
- Only pass flags after the image name
- Fix entrypoint to use single-dash Go flags
- This ensures -rootfs=/host is properly parsed by the server
Use adduser --system --group instead of separate groupadd/useradd commands.
This approach automatically sets secure defaults:
- Shell to /usr/sbin/nologin
- Home directory to /nonexistent
- Proper system user restrictions
Changed all references from sonic-ops-server to opsd-server to match
the actual binary name used in the project.
The script duplicated functionality already available in tools.mk.
Use 'make install-protoc && make proto' instead.
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Removed specific reference to .azure-pipelines/api-service-ci.yml
which does not exist in the repository.
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

The getMachineConfPath() function was incorrectly looking for machine.conf
at /machine.conf instead of /host/machine.conf, which is the standard
location in SONiC systems.

This fix resolves:
- hostinfo unit test failures (TestGetMachineConfPath)
- e2e test failures (TestGetPlatformType_E2E)

All tests now pass and CI is clean.
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

The deployment script was looking for image 'gnmi:latest' but the
Makefile builds 'gnmi-standalone-test:latest'. Updated the script
to use the correct image name.
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

hdwhdw added 2 commits July 22, 2025 11:29
Update the deployment script to use current command-line flags instead
of stale environment variables:
- Use --addr flag instead of OPSD_ADDR environment variable
- Use --no-tls flag instead of NO_TLS environment variable
- Use --rootfs flag instead of -rootfs= argument
- Rename OPSD_ADDR variable to SERVER_ADDR for clarity

This aligns the deployment script with the current server configuration.
Update import paths from 'upgrade-service' to 'sonic-gnmi-standalone'
to match the project structure and fix binary compilation.
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

The build_deploy.sh script was incorrectly passing the binary name as an
argument to the container, causing the entrypoint to execute:
/usr/local/bin/sonic-gnmi-standalone sonic-gnmi-standalone --args...

This caused flag parsing to fail since "sonic-gnmi-standalone" was treated
as an invalid flag. Fixed by:

- Removing binary name from container arguments in build_deploy.sh
- Simplifying docker-entrypoint.sh to pass through all arguments directly
- Adding debug output to show actual container command being executed

The container now properly starts with --no-tls flag and listens on the
configured port without TLS certificate errors.
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

- Renamed build_deploy.sh to build_deploy_testonly.sh to indicate test-only purpose
- Changed container name from 'gnmi' to 'gnmi-standalone-testonly' to avoid conflicts
- Updated README.md to reference the new script name

This makes it clear that this deployment method is intended for testing only
and prevents accidental conflicts with production containers.
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

- Remove gnmi user creation and USER directive
- Run as root to fix permission issues with --rootfs flag
- Allows SetPackage to write files to host filesystem locations like /tmp
- Fixes permission denied errors when using sonic-gnoi CLI tool
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@hdwhdw hdwhdw requested review from Ryangwaite and saiarcot895 July 25, 2025 01:04
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

This improves the package structure by placing server configuration
alongside the server implementation, making the API more discoverable
and the package structure more intuitive.

Co-Authored-By: Claude <[email protected]>
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

- Create ServerBuilder with fluent API for service configuration
- Add support for dynamic service enabling/disabling via builder pattern
- Disable gochecknoinits linter (init() functions are legitimate Go feature)
- Improve server structure with better method ordering (funcorder lint)
- Add infrastructure foundation for service registration

The builder provides a clean foundation for service-specific branches
to add their own service registrations without modifying core server logic.

Co-Authored-By: Claude <[email protected]>
@mssonicbld
Copy link

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants