State Management Database (SMD)

About

The State Management Database (SMD) is a robust service designed for monitoring, tracking, and managing hardware components in high-performance computing (HPC) environments. It performs dynamic inventory discovery, interrogates hardware controllers, and maintains real-time state and lifecycle data. SMD captures essential component details such as hardware status, logical roles, architecture, and resource capabilities, making this information accessible via REST queries and event-driven notifications. Additionally, it facilitates component grouping, partitioning, power operations, firmware management, and boot-time configuration. By maintaining a comprehensive hardware inventory and tracking system changes, SMD ensures efficient resource management, operational continuity, and streamlined troubleshooting across diverse HPC infrastructures.

SMD provides inventory management services for HPC systems based on BMC discovery and enumeration.

Overview

SMD is responsible for:

Discovering hardware inventory.
Tracking hardware state, logical roles, and enabled/disabled status.
Creating and managing component groups and partitions.
Storing and retrieving Redfish endpoint and component data.
Monitoring hardware state transitions via Redfish events.

Build & Install

Using GoReleaser

SMD employs GoReleaser for automated releases and build metadata tracking.

To build locally:

Set Environment Variables

export GIT_STATE=$(if git diff-index --quiet HEAD --; then echo 'clean'; else echo 'dirty'; fi)
export BUILD_HOST=$(hostname)
export GO_VERSION=$(go version | awk '{print $3}')
export BUILD_USER=$(whoami)
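
A forgotten export is easy to miss, so it can help to check the four variables before building. The following is a minimal sketch; check_build_env is a hypothetical helper name, not something shipped in the SMD repository.

```shell
# Sketch: fail early if any build metadata variable is missing or empty.
# check_build_env is a hypothetical helper, not part of the SMD repository.
check_build_env() {
    missing=0
    for name in GIT_STATE BUILD_HOST GO_VERSION BUILD_USER; do
        # Indirect lookup of the variable named by $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_build_env before the build step returns a non-zero status and names each unset variable on stderr.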

Install GoReleaser

Follow GoReleaser’s installation guide.

Build Locally

goreleaser release --snapshot --clean

Built binaries will be located in the dist/ directory.

Testing

Running the CT Tests in a Docker Compose Environment

Start services using the quick start guide

Use the quick start guide to start the services. See README

Edit openchami-svcs.yml and add ENABLE_DISCOVERY=true to the SMD container's environment variable list.

Create a docker compose file to start the Redfish Emulator. For an example, see computes.yml.

Start the docker compose containers. Use the directions in the quick start, but also add -f computes.yml to start the simulator containers.

For example:
```
docker compose -f base.yml -f postgres.yml -f jwt-security.yml -f haproxy-api-gateway.yml -f openchami-svcs.yml -f autocert.yml -f coredhcp.yml -f configurator.yml -f computes.yml up -d
```
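
Each compose file is passed with its own -f flag, so adding the simulator only means appending one more file. As an illustration, the sketch below assembles such a command from a list of file names; compose_up_cmd is a hypothetical helper that only prints the command, and it assumes file names without spaces.

```shell
# compose_up_cmd is a hypothetical helper: it prints (does not run) the
# docker compose command for the given compose files, in the given order.
compose_up_cmd() {
    cmd="docker compose"
    for file in "$@"; do
        cmd="$cmd -f $file"
    done
    echo "$cmd up -d"
}

# Print a command for a shortened, illustrative file list.
compose_up_cmd base.yml postgres.yml openchami-svcs.yml computes.yml
```
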
Build the SMD test image
```
make ct-image
```
Set environment variables
```
export COMPOSE_NAME=quickstart
export SMD_VERSION=v2.18.0
```
Note: SMD_VERSION is the version of the test image. The version of the running SMD container is in openchami-svcs.yml

Add nodes to SMD

This discovers hardware using the Redfish interfaces simulated by the Redfish Interface Emulator (RIE).

docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} smd-test smd-discover -n x0c0s1b0 -n x0c0s2b0 -n x0c0s3b0 -n x0c0s4b0

Run non-destructive tests

docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} smd-test test -t smoke -t 1-hardware-checks -t 2-non-disruptive -t 3-disruptive

Run destructive tests (Optional)

docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} smd-test test -t 4-destructive-initial -t 5-destructive-final

These tests will destroy some of SMD's data, which cannot be easily recovered.
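
The docker run invocations above differ only in the arguments passed to smd-test, so a small wrapper can cut down on typing. This is a sketch under the assumption that COMPOSE_NAME and SMD_VERSION are exported as shown earlier; smd_test is a hypothetical function name.

```shell
# smd_test is a hypothetical wrapper around the test image. It forwards its
# arguments to the smd-test entry point inside the container.
smd_test() {
    if [ -z "${COMPOSE_NAME:-}" ] || [ -z "${SMD_VERSION:-}" ]; then
        echo "COMPOSE_NAME and SMD_VERSION must be set" >&2
        return 1
    fi
    docker run -it --rm --network "${COMPOSE_NAME}_internal" \
        "smd-test:${SMD_VERSION}" smd-test "$@"
}
```

With this defined, the smoke tests would be started with: smd_test test -t smoke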

Miscellaneous Test Options

List the available tests

```
docker run -it --rm --network ${COMPOSE_NAME}_internal  smd-test:${SMD_VERSION}  smd-test list
```

Run the tests with tavern files from a local directory

```
docker run -it --rm --network ${COMPOSE_NAME}_internal -v $(pwd)/test/ct:/tests/ct  smd-test:${SMD_VERSION}  smd-test test -t smoke -t 1-hardware-checks -t 2-non-disruptive -t 3-disruptive
```
Note: This example can be run from the root directory of a clone of the SMD git repository.

Run the tavern tests directly with pytest

```
docker run -it --rm --network ${COMPOSE_NAME}_internal  smd-test:${SMD_VERSION} pytest -vvvv /tests/api/1-hardware-checks --rootdir=/ --tavern-global-cfg /opt/smd-test/libs/tavern_global_config_ct_test.yaml
```

Running

Runtime Options

Environment variables can be set for runtime configurations:

RF_MSG_HOST   # Kafka host:port:topic
SMD_PROXY     # socks5 proxy for Redfish endpoint interrogation
SMD_DBTYPE    # Database type (default: postgres)
SMD_DBNAME    # Database name (default: hmsds)
SMD_DBUSER    # Database user (default: hmsdsuser)
SMD_DBHOST    # Database hostname (e.g., cray-smd-postgres in Kubernetes)
SMD_DBPORT    # Database port (default: 5432)
SMD_DBPASS    # Database password
SMD_DBOPTS    # Additional DB parameters
LOGLEVEL      # Logging level (0-4)
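
To see which database settings are in effect once the defaults are applied, something like the following sketch can be used. smd_db_conninfo is a hypothetical helper, the localhost fallback for SMD_DBHOST is an assumption (no default is documented for it), and the output format is a generic libpq-style keyword/value string rather than anything SMD itself produces.

```shell
# smd_db_conninfo is a hypothetical helper. It prints the database settings
# as a libpq-style keyword/value string, applying the documented defaults
# for port, database name, and user. The password is left out on purpose.
smd_db_conninfo() {
    echo "host=${SMD_DBHOST:-localhost} port=${SMD_DBPORT:-5432} dbname=${SMD_DBNAME:-hmsds} user=${SMD_DBUSER:-hmsdsuser}"
}
```

The printed string can be inspected directly or handed to a PostgreSQL client that accepts connection strings.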

Running Outside Kubernetes

To run SMD locally with a PostgreSQL database:

Start a local PostgreSQL container:

sudo docker run --rm --name cray-smd-postgres -e POSTGRES_PASSWORD=hmsdsuser \
-e POSTGRES_USER=hmsdsuser -e POSTGRES_DB=hmsds -d -p 5432:5432 postgres:10.8

Initialize the database schema:

sudo docker run --name smd-init --link cray-smd-postgres:cray-smd-postgres \
-e SMD_DBHOST=cray-smd-postgres -e SMD_DBOPTS="sslmode=disable" -e SMD_DBPASS=hmsdsuser \
-d dtr.dev.cray.com:443/cray/cray-smd-init:latest

Start the SMD service:

sudo docker run --name smd --net host -p 27779:27779 -e SMD_DBHOST=127.0.0.1 \
-e SMD_DBPASS=hmsdsuser -e SMD_DBOPTS="sslmode=disable" \
-e SMD_PROXY="socks5://127.0.0.1:9999" -d dtr.dev.cray.com:443/cray/cray-smd:latest

Verify the service is running:

curl -k https://localhost:27779/smd/hsm/v2/groups
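
The service may need a few seconds before it answers, so a single curl right after startup can fail even when nothing is wrong. One way to handle this is a generic retry loop, sketched below; wait_until is a hypothetical helper, and the attempt count and delay are arbitrary choices.

```shell
# wait_until is a hypothetical helper: run a probe command up to ATTEMPTS
# times, sleeping DELAY seconds between attempts, and return success as
# soon as the probe succeeds.
# Usage: wait_until ATTEMPTS DELAY COMMAND [ARGS...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        # Only sleep if another attempt will follow.
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example probe, using the URL from the verification step above:
# wait_until 30 2 curl -k -s -f https://localhost:27779/smd/hsm/v2/groups
```
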

Using Proxy Mode to Get Access to Non-Local BMCs

Find the machine you wish to discover and ssh to it with dynamic port forwarding enabled on the local port you gave for SMD_PROXY:

ssh -D 9999 <user>@<host>

Leave this window open until you are finished with the discovery.
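
The port given to ssh -D has to match the port in the SMD_PROXY value. To avoid typing it twice, the port can be derived from the proxy URL, as in this sketch. proxy_port is a hypothetical helper and assumes the URL ends in :port with nothing after it.

```shell
# proxy_port is a hypothetical helper: print the port of a
# scheme://host:port proxy URL such as the SMD_PROXY value.
proxy_port() {
    # Strip everything up to and including the last colon.
    echo "${1##*:}"
}

proxy_port "socks5://127.0.0.1:9999"
```

The result can then be passed as the argument to ssh -D.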

Double check /etc/hosts for the BMC IP addresses assigned to the nodes you wish to discover, in case they are non-standard.

Discovering Nodes Once HSM is Properly Running

If the proxy has been set up (or you are running locally on an SMS), you can then create endpoints for every BMC you wish to discover using their native BMC IP addresses.

NOTE: If you need particular NIDs and Roles, you will need to set up xname entries in /hsm/v2/Defaults/NodeMaps BEFORE discovery OR patch the NID and/or Role fields after discovery:

See: https://connect.us.cray.com/confluence/display/HSOS/HSM+Documentation+for+SPS2+-+Setting+Default+NIDs

Example creation and discovery of preview system computes

These are the usual computes found on a standard preview system, but you can easily adapt this example for whatever is in /etc/hosts. Just make sure you use the BMC xname and a raw IP (if using a socks5 proxy):

curl -k -d '{"ID": "x0c0s28b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.5", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s26b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.6", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s24b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.7", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s21b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.8", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints

Note

The above path assumes you are running Docker in a bare container (see above). Otherwise, use 'https:///apis/smd/hsm/v2/...' instead of 'https://localhost:27779/hsm/...'.

Note

Also note that inventory discovery is a read-only operation and should not do anything to the endpoints besides walk them via GETs. The "RediscoverOnUpdate":true field is important because it will automatically kick off inventory discovery.
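
When there are more than a handful of BMCs, the curl commands above can be generated from a list of xname and IP pairs instead of being written out by hand. The following is a minimal sketch: endpoint_payload and register_endpoints are hypothetical helpers, and SMD_URL and BMC_PASSWORD are assumed variable names introduced only for this example.

```shell
# endpoint_payload is a hypothetical helper: print the JSON body for one
# Redfish endpoint. Arguments: BMC xname, BMC IP, user, password.
# Values are inserted verbatim, so they must not contain quotes or backslashes.
endpoint_payload() {
    printf '{"ID": "%s", "RediscoverOnUpdate": true, "Hostname": "%s", "User": "%s", "Password": "%s"}\n' \
        "$1" "$2" "$3" "$4"
}

# register_endpoints reads "xname ip" pairs from stdin and POSTs one
# endpoint per line. SMD_URL (for example https://localhost:27779) and
# BMC_PASSWORD are assumed to be set by the caller.
register_endpoints() {
    while read -r xname ip; do
        endpoint_payload "$xname" "$ip" root "$BMC_PASSWORD" |
            curl -k -d @- -H "Content-Type: application/json" \
                -X POST "$SMD_URL/hsm/v2/Inventory/RedfishEndpoints"
    done
}
```

A file with one "xname ip" pair per line (such as "x0c0s28b0 10.4.0.5") can then be redirected into register_endpoints.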

API Overview

Please refer to Architecture and Design Details for an API overview.

Additional API Documentation

The complete HSM (smd) API documentation is included in the Cray API docs. This is the nightly-generated version. Content is generated in an automated fashion from the current swagger.yaml file.

http://web.us.cray.com/~ekoen/cray-portal/public

Latest detailed API usage examples:

https://github.com/OpenCHAMI/smd/blob/master/docs/examples.adoc (current)

Latest swagger.yaml (if you would prefer to use the OpenAPI viewer of your choice):

https://github.com/OpenCHAMI/smd/blob/master/api/swagger_v2.yaml (current)
