The State Management Database (SMD) is a robust service designed for monitoring, tracking, and managing hardware components in high-performance computing (HPC) environments. It performs dynamic inventory discovery, interrogates hardware controllers, and maintains real-time state and lifecycle data. SMD captures essential component details such as hardware status, logical roles, architecture, and resource capabilities, making this information accessible via REST queries and event-driven notifications. Additionally, it facilitates component grouping, partitioning, power operations, firmware management, and boot-time configuration. By maintaining a comprehensive hardware inventory and tracking system changes, SMD ensures efficient resource management, operational continuity, and streamlined troubleshooting across diverse HPC infrastructures.
- SMD provides inventory management services for HPC systems based on BMC discovery and enumeration.
SMD is responsible for:
- Discovering hardware inventory.
- Tracking hardware state, logical roles, and enabled/disabled status.
- Creating and managing component groups and partitions.
- Storing and retrieving Redfish endpoint and component data.
- Monitoring hardware state transitions via Redfish events.
SMD employs GoReleaser for automated releases and build metadata tracking.
To build locally:
export GIT_STATE=$(if git diff-index --quiet HEAD --; then echo 'clean'; else echo 'dirty'; fi)
export BUILD_HOST=$(hostname)
export GO_VERSION=$(go version | awk '{print $3}')
export BUILD_USER=$(whoami)
Follow GoReleaser’s installation guide.
goreleaser release --snapshot --clean
Built binaries will be located in the dist/ directory.
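The build steps rely on git, go, and goreleaser being available on the PATH. As a minimal sketch, a preflight check could report anything that is missing before the build starts. The `missing_tools` helper is hypothetical and is not part of the SMD repository.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the tool cannot be found
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used by the local build steps in this section.
missing=$(missing_tools git go goreleaser)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "missing build tools:" $missing
else
    echo "all build tools found"
fi
```

The check only reports; it does not stop the build. Adding `exit 1` in the first branch would make it a hard gate.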
- Start services using the quick start guide
Use the quick start guide to start the services. See README.
Edit `openchami-svcs.yml` and add `ENABLE_DISCOVERY=true` to the SMD container's environment variable list.
Create a docker compose file to start the Redfish Emulator. For an example, see computes.yml.
Start the docker compose containers. Use the directions in the quick start, but also add `-f computes.yml` to start the simulator containers.
For example:
docker compose -f base.yml -f postgres.yml -f jwt-security.yml -f haproxy-api-gateway.yml -f openchami-svcs.yml -f autocert.yml -f coredhcp.yml -f configurator.yml -f computes.yml up -d
- Build the SMD test image
make ct-image
- Set environment variables
export COMPOSE_NAME=quickstart
export SMD_VERSION=v2.18.0
Note: `SMD_VERSION` is the version of the test image. The version of the running SMD container is in `openchami-svcs.yml`.
- Add nodes to SMD
This discovers hardware using the Redfish interfaces simulated by the Redfish Interface Emulator (RIE).
docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} smd-test smd-discover -n x0c0s1b0 -n x0c0s2b0 -n x0c0s3b0 -n x0c0s4b0
- Run non-destructive tests
docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} smd-test test -t smoke -t 1-hardware-checks -t 2-non-disruptive -t 3-disruptive
- Run destructive tests (Optional)
docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} smd-test test -t 4-destructive-initial -t 5-destructive-final
These tests will destroy some of SMD's data, which cannot be easily recovered.
To list the available tests:
```
docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} smd-test list
```
To run the tests from a local checkout, mount the test/ct directory into the container:
```
docker run -it --rm --network ${COMPOSE_NAME}_internal -v $(pwd)/test/ct:/tests/ct smd-test:${SMD_VERSION} smd-test test -t smoke -t 1-hardware-checks -t 2-non-disruptive -t 3-disruptive
```
Note: This example can be run from the root directory of a clone of the SMD git repository.
To run pytest directly against a single test directory:
```
docker run -it --rm --network ${COMPOSE_NAME}_internal smd-test:${SMD_VERSION} pytest -vvvv /tests/api/1-hardware-checks --rootdir=/ --tavern-global-cfg /opt/smd-test/libs/tavern_global_config_ct_test.yaml
```
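The test invocations above all share the same `docker run` prefix and differ only in the command run inside the container. As a sketch, a small wrapper can assemble the full command line for review. `smd_test_cmd` is a hypothetical helper name, and it prints the command rather than executing it.

```shell
#!/bin/sh
# Values from the "Set environment variables" step; these defaults apply only if unset.
COMPOSE_NAME=${COMPOSE_NAME:-quickstart}
SMD_VERSION=${SMD_VERSION:-v2.18.0}

# Hypothetical helper: print the docker command that runs the given
# in-container command (all arguments) in the SMD test image.
smd_test_cmd() {
    printf 'docker run -it --rm --network %s_internal smd-test:%s' \
        "$COMPOSE_NAME" "$SMD_VERSION"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Print two of the commands from this section.
smd_test_cmd smd-test list
smd_test_cmd smd-test test -t smoke -t 1-hardware-checks
```

Printing first keeps the destructive test tags from being run by accident. Once the line looks right, it can be pasted into a shell.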
Environment variables can be set for runtime configurations:
RF_MSG_HOST # Kafka host:port:topic
SMD_PROXY # socks5 proxy for Redfish endpoint interrogation
SMD_DBTYPE # Database type (default: postgres)
SMD_DBNAME # Database name (default: hmsds)
SMD_DBUSER # Database user (default: hmsdsuser)
SMD_DBHOST # Database hostname (e.g., cray-smd-postgres in Kubernetes)
SMD_DBPORT # Database port (default: 5432)
SMD_DBPASS # Database password
SMD_DBOPTS # Additional DB parameters
LOGLEVEL # Logging level (0-4)
To run SMD locally with a PostgreSQL database:
- Start a local PostgreSQL container:
sudo docker run --rm --name cray-smd-postgres -e POSTGRES_PASSWORD=hmsdsuser \
    -e POSTGRES_USER=hmsdsuser -e POSTGRES_DB=hmsds -d -p 5432:5432 postgres:10.8
- Initialize the database schema:
sudo docker run --name smd-init --link cray-smd-postgres:cray-smd-postgres \
    -e SMD_DBHOST=cray-smd-postgres -e SMD_DBOPTS="sslmode=disable" -e SMD_DBPASS=hmsdsuser \
    -d dtr.dev.cray.com:443/cray/cray-smd-init:latest
- Start the SMD service:
sudo docker run --name smd --net host -p 27779:27779 -e SMD_DBHOST=127.0.0.1 \
    -e SMD_DBPASS=hmsdsuser -e SMD_DBOPTS="sslmode=disable" \
    -e SMD_PROXY="socks5://127.0.0.1:9999" -d dtr.dev.cray.com:443/cray/cray-smd:latest
- Verify the service is running:
curl -k https://localhost:27779/smd/hsm/v2/groups
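A freshly started container may need a few seconds before it answers requests, so the verification step can be wrapped in a retry loop. The sketch below uses a hypothetical `retry` helper that is not part of SMD. The attempt count and delay in the usage comment are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds.
# Arguments: max attempts, seconds to sleep between attempts, then the command.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1    # give up after the last allowed attempt
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Usage against the verification URL above (not executed here):
#   retry 10 3 curl -k -s -f -o /dev/null https://localhost:27779/smd/hsm/v2/groups
retry 3 0 true && echo "ready"
```

With `-f`, curl exits non-zero on HTTP error responses, so the loop keeps trying until the service returns a successful status.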
Find the machine you wish to discover and ssh to it with dynamic port forwarding enabled on the local port you gave for SMD_PROXY:
ssh -D 9999 <user>@<machine>
Leave this window open until you are finished with the discovery.
Double-check /etc/hosts for the BMC IP addresses assigned to the nodes you wish to discover, in case they are non-standard.
If the proxy has been set up (or you are running locally on an SMS), you can then create endpoints for every BMC you wish to discover using their native BMC IP addresses.
NOTE: If you need particular NIDs and Roles, you must set up xname entries in /hsm/v2/Defaults/NodeMaps BEFORE discovery, or patch the NID and/or Role fields after discovery.
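As a rough sketch of what a NodeMaps request body might look like, the snippet below assembles one in shell. Everything about the payload shape is an assumption: this document does not give the field names (`ID`, `NID`, `Role`) or the enclosing `NodeMaps` array, so check them against swagger_v2.yaml before use. The node xnames and NID values are made-up examples, and `nodemap_entry` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print one NodeMaps entry.
# Arguments: node xname, NID, Role.
# ASSUMPTION: the field names and the "NodeMaps" wrapper below are not taken
# from this document; confirm them in swagger_v2.yaml.
nodemap_entry() {
    printf '{"ID": "%s", "NID": %d, "Role": "%s"}' "$1" "$2" "$3"
}

# Example xnames and NIDs are illustrative only.
payload=$(printf '{"NodeMaps": [%s, %s]}' \
    "$(nodemap_entry x0c0s28b0n0 1 Compute)" \
    "$(nodemap_entry x0c0s26b0n0 2 Compute)")
echo "$payload"
```

The body could then be sent to /hsm/v2/Defaults/NodeMaps with the same curl pattern used for the Redfish endpoints. Using POST for that request is also an assumption here.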
Example creation and discovery of preview system computes
These are the usual computes found on a standard preview system, but you can easily adapt this example for whatever is in /etc/hosts. Just make sure you use the BMC xname and a raw IP (if using a socks5 proxy):
curl -k -d '{"ID": "x0c0s28b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.5", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s26b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.6", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s24b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.7", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s21b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.8", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
Note: The above path assumes you are running docker in a bare container (see above). Otherwise, use 'https:///apis/smd/hsm/v2/...' instead of 'https://localhost:27779/hsm/...'.
Note: Inventory discovery is a read-only operation and should not do anything to the endpoints besides walk them via GETs. The "RediscoverOnUpdate":true field is important because it automatically kicks off inventory discovery.
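The four requests above differ only in the BMC xname and IP address, so the request bodies can be generated from a small table. The following is a sketch only: `endpoint_payload` is a hypothetical helper, the credentials are the same placeholders used above, and the values are not JSON-escaped, so it assumes they contain no quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON body for one Redfish endpoint.
# Arguments: BMC xname, BMC IP address, user, password.
endpoint_payload() {
    printf '{"ID": "%s", "RediscoverOnUpdate": true, "Hostname": "%s", "User": "%s", "Password": "%s"}\n' \
        "$1" "$2" "$3" "$4"
}

# xname/IP pairs from the example above; one body is printed per line.
while read -r xname ip; do
    endpoint_payload "$xname" "$ip" root somePassword
done <<'EOF'
x0c0s28b0 10.4.0.5
x0c0s26b0 10.4.0.6
x0c0s24b0 10.4.0.7
x0c0s21b0 10.4.0.8
EOF
```

To submit a body instead of printing it, pipe the helper's output to curl with `-d @-`, which reads the request data from standard input, and keep the other options from the commands above.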
Please refer to Architecture and Design Details for an API overview.
The complete HSM (smd) API documentation is included in the Cray API docs. This is the nightly-generated version. Content is generated in an automated fashion from the current swagger.yaml file.
http://web.us.cray.com/~ekoen/cray-portal/public
Latest detailed API usage examples:
https://github.com/OpenCHAMI/smd/blob/master/docs/examples.adoc (current)
Latest swagger.yaml (if you would prefer to use the OpenAPI viewer of your choice):
https://github.com/OpenCHAMI/smd/blob/master/api/swagger_v2.yaml (current)
- Architecture and Design Details
- API Definitions
- Full API Documentation
- HPE’s Original SMD Documentation
For advanced configuration, troubleshooting, and database management, refer to additional documentation in the docs/ directory.