Skip to content

Conversation

@DaMandal0rian
Copy link
Contributor

@DaMandal0rian DaMandal0rian commented Oct 23, 2024

User description

  • function to prune images
  • argument for network
  • argument for plot_size
  • argument for cache_percentage
  • function to restart compose
  • better handling of ssh connection with retries
  • improve sed logic

PR Type

Enhancement


Description

  • Added retry logic to the run_command function, allowing SSH commands to be retried multiple times with a delay.
  • Enhanced the modify_env_file function to update additional environment variables such as plot_size, cache_percentage, and network.
  • Introduced new functions for Docker operations: docker_compose_restart to restart Docker services, docker_cleanup to prune Docker images and containers, and handle_node to manage node operations.
  • Updated the command-line argument parsing to include new options like network, plot_size, and cache_percentage, providing more flexibility in node management.

Changes walkthrough 📝

Relevant files
Enhancement
manage_subspace.py
Enhance SSH command execution and Docker management           

scripts/launch-nodes/manage_subspace.py

  • Added retry logic to run_command function for SSH command execution.
  • Enhanced modify_env_file to update multiple environment variables.
  • Introduced new functions for Docker operations:
    docker_compose_restart, docker_cleanup, and handle_node.
  • Updated argument parsing to include new options like network,
    plot_size, and cache_percentage.
  • +127/-123

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    - function to prune images
    - argument for network
    - argument for plot_size
    - argument for cache_percentage
    - function to restart compose
    - better handling of ssh connection with retries
    - improve sed logic
    @github-actions
    Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 Security concerns

    Command Injection:
    The code uses string interpolation for constructing shell commands in several places, which could lead to command injection if the inputs are not properly sanitized or validated. This is particularly concerning in the modify_env_file function where environment variables are directly inserted into shell commands.

    ⚡ Recommended focus areas for review

    Error Handling
    The error handling in the modify_env_file function might not correctly handle partial failures where some commands succeed and others fail. This could lead to inconsistent states if not all environment variables are updated successfully.

    Logging Levels
    The logging levels used across different functions are inconsistent. For example, some errors are logged at the INFO level instead of ERROR, which could lead to important errors being overlooked in logs.

    Command Injection
    The use of string interpolation with external input in shell commands, such as in the modify_env_file function, could potentially lead to command injection vulnerabilities if inputs are not properly sanitized.

    @github-actions
    Copy link

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Possible bug
    Add error handling for potential decoding issues in command outputs

    Refactor the run_command function to handle the decoding of stdout and stderr within
    a try-except block to catch and handle potential decoding errors.

    scripts/launch-nodes/manage_subspace.py [42-45]

     stdin, stdout, stderr = client.exec_command(command)
     stdout.channel.recv_exit_status()
    -output = stdout.read().decode('utf-8')
    -error = stderr.read().decode('utf-8')
    +try:
    +    output = stdout.read().decode('utf-8')
    +    error = stderr.read().decode('utf-8')
    +except UnicodeDecodeError as e:
    +    logger.error(f"Decoding error: {e}")
    +    raise
    Suggestion importance[1-10]: 8

    Why: Introducing error handling for potential decoding issues in command outputs is a valuable enhancement. It ensures that any Unicode decoding errors are caught and logged, preventing unexpected crashes and aiding in debugging.

    8
    Add error handling for SSH connections to improve robustness

    Add error handling for the ssh_connect function within the main function to ensure
    that the SSH connection is established before proceeding with node handling.

    scripts/launch-nodes/manage_subspace.py [201-202]

     client = ssh_connect(timekeeper_node['host'], timekeeper_node['user'], timekeeper_node['ssh_key'])
    +if client is None:
    +    logger.error(f"Failed to connect to timekeeper node {timekeeper_node['host']}")
    +    return
     handle_node(client, timekeeper_node, args.subspace_dir, args.release_version, pot_external_entropy=args.pot_external_entropy, network=args.network, prune=args.prune, restart=args.restart)
    Suggestion importance[1-10]: 7

    Why: Adding a check to ensure the SSH connection is established before proceeding with node handling improves the robustness of the code. It prevents attempts to handle nodes with a failed connection, which could lead to errors.

    7
    Ensure the protocol_version_hash is valid before using it to update configurations

    Implement a check to ensure that the protocol_version_hash is not empty before using
    it to update the bootstrap node to prevent potential issues with incorrect
    configuration.

    scripts/launch-nodes/manage_subspace.py [247-250]

    -if protocol_version_hash:
    +if protocol_version_hash and protocol_version_hash.strip():
         logger.info(f"Connecting to the bootstrap node {bootstrap_node['host']}...")
         client = ssh_connect(bootstrap_node['host'], bootstrap_node['user'], bootstrap_node['ssh_key'])
         handle_node(client, bootstrap_node, args.subspace_dir, args.release_version, genesis_hash=protocol_version_hash, pot_external_entropy=args.pot_external_entropy, network=args.network, prune=args.prune, restart=args.restart)
    +else:
    +    logger.error("Protocol version hash is empty; skipping bootstrap node update.")
    Suggestion importance[1-10]: 6

    Why: The suggestion to check if protocol_version_hash is not empty before using it adds a layer of validation, preventing potential issues with incorrect configurations. This enhances the reliability of the update process.

    6

    @DaMandal0rian DaMandal0rian force-pushed the feat/launch-script-update branch from 62b1143 to 57e84f0 Compare October 23, 2024 06:16
    update readme file
    
    update readme file
    @DaMandal0rian DaMandal0rian force-pushed the feat/launch-script-update branch from 57e84f0 to b53e008 Compare October 23, 2024 06:50

    ```bash
    source subspace_env/bin/activate
    ```

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    These shouldn't be removed

    --pot_external_entropy random_value --network gemini-3h --plot-size 10G --cache-percentage 15

    # prune images
    python manage_subspace.py --config nodes.toml --release_version gemini-3h-2024-sep-17 --subspace_dir /home/ubuntu/subspace/subspace --prune

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Would this prune the release specified in the release_version or it will prune all other releases except the one specified?

    @DaMandal0rian DaMandal0rian merged commit 9e20b1b into main Oct 23, 2024
    1 check passed
    @DaMandal0rian DaMandal0rian deleted the feat/launch-script-update branch October 23, 2024 11:52
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    3 participants