@darkjive

This commit adds comprehensive ROCm 6.2 support and VRAM optimization for AMD GPUs, specifically targeting systems with 8-16GB VRAM.

Changes:

  • Updated webui.sh to use ROCm 6.2 instead of 5.7 for AMD GPUs

  • Added webui-user-rocm62.sh: Optimized launch script with:

    • PyTorch ROCm 6.2 installation command
    • PYTORCH_HIP_ALLOC_CONF for memory fragmentation prevention
    • Optimized command-line flags (--medvram, --opt-split-attention, etc.)
    • Detailed inline documentation
  • Added ROCM_VRAM_OPTIMIZATION.md: Comprehensive 400+ line guide covering:

    • Launch configuration and environment variables
    • WebUI settings optimization
    • Generation settings for different VRAM amounts
    • ControlNet optimization techniques
    • Recommended workflows for quality and performance
    • Extensive troubleshooting section
    • Performance benchmarks
  • Added README_ROCM.md: Quick start guide for ROCm setup
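
For reviewers who want a picture of the launch configuration described above, here is a minimal sketch of what a `webui-user-rocm62.sh`-style file could look like. It is not the script from this PR. The wheel index URL and the allocator value are assumptions based on PyTorch's usual conventions, and only the two flags named in this description are shown. Whether `expandable_segments` takes effect may depend on the PyTorch/ROCm build in use.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a ROCm 6.2 launch configuration.
# All values are illustrative assumptions, not the contents of the PR's script.

# Install PyTorch from the ROCm 6.2 wheel index (assumed URL, following
# PyTorch's usual https://download.pytorch.org/whl/<backend> pattern).
export TORCH_COMMAND="pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.2"

# Ask the caching allocator for expandable segments to limit fragmentation.
# Support for this option may vary between PyTorch/ROCm builds.
export PYTORCH_HIP_ALLOC_CONF="expandable_segments:True"

# Flags named in the PR description; the flags hidden behind "etc." are
# deliberately not guessed at here.
export COMMANDLINE_ARGS="--medvram --opt-split-attention"
```

A file like this would be sourced by `webui.sh` at startup, so the settings apply without editing the main launcher.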

Key optimizations:

  • Memory fragmentation prevention via expandable_segments
  • Optimal command-line arguments for 16GB VRAM
  • Two-phase workflow (generate at 512x512, upscale separately)
  • ControlNet low VRAM mode configuration
  • Batch processing best practices
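
The reasoning behind the two-phase workflow can be illustrated with simple arithmetic. The sketch below assumes a Stable Diffusion-style VAE that downsamples each image side by a factor of 8 (an assumption about the model, not something stated in this PR) and counts the latent positions the diffusion model has to process. The function name `latent_positions` is made up for illustration.

```shell
#!/usr/bin/env bash
# Rough illustration only: count latent positions for a given image size,
# assuming the VAE downsamples each side by 8 (typical for Stable Diffusion).
latent_positions() {
  local width=$1 height=$2
  echo $(( (width / 8) * (height / 8) ))
}

base=$(latent_positions 512 512)      # phase 1: generate at 512x512
direct=$(latent_positions 1024 1024)  # alternative: generate at 1024x1024 directly

echo "512x512 latent positions:   $base"
echo "1024x1024 latent positions: $direct"
echo "ratio: $(( direct / base ))x"
```

Under these assumptions, direct 1024x1024 generation handles four times as many latent positions as 512x512, and attention memory grows at least as fast as that count. Generating small and upscaling in a separate step keeps the diffusion pass in the cheaper regime.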

Benefits:

  • Fewer out-of-memory (OOM) errors on 16GB VRAM GPUs
  • Improved stability during long generation sessions
  • Higher-quality outputs via the two-phase generate-then-upscale workflow
  • Faster iteration with the recommended settings

@darkjive closed this on Nov 15, 2025
@darkjive deleted the claude/rocm-vram-optimization-01JFCpmEVdjjMz6RAcUf8Mmq branch on November 15, 2025 05:10