Releases: deiteris/voice-changer

b2309

24 Aug 22:28
8b35e1d

This is a hotfix for b2307.

Fixes voice changer loading with CUDA version on Windows.

b2307

24 Aug 12:07
f25d4fb

This is a minor maintenance release.

Changes

  • For PyTorch models, PyTorch has been updated to v2.4.0 in the following versions:
    • CUDA version. Now also uses CUDA 12+ and cuDNN 9+, which may offer better performance. This has also reduced the prebuilt version size.
    • CPU version.
    • macOS version (ARM only).
  • In DirectML version, torch-directml has been updated to 0.2.4.dev240815. No performance change is expected since the new operators are not used by the voice changer.
  • For ONNX models, ONNX runtime has been updated to 1.19 in the following versions:
    • CUDA version. Now also uses CUDA 12+ and cuDNN 9+, which may offer better performance.
    • DirectML version. DirectML runtime and opset support have been updated which may offer better performance.
    • CPU version.
    • macOS version. CoreML runtime now has support for more operators which are used by the voice changer and may offer better performance.

b2300

11 Aug 16:45
edd2341

This is a minor maintenance release.

Fixes FP16 support detection on older NVIDIA GPUs (Maxwell GPUs and earlier).
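As a rough illustration, FP16 support on NVIDIA GPUs can be inferred from the CUDA compute capability (Maxwell is 5.x). This is a hypothetical sketch, not the project's actual detection code; in practice the capability tuple would come from `torch.cuda.get_device_capability()`.

```python
def supports_fp16(compute_capability: tuple[int, int]) -> bool:
    """Infer FP16 support from a CUDA compute capability tuple.

    Maxwell GPUs (5.x) and earlier lack practical FP16 support;
    Pascal (6.x) and newer expose it.
    """
    major, _minor = compute_capability
    return major >= 6
```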

b2298

11 Aug 13:15
958a25f

Changes

  • "export to onnx" button has been replaced with the new "Convert to ONNX" setting in Advanced setting. The behavior of this setting has also changed. When this option is checked, uploaded or selected models (if not converted before) will be automatically converted to ONNX in the same slot. When unchecked, the original PyTorch model will be used. Note that this option does not replace the original model, but adds an ONNX variant to it. If uploaded model is already in ONNX format, it will be loaded as is.
    image
  • Performance monitor now shows the model type including the used runtime (f.e., onnxRVC for ONNX RVC models and pyTorchRVCv2 for original RVC v2 models).
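The conversion decision described above can be sketched roughly as follows; the function name and the side-by-side `.onnx` path convention are illustrative assumptions, not the project's actual API.

```python
from pathlib import Path

def needs_onnx_conversion(model_path: str, convert_to_onnx: bool) -> bool:
    """Decide whether a model slot needs an ONNX variant.

    Mirrors the described behavior: already-ONNX uploads load as-is,
    the original PyTorch file is kept, and a slot that was converted
    earlier is not converted again.
    """
    path = Path(model_path)
    if not convert_to_onnx:
        return False                   # use the original PyTorch model
    if path.suffix == ".onnx":
        return False                   # already ONNX: load as is
    onnx_variant = path.with_suffix(".onnx")
    return not onnx_variant.exists()   # skip if converted before
```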

Improvements

  • WASAPI no longer requires matching the sample rates of input and output devices; the audio is automatically resampled by the system audio mixer. However, matching sample rates is still recommended when possible.

Fixes

  • ASIO channel selection now correctly selects an input/output channel.
  • "Operation in progress" dialog will now appear when changing long-running options in Advanced settings.
  • Fixed a potential bug where a previously generated FP16 ONNX model could fail to load after it had been removed.
  • Model settings (Pitch, Index, etc.) no longer reset to the last saved values after changing GPU.

Experimental

  • When using inference on CPU, the contentvec embedder model will be quantized using INT8 precision. This significantly reduces RAM usage and slightly reduces CPU usage. Currently, there's no option to opt out from this behavior.
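A minimal sketch of what dynamic INT8 quantization of an embedder might look like in PyTorch; the actual layer set and API used by the voice changer may differ.

```python
import torch

def quantize_for_cpu(embedder: torch.nn.Module) -> torch.nn.Module:
    # Dynamic quantization stores Linear weights as INT8 and dequantizes
    # them on the fly, cutting RAM usage at a small accuracy cost.
    return torch.quantization.quantize_dynamic(
        embedder, {torch.nn.Linear}, dtype=torch.qint8
    )
```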

Miscellaneous

  • Updated WebUI npm dependencies.

Known issues

  • WDM-KS and true ASIO devices produce crackling audio. For the lowest delay, the current workaround is to use WASAPI or FlexASIO.

b2277

06 Aug 18:31
270a673

Improvements

  • Input and output channels can now be specified for ASIO devices. For FlexASIO and ASIO4ALL, usually no changes are required since they normally work with the default channels.

Fixes

  • Fixed model merging.
  • Excluded GeForce MX series GPUs from FP16 since they do not support it.

b2271

04 Aug 14:08
8bcfcdb

Important change

The voice changer will now constantly utilize the CPU/GPU while voice conversion is enabled. This addresses an issue where the voice changer could lag after a short period of silence or show inconsistent performance, as observed with NVIDIA GPUs in previous versions.

New

  • "perf" metric now includes a graph. The graph shows data points over last 5 seconds (if chunk size is less than 100ms) or more. Performance graph allows you to see if there're performance fluctuations and adjust chunk size depending on the usage over time.
    performance_graph
  • Most settings now include tooltips with explanations. Just hover over text with a dotted underline.
  • In the DirectML version, when changing settings that may have a negative impact on performance, a notification will appear near GPU, asking you to switch between CPU and GPU.
  • Introduced Just-in-Time (JIT) compilation for PyTorch models. This slightly improves performance (faster response on first start, slightly lower latency) and reduces memory usage in some cases; however, it currently increases model loading time. To opt out of JIT compilation, set the "Disable JIT compilation" option in Advanced settings to "on". Note that JIT is currently not available for DirectML devices, so this option has no effect for them.
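A hedged sketch of how such an opt-out might gate TorchScript compilation; the function and flag names are illustrative, not the project's actual code.

```python
import torch

def prepare_model(model: torch.nn.Module,
                  disable_jit: bool,
                  is_directml: bool) -> torch.nn.Module:
    # JIT is skipped when the user opts out, or on DirectML devices
    # where TorchScript compilation is currently unavailable.
    if disable_jit or is_directml:
        return model
    return torch.jit.script(model)
```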

Changes

  • Increased the input volume slider range to 250%.

Experimental

  • The version for NVIDIA includes 2 bat files to lock or reset GPU core and memory clocks, to address possible inconsistent performance. Note that frequency locking is not supported by all NVIDIA GPUs; you can verify that the option took effect with GPU-Z. Both scripts must be run with administrator rights to take effect:
    • force_gpu_clocks.bat - queries the GPU core and memory clocks and locks them to the reported maximums.
    • reset_gpu_clocks.bat - resets the GPU core and memory frequencies.
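For reference, the nvidia-smi invocations such scripts presumably wrap might look like the following. This is an assumption about their behavior, not the scripts' verbatim contents; flag support varies by GPU.

```python
def lock_clock_commands(core_mhz: int, mem_mhz: int) -> list[list[str]]:
    # Lock core and memory clocks to a fixed value, as force_gpu_clocks.bat
    # does with the reported maximums (which can be queried via
    # nvidia-smi --query-gpu=clocks.max.graphics,clocks.max.memory
    #            --format=csv,noheader,nounits).
    return [
        ["nvidia-smi", "--lock-gpu-clocks", f"{core_mhz},{core_mhz}"],
        ["nvidia-smi", "--lock-memory-clocks", f"{mem_mhz},{mem_mhz}"],
    ]

def reset_clock_commands() -> list[list[str]]:
    # Undo the locks, as reset_gpu_clocks.bat does.
    return [
        ["nvidia-smi", "--reset-gpu-clocks"],
        ["nvidia-smi", "--reset-memory-clocks"],
    ]
```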

b2245

31 Jul 11:59
45063ad

Changes

  • Changed the background color of the performance stats block.

  • The buf metric has been moved to perf.

  • perf now indicates performance by highlighting the inference time in green, yellow or red when a certain condition is met: Stable, Potentially unstable / High usage, or Unstable.

    The logic of these conditions is the following:

    • Stable - your inference speed is sufficient for the selected chunk size. Usually, no action is required.
    • Potentially unstable - your inference speed is sufficient, but audio may become unstable when other processes run concurrently. Operation in this range also incurs high GPU usage. Increasing Chunk size or reducing Extra is recommended.
    • Unstable - your inference speed is insufficient for the selected chunk size. Increase Chunk size or reduce Extra.
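The classification above could be sketched with illustrative thresholds; the app's actual cutoffs are not documented here and may differ.

```python
def classify_perf(inference_ms: float, chunk_ms: float) -> str:
    # Illustrative thresholds: inference must finish well within one chunk
    # to leave headroom for other processes running concurrently.
    if inference_ms >= chunk_ms:
        return "unstable"               # can't keep up with the chunk size
    if inference_ms >= 0.8 * chunk_ms:
        return "potentially unstable"   # little headroom, high GPU usage
    return "stable"
```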

Experimental

  • In client audio mode, the additional audio buffering used to compensate for lag has been removed. This should reduce audio latency in this mode (up to 25% less latency) and not cause any issues during stable operation, but please report any issues you encounter.

b2241

30 Jul 21:09
80e3800

Changes

  • Increased out and mon volume range to 400%.
  • The performance stats refresh rate is now limited to 100ms. This may reduce CPU usage caused by the UI when Chunk size is smaller than 100ms.

Fixes

  • Fixed the default value of the SilenceFront option on first startup.
  • When the model is already in ONNX format, export to ONNX now shows a UI error instead of downloading an empty file.

Improvements

  • The downloader now shows the name of the file that failed to download.
  • Minor UI performance improvement when reporting performance stats.

b2224

28 Jul 20:06
9c7a642

Breaking changes

S.Thresh. has changed its measurement unit, which is incompatible with previous settings. Change the setting to apply the updated measurement unit.

Changes

  • TUNE option is renamed to PITCH.
  • S.Thresh. option is renamed to In. Sens. (Input Sensitivity).
  • In. Sens. now uses dB to measure volume level.
  • Added measurement unit label to In. Sens.
  • "vol" now uses dB to report volume level.
  • Fixed typo in Crossfade size unit (ms to s).
  • Added "ping" metric to show ping in client mode.
  • Rename "res" to "total"
  • Calculate total latency based on Chunk, Crossfade size and "ping".
  • Monitor gain has been moved to the GAIN group.
  • GAIN option is renamed to VOL.
  • VOL unit of measurement changed to percentage.
  • Limited VOL range to reasonable values.
  • "buf" is now a static value.

Fixes

  • Volume is correctly reported in server mode.
  • Volume is correctly reported when passthru is enabled.

Improvements

  • Reduced CPU usage caused by the web app in client audio mode (up to 4% usage reduction with a Ryzen 7 5800H).

b2215

27 Jul 11:23
e21dbb9

This is a minor maintenance release.

If the app crashes during startup, it will no longer close instantly.