[Feature]: internal processing is 32 bit float but output is truncated to 16 bit integer - allow for 32 or 24 bit output #258

@dts350z

Description

Claude Code:

1. 24-bit PCM output support via soundfile

Files:

  • audio_separator\separator\common_separator.py: __init__() + write_audio_pydub()
  • audio_separator\separator\separator.py: __init__() + config dict
  • audio_separator\utils\cli.py: argument parser + Separator instantiation

Date: 2025-02-12

Problem:
Models output float32 (~24 bits of precision) but write_audio_pydub() truncates all output to int16 (16-bit) before writing. The existing write_audio_soundfile() path (--use_soundfile) has a bug where float data is assigned to an int16 interleave array without scaling by 32767, producing silent output. Documentation claimed 24-bit output but the library was always writing 16-bit.
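The silent-output bug described above can be illustrated without any audio library. This is a minimal sketch, assuming samples are floats in [-1.0, 1.0]; the helper names are hypothetical and the plain `int()` cast only mimics what happens when unscaled floats are stored in an int16 buffer.

```python
def float_to_int16_unscaled(samples):
    """Mimic storing floats in [-1.0, 1.0] in an int16 buffer without scaling."""
    # An integer cast truncates toward zero, so almost every sample becomes 0.
    return [int(s) for s in samples]


def float_to_int16_scaled(samples):
    """Scale to the int16 range first, then round and clamp."""
    return [max(-32768, min(32767, round(s * 32767))) for s in samples]


samples = [0.0, 0.25, -0.75, 0.999]
unscaled = float_to_int16_unscaled(samples)  # all zeros: effectively silence
scaled = float_to_int16_scaled(samples)      # usable 16-bit sample values
```

Without the scaling step every sample with magnitude below 1.0 collapses to zero, which matches the silent output reported for the --use_soundfile path.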

Changes:

common_separator.py

Added output_subtype from config (after line ~85):

self.output_subtype = config.get("output_subtype", "PCM_16")

Added early-return soundfile path in write_audio_pydub() (before the int16 conversion):

# For WAV/FLAC with PCM_24, use soundfile directly (pydub can't do 24-bit)
file_format = stem_path.lower().split(".")[-1]
if self.output_subtype == "PCM_24" and file_format in ("wav", "flac"):
    import soundfile as sf
    sf.write(stem_path, stem_source, self.sample_rate, subtype="PCM_24")
    self.logger.debug(f"Exported 24-bit {file_format.upper()} via soundfile to {stem_path}")
    return
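The format check in the snippet above could also be factored into a small helper. The following is only a sketch: `wants_pcm24_soundfile` and `LOSSLESS_PCM24_FORMATS` are hypothetical names, not part of audio-separator, and `os.path.splitext` is swapped in for the `split(".")` approach so that a path with no extension (for example a file literally named `wav`) is not mistaken for a WAV file.

```python
import os

# Formats for which the 24-bit early return applies (taken from the snippet above).
LOSSLESS_PCM24_FORMATS = {"wav", "flac"}


def wants_pcm24_soundfile(stem_path, output_subtype):
    """Return True when the stem should be written as 24-bit PCM via soundfile."""
    # splitext returns ("name", ".ext"); the extension is "" when there is none.
    ext = os.path.splitext(stem_path)[1].lower().lstrip(".")
    return output_subtype == "PCM_24" and ext in LOSSLESS_PCM24_FORMATS
```

With such a helper, the early return would reduce to a single `if wants_pcm24_soundfile(stem_path, self.output_subtype):` guard before the `sf.write(...)` call.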

separator.py

Added output_subtype="PCM_16" parameter to __init__(), stored as self.output_subtype, and included in the config dict passed to architecture-specific separators.

cli.py

Added --output_subtype argument (default "PCM_16") and passed it to the Separator constructor.
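How the new option might flow from the command line into the config dict can be sketched as follows. This is an illustration, not the actual cli.py code: `build_parser` is a hypothetical name, and restricting the values with `choices` is an assumption (the description above only states the default of "PCM_16").

```python
import argparse


def build_parser():
    """Build a parser exposing only the option discussed here."""
    parser = argparse.ArgumentParser(description="output_subtype sketch")
    parser.add_argument(
        "--output_subtype",
        default="PCM_16",
        choices=["PCM_16", "PCM_24"],  # assumption: only these two values are accepted
        help="PCM subtype for WAV/FLAC output (default: %(default)s)",
    )
    return parser


# Simulate `--output_subtype PCM_24` on the command line.
args = build_parser().parse_args(["--output_subtype", "PCM_24"])

# The value would then be carried in the config dict handed to the separator.
config = {"output_subtype": args.output_subtype}
```

Keeping "PCM_16" as the default means existing invocations behave exactly as before unless the flag is passed.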

Why: Preserves far more of the model output's precision. float32 carries a 24-bit significand, and 24-bit PCM lowers the quantization noise floor by roughly 48 dB relative to 16-bit (about 6 dB per bit). The existing pydub int16 path remains untouched for 16-bit output and lossy formats (MP3, M4A, etc.).
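The precision argument can be checked numerically. The sketch below quantizes a test sine wave to 16 and 24 bits and compares the resulting signal-to-noise ratios; the tone frequency, amplitude, and sample count are arbitrary choices for illustration, and the function names are hypothetical.

```python
import math


def quantize(sample, bits):
    """Round a float in [-1.0, 1.0] to a signed integer grid of `bits` bits and back."""
    full_scale = 2 ** (bits - 1) - 1
    return round(sample * full_scale) / full_scale


def snr_db(bits, n=4800, freq=997.0, rate=48000.0):
    """Signal-to-quantization-noise ratio in dB for a 0.9-amplitude sine."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = 0.9 * math.sin(2 * math.pi * freq * i / rate)
        err = quantize(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)


snr16 = snr_db(16)
snr24 = snr_db(24)
gain_db = snr24 - snr16  # expected to be near 8 bits * ~6 dB per bit
```

The difference between the two figures should land near 48 dB, which is the headroom lost when output is reduced to int16.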

Note: The broken write_audio_soundfile() path was not modified — instead, the fix uses soundfile.write() directly inside the existing write_audio_pydub() method with an early return for 24-bit lossless formats. This avoids the interleaving bugs in the soundfile path while keeping the change minimal.

Configuration: Controlled by separation_bit_depth in the [output_processing] section of UpMixery.ini. The value is passed to audio_separator_launcher.py via --bit-depth, which maps it to --output_subtype PCM_24 or PCM_16.
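The mapping from the INI setting to the command-line flag might look like the sketch below. It is an assumption-laden illustration: the launcher's real code is not shown in this issue, `subtype_args_from_ini` and `SUBTYPE_BY_BIT_DEPTH` are hypothetical names, and storing the setting as an integer (16 or 24) with a 16-bit fallback is assumed.

```python
import configparser

# Assumed mapping from bit depth to the subtype names used above.
SUBTYPE_BY_BIT_DEPTH = {16: "PCM_16", 24: "PCM_24"}


def subtype_args_from_ini(ini_text):
    """Translate separation_bit_depth into --output_subtype arguments."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Fall back to 16-bit when the section or option is missing.
    bit_depth = parser.getint(
        "output_processing", "separation_bit_depth", fallback=16
    )
    subtype = SUBTYPE_BY_BIT_DEPTH.get(bit_depth, "PCM_16")
    return ["--output_subtype", subtype]


ini_text = "[output_processing]\nseparation_bit_depth = 24\n"
cli_args = subtype_args_from_ini(ini_text)
```

Falling back to "PCM_16" for missing or unrecognized values keeps the previous behavior as the default.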

Labels: enhancement (New feature or request)
