[S2T]XXXXy Data type mismatch error in the speech recognition code #4098

@qin121212

Describe the bug
File "/mnt/d/wsl/paddlespeech/PaddleSpeech/paddlespeech/audio/utils/tensor_utils.py", line 251, in st_reverse_pad_list
index = index * seq_mask
TypeError: (InvalidType) Type promotion only support calculations between floating-point numbers and between complex and real numbers. But got different data type x: int64, y: bool. (at /paddle/paddle/phi/common/type_promotion.h:228)

To Reproduce
The error occurred while testing speech-to-text. The backend test code is as follows:
@app.post("/api/asr")
async def asr(audio_file: UploadFile = File(...)):
    start_time = time.time()

    # Save the uploaded audio to a temporary file
    temp_audio_path = os.path.join(OUTPUT_DIR, f"asr_test_{uuid.uuid4()}.wav")
    with open(temp_audio_path, "wb") as f:
        f.write(await audio_file.read())

    try:
        # Run the ASR conversion
        result = asr_executor(
            audio_file=temp_audio_path,
            force_yes=True,
            lang="zh",
            model="conformer_wenetspeech"
        )

        processing_time = time.time() - start_time
        print(f"ASR processing time: {processing_time:.2f}s")

        return {
            "status": "success",
            "text": result,
            "processing_time": processing_time
        }
    except Exception as e:
        print(f"ASR processing error: {str(e)}")
        return {"status": "error", "message": str(e)}, 500
    finally:
        # Clean up the temporary file
        if os.path.exists(temp_audio_path):
            os.remove(temp_audio_path)

Expected behavior
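Before the multiplication, the boolean seq_mask should be explicitly cast to an integer type (True becomes 1, False becomes 0) so that the multiplication can proceed normally.
Proposed change:
File to modify: PaddleSpeech\paddlespeech\audio\utils\tensor_utils.py
index = index * seq_mask  ->  index = index * seq_mask.astype(index.dtype)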
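
To illustrate what the proposed cast computes, here is a minimal NumPy sketch. It is not the PaddleSpeech implementation: the helper name `masked_reverse_index` and the way `index` and `seq_mask` are built from sequence lengths are assumptions made for illustration only. NumPy promotes bool to int64 silently, so this does not reproduce the Paddle TypeError; it only shows that the explicit `astype(index.dtype)` cast zeroes out the padded positions as intended.

```python
import numpy as np


def masked_reverse_index(lens: np.ndarray) -> np.ndarray:
    """Hypothetical helper: reversed position indices with padded slots set to 0."""
    max_len = int(lens.max())
    index_range = np.arange(max_len, dtype=np.int64)   # shape (max_len,)
    lens_col = lens.astype(np.int64)[:, None]           # shape (batch, 1)

    # Boolean mask: True where the position lies inside the real sequence.
    seq_mask = lens_col > index_range                   # dtype bool
    # Reversed index per position; negative in the padded region.
    index = (lens_col - 1) - index_range                # dtype int64

    # The fix proposed in this issue: cast the mask to the index dtype
    # (True -> 1, False -> 0) before multiplying.
    return index * seq_mask.astype(index.dtype)


result = masked_reverse_index(np.array([3, 1]))
print(result)
```

For lengths 3 and 1, the first row keeps its reversed indices and the second row keeps only its single valid position, with all padded entries forced to 0.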
