Description
Describe the bug
File "/mnt/d/wsl/paddlespeech/PaddleSpeech/paddlespeech/audio/utils/tensor_utils.py", line 251, in st_reverse_pad_list
index = index * seq_mask
TypeError: (InvalidType) Type promotion only support calculations between floating-point numbers and between complex and real numbers. But got different data type x: int64, y: bool. (at /paddle/paddle/phi/common/type_promotion.h:228)
To Reproduce
The error occurred while testing speech-to-text (ASR). The backend test code is as follows:
@app.post("/api/asr")
async def asr(audio_file: UploadFile = File(...)):
    start_time = time.time()
    temp_audio_path = os.path.join(OUTPUT_DIR, f"asr_test_{uuid.uuid4()}.wav")
    with open(temp_audio_path, "wb") as f:
        f.write(await audio_file.read())
    try:
        # Run the ASR transcription
        result = asr_executor(
            audio_file=temp_audio_path,
            force_yes=True,
            lang="zh",
            model="conformer_wenetspeech"
        )
        processing_time = time.time() - start_time
        print(f"ASR processing time: {processing_time:.2f}s")
        return {
            "status": "success",
            "text": result,
            "processing_time": processing_time
        }
    except Exception as e:
        print(f"ASR processing error: {str(e)}")
        return {"status": "error", "message": str(e)}, 500
    finally:
        # Clean up the temporary file
        if os.path.exists(temp_audio_path):
            os.remove(temp_audio_path)
Expected behavior
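The boolean seq_mask should be explicitly cast to an integer type before the multiplication (True becomes 1, False becomes 0), so that the multiplication can proceed normally.
Proposed change:
File to modify: PaddleSpeech\paddlespeech\audio\utils\tensor_utils.py
index = index * seq_mask  ==>  index = index * seq_mask.astype(index.dtype)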
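
To illustrate what the proposed cast is expected to produce, here is a minimal sketch. It uses NumPy as a stand-in for Paddle tensors, since both spell the cast as `astype(index.dtype)`. Only `index` and `seq_mask` come from the traceback. The function name `masked_reverse_index` and the way the mask and index are built from sequence lengths are assumptions made for the example. Note that NumPy promotes bool to int64 silently, so this sketch shows the intended result of the fix, not the TypeError itself.

```python
import numpy as np


def masked_reverse_index(ys_lens):
    """Build a reversed position index and zero out padded positions.

    Hypothetical helper: the construction below is an assumption about how
    `index` and `seq_mask` might be derived, not the PaddleSpeech source.
    """
    ys_lens = np.asarray(ys_lens, dtype=np.int64)
    max_len = int(ys_lens.max())
    index_range = np.arange(0, max_len, 1, dtype=np.int64)  # (max_len,)
    seq_len_expand = ys_lens[:, None]                       # (batch, 1)

    # Boolean mask: True where the position lies inside the sequence.
    seq_mask = seq_len_expand > index_range                 # (batch, max_len)

    # Reversed positions; padded positions come out negative.
    index = (seq_len_expand - 1) - index_range              # (batch, max_len)

    # The proposed fix: cast the bool mask to the index dtype before
    # multiplying, so both operands are int64 (True -> 1, False -> 0).
    return index * seq_mask.astype(index.dtype)


print(masked_reverse_index([3, 3, 1]))
```

With lengths `[3, 3, 1]`, the negative entries in the padded positions of the last row are multiplied by 0, so the result has dtype int64 and contains no negative indices.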