Layout parsing returns the wrong seal position, and there is no image field #2821

Closed
seanzhang-zhichen opened this issue Jan 12, 2025 · 2 comments
@seanzhang-zhichen

Docker image version:
paddlex3.0.0b2-paddlepaddle3.0.0b2-gpu-cuda11.8-cudnn8.6-trt8.5: Pulling from paddlex/paddlex
Digest: sha256:b05c8ecb3cb659a960baeef74245fa9e1d390cf3875a476ae846b9ed81ca31a0
Status: Image is up to date for ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex:paddlex3.0.0b2-paddlepaddle3.0.0b2-gpu-cuda11.8-cudnn8.6-trt8.5
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex:paddlex3.0.0b2-paddlepaddle3.0.0b2-gpu-cuda11.8-cudnn8.6-trt8.5

Deployment command:

docker run --gpus all --name paddlex_layout_parsing -v $PWD:/paddle -p 8003:8080 \
    --shm-size=8g -it -d \
    ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex:paddlex3.0.0b2-paddlepaddle3.0.0b2-gpu-cuda11.8-cudnn8.6-trt8.5 \
    /bin/bash -c "paddlex --serve --pipeline layout_parsing"

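Problem 1: the returned position is wrong

Test image:

[image: 18]

Content cropped using the returned image coordinates:
[image: seal_image]

Problem 2: there is no image field

Test code: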

import base64
import requests
from PIL import Image, ImageDraw
import io

API_URL = "http://10.250.10.182:8003/layout-parsing"  # Service URL
image_path = "./data/18.jpg"

# Base64-encode the local image
with open(image_path, "rb") as file:
    image_bytes = file.read()
    image_data = base64.b64encode(image_bytes).decode("ascii")

payload = {
    "file": image_data,  # Base64-encoded file content
    "fileType": 1,
    "useImgOrientationCls": True,
    "useImgUnwrapping": True,
    "useSealTextDet": True,
}

# Call the API
response = requests.post(API_URL, json=payload)

# Process the data returned by the API
assert response.status_code == 200
result = response.json()["result"]
print("\nDetected layout elements:")

# Open the original image
original_image = Image.open(image_path)
print("Image size:", original_image.size)  # Print the original image size
print("Image mode:", original_image.mode)   # Print the original image mode
draw = ImageDraw.Draw(original_image)  # Create an object that can draw on the image



for res in result["layoutParsingResults"]:
    for ele in res["layoutElements"]:
        print(ele)
        if ele["label"] == "seal":  # If the element is a seal
            print("===============================")
            print(ele.keys())
            print("bbox:", ele["bbox"])
            print("label:", ele["label"])
            print("text:", repr(ele["text"]))
            image_data = ele["image"]
            # Decode the Base64-encoded image data and save it as a JPEG file
            seal_image_data = base64.b64decode(image_data)  # Decode the Base64 data
            seal_image = Image.open(io.BytesIO(seal_image_data))  # Convert to an image object

            # Save the image
            seal_image.save("seal_image.jpg", "JPEG")  # Save in JPEG format
            print("Seal image saved as 'seal_image.jpg'")

Error:

177490234375], 'label': 'seal', 'text': '发货专用章烽火通信科技股份有限公司', 'layoutType': 'double'}
===============================
dict_keys(['bbox', 'label', 'text', 'layoutType'])
bbox: [1038.466064453125, 89.07820129394531, 1319.3720703125, 357.3177490234375]
label: seal
text: '发货专用章烽火通信科技股份有限公司'
Traceback (most recent call last):
  File "D:\workplace\zhangzc\mxgent-ai\ocr\test.py", line 47, in <module>
    image_data = ele["image"]
                 ~~~^^^^^^^^^
KeyError: 'image'
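
The output above shows that the seal element only carries the keys bbox, label, text and layoutType, so indexing ele["image"] raises. A minimal sketch of a defensive lookup follows, so the loop can keep going and report which elements lack the field. The helper name decode_element_image is hypothetical, and the sketch assumes that an image field, when present, holds a Base64-encoded image as the test code expects; the thread does not confirm this.

```python
import base64
import io

from PIL import Image


def decode_element_image(element):
    """Return the element's image as a PIL image, or None if it has none.

    Hypothetical helper: assumes element["image"], when present, is a
    Base64-encoded image file (an assumption taken from the test code).
    """
    encoded = element.get("image")  # None instead of KeyError when absent
    if not encoded:
        return None
    return Image.open(io.BytesIO(base64.b64decode(encoded)))
```

In the loop, this would replace the direct ele["image"] access; a None result can then be logged together with ele.keys() instead of aborting the script.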
@Sunting78
Collaborator

Sunting78 commented Jan 14, 2025

Hello, do you get the same result when you test locally, without starting the service?

@seanzhang-zhichen
Author

Hello, do you get the same result when you test locally, without starting the service?

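I also ran into the problem of not being able to crop the correct region from the original image with a local deployment, but it went away after I switched to OpenCV for the cropping. The computer's resolution probably affects Pillow's crop.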
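
The cause was not confirmed in this thread. One possibility worth checking, offered here only as an assumption, is EXIF orientation: Pillow's Image.open does not apply the EXIF Orientation tag, whereas OpenCV's cv2.imread applies it by default, so the same bbox can land on different pixels in the two libraries. The sketch below normalizes the orientation before cropping with Pillow. The helper name crop_bbox_upright is hypothetical, and it assumes bbox is [x1, y1, x2, y2] in pixel coordinates of the upright image.

```python
from PIL import Image, ImageOps


def crop_bbox_upright(image, bbox):
    """Apply the EXIF orientation (if any), then crop bbox from the image.

    Hypothetical helper: assumes bbox = [x1, y1, x2, y2] in pixels of the
    upright image, which this thread does not confirm.
    """
    upright = ImageOps.exif_transpose(image)  # copy, rotated if EXIF says so
    box = tuple(int(round(v)) for v in bbox)  # Pillow expects integer pixels
    return upright.crop(box)
```

If the crop is still off after this step, the bbox may refer to an image that the service preprocessed (the request enables useImgOrientationCls and useImgUnwrapping), which is likewise an untested assumption.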

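As for the missing image field, I did not test it any further. Finally, one more thing: the save_to_json method is really a bit odd. When only a file name is passed in, it automatically appends "res" to it, but when a file name with a multi-level folder path is passed in, it does not. I suggest changing the API so that it simply returns the JSON; I cannot understand why it was designed this way.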
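
Until the path handling is clarified, one workaround is to serialize the result yourself so that the output path is exactly the one you pass in. This is a minimal sketch with a hypothetical helper write_json; it assumes the data is already a JSON-serializable dict, such as the result object parsed from the HTTP response in the test code, and it does not use any PaddleX API.

```python
import json
from pathlib import Path


def write_json(data, path):
    """Write data as UTF-8 JSON to exactly the given path.

    Hypothetical helper, not part of PaddleX. Missing parent folders are
    created; the file name is used as given, with nothing appended.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps Chinese text readable in the output file
    target.write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return target
```

For example, write_json(result, "output/seals/18.json") would write to that path whether or not the folders already exist.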
