Nine-square grid image captcha test using resnet18 or PP-HGNetV2-B4. A mixed-bag project, for reference only; used to pass the mys captcha.


kafuneri/captcha-tools

 
 


About

Code forked from luguoyixiazi/test_nine. This fork primarily adds Docker build configurations for convenience. Intended for personal testing only; usability is not guaranteed.

Usage

1. Install Docker

Numerous tutorials are available online. The recommended installation command is:

bash <(curl -sSL https://linuxmirrors.cn/docker.sh)

2. Build the Image (Optional)

git clone https://github.com/kafuneri/captcha-tools.git
cd captcha-tools
docker compose up -d # Build and run the image

PS: If using a server in mainland China, please configure a proxy or change package sources accordingly when building the image.
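
Once the container is running, it can help to confirm that something is listening before wiring it into other tools. The following is a minimal sketch, not part of the repository: `service_is_listening` is a hypothetical helper, and port 9645 is taken from the API call example later in this README, on the assumption that the container serves on that port of the host.

```python
import socket


def service_is_listening(host: str = "127.0.0.1", port: int = 9645,
                         timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Assumption: the service listens on port 9645, as in the README's
    API call example. This only checks the TCP port, not the API itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling `service_is_listening()` right after `docker compose up -d` may return False for a short while as the container starts, so a brief wait before checking is reasonable.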

3. Use the Pre-built Image

Using docker-compose.yaml:

version: '3'
services:
  captcha-tools:
    image: kafuneri/captcha-tools:latest # Use captcha-tools:arm64 for arm64 devices
    container_name: captcha-tools
    network_mode: host  # Set to host network mode
    restart: always

4. Integrate with MihoyoBBSTools

Replace the captcha.py file in your MihoyoBBSTools installation with the captcha.py provided in this repository.
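
The replacement can be done by hand; for repeated setups, a small script that keeps a backup may be convenient. This is a sketch under assumptions: `install_captcha` is a hypothetical helper, and both directory arguments are placeholders for wherever this repository and MihoyoBBSTools are checked out on your machine. It also assumes `captcha.py` sits directly in the directory you pass as `tools_dir`.

```python
import shutil
from pathlib import Path


def install_captcha(repo_dir: Path, tools_dir: Path) -> Path:
    """Copy captcha.py from repo_dir into tools_dir, keeping a backup.

    Assumptions: repo_dir is a checkout of this repository and tools_dir
    is the MihoyoBBSTools directory that holds captcha.py.
    """
    source = repo_dir / "captcha.py"
    target = tools_dir / "captcha.py"
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found")
    if target.exists():
        # Keep the original file so the change can be reverted.
        shutil.copy2(target, target.with_name("captcha.py.bak"))
    shutil.copy2(source, target)
    return target
```

To revert, copy `captcha.py.bak` back over `captcha.py`.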

Nine-Square Grid Test Code

This project is for learning and communication purposes only. Do not use it for commercial purposes. You are responsible for any consequences.

Reference Projects

Model and V4 dataset: https://github.com/taisuii/ClassificationCaptchaOcr

API: https://github.com/ravizhan/geetest-v3-click-crack

Running Steps

1. Install Dependencies

(Optional) a. If training with PaddlePaddle, you also need to install paddlex and the image classification module. Refer to the project https://github.com/PaddlePaddle/PaddleX for installation instructions.

(* Required!) b. Create a model folder in the project root directory and place the model file(s) inside. Name them resnet18.onnx or PP-HGNetV2-B4.onnx. The default model used is PP-HGNetV2-B4.onnx. If using ResNet, set use_v3_model to False in the code, as the model inputs/outputs differ (you may need to modify the code yourself).

pip install -r requirements.txt
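
The naming convention from step b can be checked before starting the service. The sketch below is illustrative only: `resolve_model_path` is a hypothetical helper, and the mapping from `use_v3_model` to a file name is inferred from the description above (True selects PP-HGNetV2-B4, False selects ResNet18); the repository's own code may select the model differently.

```python
from pathlib import Path

# File names as given in the README. The mapping from the flag to a
# file is an assumption based on the text, not read from the code.
MODEL_FILES = {True: "PP-HGNetV2-B4.onnx", False: "resnet18.onnx"}


def resolve_model_path(use_v3_model: bool = True,
                       project_root: Path = Path(".")) -> Path:
    """Return the expected ONNX file inside <project_root>/model."""
    path = project_root / "model" / MODEL_FILES[use_v3_model]
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected model file at {path}; create the model folder "
            "and place the ONNX file there."
        )
    return path
```

Running `resolve_model_path()` from the project root fails early with a clear message if the model folder or file is missing.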

2. Prepare Your Own Dataset (V3 and V4 differ) (Optional)

a. Training ResNet18 (Optional)
  • Refer to the referenced project above for dataset details. However, that project uses a V4 dataset. V3 lacks a demo; adapt as needed. Using a V4 dataset to train for V3 without code modification results in poor accuracy.
  • The main difference is image dimensions. V4 APIs provide two images: a target image and a nine-square grid. V3 combines them, requiring target cropping. V3 target images have low clarity. V4 grid images, after removing black borders, are 100x86 pixels. V3 grid images are 112x112. It's unclear what transformations V4 applies compared to V3; modifying preprocessing is necessary.
b. Training PP-HGNetV2-B4 (Optional)

This model was chosen arbitrarily from Paddle. The dataset format is as follows. If using a V4 dataset for V3 training, consider applying more data augmentation/transformations.

    dataset
    ├─images    # Path for all images
    ├─label.txt # Label file path, format per line: <index> <space> <class_name>, e.g., 15 Globe
    ├─train.txt # Training images list, format per line: <image_path> <space> <class_index>, e.g., images/001.jpg 0
    └─Validation and test sets follow the same format
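
As a quick sanity check of the layout above, the label and list files can be parsed with a few lines of Python. This is a minimal sketch assuming the whitespace-separated format described in the tree; `read_labels` and `read_split` are hypothetical helper names, not functions from this repository or from PaddleX.

```python
from pathlib import Path


def read_labels(label_file: Path) -> dict[int, str]:
    """Parse label.txt lines of the form '<index> <class_name>'."""
    labels: dict[int, str] = {}
    for line in label_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        # Split once from the left so class names may contain spaces.
        index, name = line.split(maxsplit=1)
        labels[int(index)] = name
    return labels


def read_split(list_file: Path) -> list[tuple[str, int]]:
    """Parse train.txt-style lines of the form '<image_path> <class_index>'."""
    samples: list[tuple[str, int]] = []
    for line in list_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        # Split once from the right so paths may contain spaces.
        image_path, class_index = line.rsplit(maxsplit=1)
        samples.append((image_path, int(class_index)))
    return samples
```

Checking that every class index in `train.txt` appears in `label.txt` is a cheap way to catch labeling mistakes before training.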
c. To crop V3 images, use crop_image_v3 in crop_image.py. For V4, use crop_image. Write your own cropping script as needed.
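
If you write your own cropping script, the core step is computing the nine cell rectangles. The sketch below is not the repository's `crop_image` or `crop_image_v3`; `grid_boxes` is a hypothetical helper that assumes the grid image divides evenly into cells with no borders or gaps. V4 images have black borders according to the notes above, so they would need trimming first.

```python
def grid_boxes(width: int, height: int, rows: int = 3,
               cols: int = 3) -> list[tuple[int, int, int, int]]:
    """Return (left, upper, right, lower) boxes for an evenly divided grid.

    Boxes are listed row by row. Assumption: cells tile the whole image
    with no borders or gaps between them.
    """
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            upper = r * height // rows
            right = (c + 1) * width // cols
            lower = (r + 1) * height // rows
            boxes.append((left, upper, right, lower))
    return boxes
```

The tuples use the same (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so each box can be passed to it directly. A 336x336 grid, for example, yields nine 112x112 cells, matching the V3 cell size mentioned above, provided the V3 grid really is laid out that way.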

3. Train the Model (Optional)

  • To train ResNet18, run python train.py
  • To train PP-HGNetV2-B4, run python train_paddle.py

4. Convert the Model to ONNX (Optional)

5. Start the FastAPI Service (Requires a trained ONNX model)

Run python main.py (By default, it uses the Paddle ONNX model. Modify the comments/code if you want to use ResNet18).

Due to potential issues with trajectory generation, verification might succeed locally but fail on the target system. It is recommended to increase the number of retry attempts. The trained Paddle model achieves an accuracy above 99.9%.
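
One way to add retries on the caller's side is a small wrapper around whatever function requests a solution. This is a sketch, not code from the repository: `solve_with_retry` is a hypothetical helper, and it assumes the wrapped callable returns None on failure, like the `game_captcha` example in the API call section. Whether a fresh `gt`/`challenge` pair is needed for each attempt depends on the target system, so the callable is expected to obtain its own parameters.

```python
import time
from typing import Callable, Optional


def solve_with_retry(solve: Callable[[], Optional[str]],
                     attempts: int = 3,
                     delay: float = 1.0) -> Optional[str]:
    """Call solve() up to `attempts` times; return the first non-None result.

    Assumption: solve() returns the validate string on success and
    None on failure.
    """
    for attempt in range(1, attempts + 1):
        validate = solve()
        if validate is not None:
            return validate
        if attempt < attempts:
            # Short pause between attempts to avoid hammering the service.
            time.sleep(delay)
    return None
```

A call might look like `solve_with_retry(lambda: game_captcha(gt, challenge), attempts=5)`, with the lambda replaced by something that fetches new parameters if the old ones cannot be reused.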

6. API Call

Example Python call:

import httpx

def game_captcha(gt: str, challenge: str):
    try:
        res = httpx.get("http://127.0.0.1:9645/pass_nine", params={'gt': gt, 'challenge': challenge, 'use_v3_model': True}, timeout=10)
        res.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)
        datas = res.json().get('data', {})
        if datas.get('result') == 'success':
            return datas.get('validate')
    except httpx.RequestError as exc:
        print(f"An error occurred while requesting {exc.request.url!r}: {exc}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

    return None # Returns None on failure, 'validate' string on success
