Tropical Algae's Blog
467 words
2 minutes
paddleOCR模型微调
LINXIAXING
/
PaddleOCR
Waiting for api.github.com...
00K
0K
0K
Waiting...

Repository forked from PaddlePaddle/PaddleOCR


Document#

DescriptionURL
环境搭建https://github.com/LINXIAXING/PaddleOCR/blob/main/doc/doc_ch/environment.md
模型集合https://github.com/LINXIAXING/PaddleOCR/blob/main/doc/doc_ch/models_list.md
OCR数据集https://github.com/LINXIAXING/PaddleOCR/blob/main/doc/doc_ch/dataset/ocr_datasets.md
表格数据集https://github.com/LINXIAXING/PaddleOCR/blob/main/doc/doc_ch/dataset/table_datasets.md
版面分析数据集https://github.com/LINXIAXING/PaddleOCR/blob/main/doc/doc_ch/dataset/layout_datasets.md
数据标注https://github.com/LINXIAXING/PaddleOCR/blob/main/PPOCRLabel/README.md
PP-Structurehttps://github.com/LINXIAXING/PaddleOCR/blob/main/ppstructure/docs/quickstart.md

环境搭建#

python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
git clone https://github.com/PaddlePaddle/PaddleOCR.git
cd ./PaddleOCR
python install -r ./requirements.txt

或者通过docker pull paddleocr:laest获取最新的PaddleOCR镜像

数据集准备#

使用公开的数据集或通过PaddleOCR提供的打标工具制作数据集:

pip install -r ./PPOCRLabel/requirements.txt
python ./PPOCRLabel/PPOCRLabel.py --lang=ch

Dataset for OCR#

数据集树状目录:

/dataset
├─det
│  ├─test_imgs
│  ├─train_imgs
│  ├─filestate.txt
│  ├─test_label.txt
│  ├─train_label.txt
│  └─Cache.cach
└─rec
    ├─crop_img_test
    ├─crop_img_train
    ├─rec_gt_test.txt
    └─rec_gt_train.txt

det数据label格式:

<file_path> [{"transcription": "", "point": [[0, 0], [0, 1], [1, 0], [1, 1]], "difficult": false}, ]

rec数据label格式:

<file_path> <ground_truth>
TIP

数据不支持中文,且label中file_path中取相对位置,这意味着你或许需要对数据进行再处理。

Dataset for PP-Structure#

PASS

训练配置#

配置文件路径:/PaddleOCR/configs

NOTE

OCR实测DB模型/PaddleOCR/configs/det/det_mv3_db.yml+CRNN模型/PaddleOCR/configs/rec/rec_r34_vd_none_bilstm_ctc.yml模型效果较佳。

Table OCR模型使用SLANet模型/PaddleOCR/configs/table/SLANet_ch.yml效果较佳。

模型训练#

单机单卡训练:

python tools/train.py -c <config.yml>

单机多卡训练:

python -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py \ 
-c <config.yml>

多机多卡训练:

python -m paddle.distributed.launch --ips="xx.xx.xx.xx,xx.xx.xx.xx" \
--gpus '0,1,2,3' tools/train.py -c <config.yml>

model_training

模型评估#

python tools/eval.py -c <config.yml>  -o Global.checkpoints="{path/to/weights}/best_accuracy"

评估同样可以多机多卡,参考训练命令。

模型部署使用#

导出模型:

python tools/export_model.py -c <config.yml> -o \
Global.pretrained_model="./output/db_mv3/best_accuracy" \
Global.save_inference_dir="./inference/det_db_inference/"

替换模型权重:

默认的权重加载/下载路径在~/.paddleocr/下,将导出的模型替换到对应文件夹下再进行测试。

img_path = './test.jpg'
ocr = PaddleOCR(use_angle_cls=True, use_gpu=False,
lang="ch")  # need to run only once to download and load model into memory
result = ocr.ocr(img_path, det=True, cls=False, rec=True)
# 预测结果包含识别BOX的位置信息,文字以及置信度
# [[[x, y]]*4, (word, confidence)]
for line in result[0]:
	print(line)

image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result[0]]
txts = [line[1][0] for line in result[0]]
scores = [line[1][1] for line in result[0]]
im_show = draw_ocr(image, boxes, txts, scores)
im_show = Image.fromarray(im_show)
im_show.save('./result.jpg')
paddleOCR模型微调
https://tropicalalgae.cn/posts/paddle/paddleocr/
Author
Tropical algae
Published at
2024-05-09