PP-YOLOE+ Hands-On and Comparison with YOLOv11 - PaddleDetection, COCO JSON Conversion, and Model Architecture Differences

Overview

Using the same Camera detection data as the YOLOv11 walkthrough (previous post), I trained Baidu's PP-YOLOE+ (PaddleDetection) and summarized how the two models differ.

YOLOv11 vs PP-YOLOE+: Overall Comparison

Item	YOLOv11 (Ultralytics)	PP-YOLOE+ (PaddleDetection)
Developer	Ultralytics	Baidu PaddlePaddle
Framework	PyTorch	PaddlePaddle
Backbone	CSPDarknet-style (C3k2 blocks)	CSPRepResNet
Neck	PANet	CSPPAN
Head	Decoupled Head	ET-Head (Task Aligned)
Loss	BCE (cls) + CIoU + DFL	VFL (cls) + GIoU + DFL
Pretrained weights	.pt file	.pdparams file
How to run	Python API (model.train())	CLI command or Python API
Configuration	Arguments passed directly to functions	YAML files
Annotation format	YOLO txt (cx cy w h)	COCO JSON (x_min y_min w h)
Evaluation metrics	mAP50, mAP50-95	AP, AP50, AP75 (COCO)

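Steps 1-4: Data Preparation (Same as YOLOv11)

Collecting the Camera class from Open Images V7 and converting it to YOLO format work exactly as in the previous walkthrough. If you already have the camera_yolo_v7/ folder, start from step 5.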

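Step 5: ⭐ Converting YOLO txt → COCO JSON

PaddleDetection's COCODataSet reads annotations in COCO JSON format, so YOLO's per-image .txt files have to be merged into a single JSON file per split.

YOLO vs COCO Annotations

Item	YOLO format	COCO JSON format
File layout	One .txt per image	One .json for the whole split
Coordinates	cx cy w h (normalized to 0-1)	x_min y_min width height (pixels)
Class info	Numeric index only	Names included in the categories list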
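
The coordinate row in the table is the part that is easiest to get wrong, so here is a minimal sketch of the conversion for a single box, in both directions. The function names are my own, chosen for illustration; they are not part of PaddleDetection or Ultralytics.

```python
def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Normalized YOLO (cx, cy, w, h) -> COCO pixel [x_min, y_min, width, height]."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


def coco_to_yolo_bbox(x_min, y_min, box_w, box_h, img_w, img_h):
    """COCO pixel [x_min, y_min, width, height] -> normalized YOLO (cx, cy, w, h)."""
    cx = (x_min + box_w / 2) / img_w
    cy = (y_min + box_h / 2) / img_h
    return [cx, cy, box_w / img_w, box_h / img_h]


# A box centred in a 640x480 image that covers half of each dimension.
coco_box = yolo_to_coco_bbox(0.5, 0.5, 0.5, 0.5, img_w=640, img_h=480)
print(coco_box)  # [160.0, 120.0, 320.0, 240.0]

# Converting back should return the original normalized values.
yolo_box = coco_to_yolo_bbox(*coco_box, img_w=640, img_h=480)
print(yolo_box)  # [0.5, 0.5, 0.5, 0.5]
```

The round trip is a cheap sanity check: if it does not return the original values, the image width and height have most likely been swapped.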

import json
import os

from PIL import Image

IMAGE_EXTS = ('.jpg', '.jpeg', '.png')


def yolo_txt_to_coco_json(image_dir, label_dir, output_json_path, classes):
    """Merge per-image YOLO .txt labels into a single COCO-style JSON file."""
    coco = {
        'images': [],
        'annotations': [],
        'categories': [{'id': i, 'name': name} for i, name in enumerate(classes)]
    }
    image_id = 0
    # Start at 1: pycocotools' COCOeval treats a match to ground-truth id 0 as "unmatched"
    annotation_id = 1

    for img_filename in sorted(os.listdir(image_dir)):
        if not img_filename.lower().endswith(IMAGE_EXTS):
            continue  # skip non-image files

        with Image.open(os.path.join(image_dir, img_filename)) as img:
            width, height = img.size

        coco['images'].append({
            'id': image_id,
            'file_name': img_filename,
            'width': width, 'height': height
        })

        label_path = os.path.join(label_dir, os.path.splitext(img_filename)[0] + '.txt')
        if os.path.exists(label_path):
            with open(label_path) as f:
                for line in f:
                    parts = line.split()
                    if len(parts) != 5:
                        continue  # skip blank or malformed lines
                    cls_id, cx, cy, bw, bh = map(float, parts)

                    # ⭐ YOLO → COCO coordinate conversion
                    x_min = (cx - bw / 2) * width    # normalized → pixels
                    y_min = (cy - bh / 2) * height
                    box_w = bw * width
                    box_h = bh * height

                    coco['annotations'].append({
                        'id': annotation_id,
                        'image_id': image_id,
                        'category_id': int(cls_id),
                        'bbox': [round(x_min, 2), round(y_min, 2), round(box_w, 2), round(box_h, 2)],
                        'area': round(box_w * box_h, 2),
                        'iscrowd': 0
                    })
                    annotation_id += 1
        image_id += 1

    # Create the output folder (e.g. annotations/) if it does not exist yet
    out_dir = os.path.dirname(output_json_path)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    with open(output_json_path, 'w') as f:
        json.dump(coco, f, indent=2)


# Convert train and val separately
for split in ['train', 'val']:
    yolo_txt_to_coco_json(
        image_dir=f'./camera_yolo_v7/images/{split}',
        label_dir=f'./camera_yolo_v7/labels/{split}',
        output_json_path=f'./camera_yolo_v7/annotations/instances_{split}.json',
        classes=['Camera']
    )
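
Before handing the JSON to PaddleDetection, it is worth checking that the file is internally consistent. The following is a minimal sketch of such a check; `validate_coco` and the sample dictionary are hypothetical, written for this post, and the rules it enforces are my own conservative choices rather than anything PaddleDetection requires.

```python
def validate_coco(coco, tol=0.01):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img['id']: img for img in coco['images']}
    category_ids = {cat['id'] for cat in coco['categories']}
    seen_ann_ids = set()

    for ann in coco['annotations']:
        if ann['id'] in seen_ann_ids:
            problems.append(f"duplicate annotation id {ann['id']}")
        seen_ann_ids.add(ann['id'])

        if ann['category_id'] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category {ann['category_id']}")

        img = images.get(ann['image_id'])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image {ann['image_id']}")
            continue

        x, y, w, h = ann['bbox']
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size")
        # tol absorbs the 2-decimal rounding done during conversion
        if x < -tol or y < -tol or x + w > img['width'] + tol or y + h > img['height'] + tol:
            problems.append(f"annotation {ann['id']}: box outside image bounds")
    return problems


# Illustrative data only: the second box sticks out past the right edge.
sample = {
    'images': [{'id': 0, 'file_name': 'a.jpg', 'width': 640, 'height': 480}],
    'categories': [{'id': 0, 'name': 'Camera'}],
    'annotations': [
        {'id': 1, 'image_id': 0, 'category_id': 0, 'bbox': [160, 120, 320, 240]},
        {'id': 2, 'image_id': 0, 'category_id': 0, 'bbox': [600, 100, 100, 50]},
    ],
}
problems = validate_coco(sample)
print(problems)  # ['annotation 2: box outside image bounds']
```

To run it on a converted file, load the JSON with `json.load` and pass the resulting dict to `validate_coco`.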

Step 6: ⭐ Creating the PP-YOLOE+ Config Files

PaddleDetection manages every training option through YAML config files.

PP-YOLOE+ Model Sizes

Model	Parameters	mAP (COCO, %)	Recommended VRAM
PP-YOLOE+_s	7.9M	43.7	4GB or more
PP-YOLOE+_m	23.4M	49.8	8GB or more ✅
PP-YOLOE+_l	52.2M	52.9	16GB or more
PP-YOLOE+_x	98.4M	54.7	24GB or more

# camera_coco.yml - dataset settings
metric: COCO
num_classes: 1

TrainDataset:
  name: COCODataSet
  image_dir: /path/to/camera_yolo_v7/images/train
  anno_path: /path/to/camera_yolo_v7/annotations/instances_train.json
  dataset_dir: /path/to/camera_yolo_v7
  data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  name: COCODataSet
  image_dir: /path/to/camera_yolo_v7/images/val
  anno_path: /path/to/camera_yolo_v7/annotations/instances_val.json

# ppyoloe_plus_crn_s_camera.yml - training settings
_BASE_: [
  '../runtime.yml',
  './_base_/optimizer_300e.yml',
  './_base_/ppyoloe_plus_crn.yml',
  './_base_/ppyoloe_plus_reader.yml',
  './camera_coco.yml',
]

snapshot_epoch: 5
epoch: 20
pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams

# Needed for the s-size network; these values match the upstream ppyoloe_plus_crn_s config
depth_mult: 0.33
width_mult: 0.50

PPYOLOEHead:
  num_classes: 1

TrainReader:
  batch_size: 16
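
The `_BASE_` list is what makes the short training file work: base files are loaded first, and the keys written in the child file override them. The sketch below illustrates that idea with a plain recursive dictionary merge. It is a simplified model I wrote for explanation, not PaddleDetection's actual loader, and the base values are illustrative.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`; later values win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections are merged key by key instead of being replaced wholesale
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative stand-ins for what the base files might define
base = {'epoch': 300, 'TrainReader': {'batch_size': 8, 'shuffle': True}}
# The overrides written in the child config
child = {'epoch': 20, 'TrainReader': {'batch_size': 16}}

final = merge_config(base, child)
print(final)  # {'epoch': 20, 'TrainReader': {'batch_size': 16, 'shuffle': True}}
```

Note how `shuffle` survives even though the child only mentions `batch_size`; that is why the training file can stay so short.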

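Step 7: ⭐ Running PP-YOLOE+ Training

import subprocess

train_cmd = [
    'python', 'tools/train.py',
    '-c', 'configs/ppyoloe/ppyoloe_plus_crn_s_camera.yml',
    '--eval',        # run validation during training
    '--amp',         # automatic mixed precision (faster on GPU, lower memory use)
    # Put every override after a single -o; a repeated -o can replace the earlier one
    '-o', 'use_gpu=True', 'save_dir=./output/camera_detection',
]
# Run from the PaddleDetection repository root; check=True raises if training fails
subprocess.run(train_cmd, check=True)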
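
One detail about the `-o` overrides is easy to miss. If the option is declared with argparse's `nargs='*'` (my assumption here, based on how ppdet/utils/cli.py defines it), then repeating `-o` does not accumulate values: the last occurrence wins. The small stand-alone parser below reproduces that behaviour with the standard library only.

```python
import argparse

parser = argparse.ArgumentParser()
# Assumption: mirrors the way PaddleDetection declares its override option
parser.add_argument('-o', '--opt', nargs='*')

# Two separate -o flags: the second silently replaces the first
repeated = parser.parse_args(
    ['-o', 'use_gpu=True', '-o', 'save_dir=./output/camera_detection'])

# One -o flag followed by all overrides: both are kept
grouped = parser.parse_args(
    ['-o', 'use_gpu=True', 'save_dir=./output/camera_detection'])

print(repeated.opt)  # ['save_dir=./output/camera_detection']
print(grouped.opt)   # ['use_gpu=True', 'save_dir=./output/camera_detection']
```

Under that assumption, listing every `key=value` pair after a single `-o` is the safer form for train.py, eval.py, and infer.py alike.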

Training Log Metrics Compared

Item	YOLOv11	PP-YOLOE+
Loss terms	box_loss, cls_loss, dfl_loss	loss_iou, loss_cls, loss_dfl, loss_l1
Metric names	mAP50, mAP50-95	AP50, AP (COCO)
Checkpoints	best.pt, last.pt	best_model.pdparams

Step 8: ⭐ Evaluation with COCO Metrics

python tools/eval.py \
    -c configs/ppyoloe/ppyoloe_plus_crn_s_camera.yml \
    -o weights=./output/camera_detection/best_model.pdparams use_gpu=True

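COCO Metrics Explained

Metric	Description	YOLOv11 equivalent
AP	Average precision averaged over IoU 0.50-0.95 (step 0.05)	mAP50-95
AP50	Average precision at IoU 0.5	mAP50
AP75	Average precision at IoU 0.75 (stricter)	-
APs	Small objects (area < 32² px)	-
APm	Medium objects (area 32² to 96² px)	-
APl	Large objects (area > 96² px)	-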
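
To make the table concrete, here is a small sketch of the two quantities behind these metrics: the IoU between a ground-truth box and a prediction, and the area-based size bucket. The helper names and the example boxes are mine, and the handling of the exact bucket boundaries is a simplification of what pycocotools does.

```python
def iou_xywh(a, b):
    """IoU of two boxes given in COCO [x_min, y_min, width, height] format."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def size_bucket(area):
    """COCO size category by box area in pixels (boundary handling simplified)."""
    if area < 32 ** 2:
        return 'small'
    if area <= 96 ** 2:
        return 'medium'
    return 'large'


# The ten IoU thresholds that COCO AP averages over: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

gt = [100, 100, 100, 100]
pred = [110, 110, 100, 100]   # same size, shifted by 10 px in x and y

iou = iou_xywh(gt, pred)
passed = [t for t in IOU_THRESHOLDS if iou >= t]
print(round(iou, 3))          # 0.681
print(passed)                 # [0.5, 0.55, 0.6, 0.65]
print(size_bucket(100 * 100)) # large
```

This prediction counts as a hit for AP50 but as a miss for AP75, which is why AP (the average over all ten thresholds) is always the harder number to raise.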

Step 9: ⭐ Inference (infer.py)

python tools/infer.py \
    -c configs/ppyoloe/ppyoloe_plus_crn_s_camera.yml \
    -o weights=./output/camera_detection/best_model.pdparams use_gpu=True \
    --infer_img /path/to/image.jpg \
    --output_dir ./infer_output \
    --draw_threshold 0.25 \
    --save_results=True  # also save the results as bbox.json

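Key Techniques in PP-YOLOE+

1. Task Aligned Learning (TAL)

TAL aligns the classification and regression (box) tasks so that the anchors with the highest class score are also the ones with the best-localized boxes. During label assignment, candidate anchors are ranked by a metric that combines class score and IoU, and the anchors that do well on both are preferentially used as positives. Older detectors assigned labels from geometry alone (IoU or center distance) and left the two tasks to be learned independently. Recent Ultralytics models, including YOLOv11, also use a task-aligned assigner, so this is common ground rather than a differentiator.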
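
The ranking step can be sketched in a few lines. The alignment metric is t = s^α · u^β, where s is the predicted class score and u is the IoU with the ground-truth box. The values α = 1.0 and β = 6.0 are the commonly cited defaults and should be treated as assumptions here, as should the toy candidates. A real assigner also restricts candidates to anchors that fall inside the ground-truth box, which this sketch leaves out.

```python
def alignment_metric(cls_score, iou, alpha=1.0, beta=6.0):
    """Task-alignment metric t = s^alpha * u^beta."""
    return (cls_score ** alpha) * (iou ** beta)


def select_positives(candidates, topk=2):
    """Return the names of the top-k candidates ranked by alignment metric."""
    ranked = sorted(
        candidates,
        key=lambda c: alignment_metric(c['score'], c['iou']),
        reverse=True,
    )
    return [c['name'] for c in ranked[:topk]]


# Toy candidates for one ground-truth box (illustrative numbers)
candidates = [
    {'name': 'A', 'score': 0.9, 'iou': 0.3},  # confident class, poor box
    {'name': 'B', 'score': 0.3, 'iou': 0.9},  # good box, weak class
    {'name': 'C', 'score': 0.7, 'iou': 0.8},  # good at both
]

positives = select_positives(candidates, topk=2)
print(positives)  # ['C', 'B']
```

Anchor A has the highest class score but is ranked last because its box is poor, which is exactly the behaviour the alignment metric is meant to produce.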

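2. Varifocal Loss (VFL)

A classification loss designed for the positive/negative imbalance in object detection, where most locations are background. VFL trains the classification branch to predict an IoU-aware score. Negatives are down-weighted by a focal term, so the abundant easy background contributes little, while positives are weighted by their target IoU, so high-quality positives count the most.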
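
For reference, the per-sample loss can be written out directly. With predicted score p and target q (the IoU for a positive, 0 for a negative), the loss is -q(q·log p + (1-q)·log(1-p)) when q > 0 and -α·p^γ·log(1-p) when q = 0. The sketch below uses α = 0.75 and γ = 2.0, the defaults from the VarifocalNet paper; whether a given PP-YOLOE+ config keeps those values is an assumption on my part.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    """Varifocal loss for one prediction.

    p: predicted IoU-aware classification score, in (0, 1)
    q: target score (IoU with the ground truth for positives, 0 for negatives)
    """
    eps = 1e-9
    p = min(max(p, eps), 1 - eps)  # keep log() finite
    if q > 0:
        # Positive: binary cross-entropy against q, weighted by q itself
        return -q * (q * math.log(p) + (1 - q) * math.log(1 - p))
    # Negative: focal-style down-weighting by p^gamma
    return -alpha * (p ** gamma) * math.log(1 - p)


easy_negative = varifocal_loss(p=0.1, q=0.0)   # background scored low
hard_negative = varifocal_loss(p=0.9, q=0.0)   # background scored high
low_quality_pos = varifocal_loss(p=0.5, q=0.3)
high_quality_pos = varifocal_loss(p=0.5, q=0.9)

print(easy_negative < hard_negative)        # True
print(low_quality_pos < high_quality_pos)   # True
```

The two comparisons show the asymmetry: easy negatives are almost ignored, and among positives the one with the higher target IoU carries more weight.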

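3. CSPRepResNet Backbone

A backbone that uses structural re-parameterization to get both accuracy and speed. During training each block runs several parallel branches. For inference those branches are folded into a single convolution, so the deployed network is simpler and faster while computing the same function.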
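
The folding step works because convolution is linear: a 1x1 branch is the same as a 3x3 kernel that is zero everywhere except the center, so the two kernels can simply be added. The pure-Python sketch below checks this on a single-channel 3x3 image. It is a toy illustration with made-up kernels; real blocks also fold BatchNorm and work across many channels, which is omitted here.

```python
def conv2d_same(image, kernel):
    """Single-channel 3x3 convolution (cross-correlation) with zero padding."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for ky in range(3):
                for kx in range(3):
                    iy, ix = y + ky - 1, x + kx - 1
                    if 0 <= iy < h and 0 <= ix < w:
                        total += image[iy][ix] * kernel[ky][kx]
            out[y][x] = total
    return out


def fuse_branches(k3x3, k1x1):
    """Fold a 1x1 branch (a scalar here) into the center of a 3x3 kernel."""
    fused = [row[:] for row in k3x3]
    fused[1][1] += k1x1
    return fused


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]]  # 3x3 branch
k1 = 0.5                                                   # 1x1 branch

# Training-time form: run both branches and add the results
branch3 = conv2d_same(image, k3)
two_branch = [[branch3[y][x] + k1 * image[y][x] for x in range(3)] for y in range(3)]

# Inference-time form: one convolution with the fused kernel
fused_out = conv2d_same(image, fuse_branches(k3, k1))

same = all(
    abs(two_branch[y][x] - fused_out[y][x]) < 1e-9
    for y in range(3) for x in range(3)
)
print(same)  # True
```

The fused form does one convolution instead of two, which is where the inference speedup comes from.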

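Augmentations Applied by Default

RandomDistort: random changes to hue, saturation, contrast, and brightness
RandomExpand / RandomCrop: random padding and cropping
RandomFlip: horizontal flip
BatchRandomResize: multi-scale training (the input size changes from batch to batch)
Mosaic (stitching four images together) and MixUp (blending two images) belong to the Ultralytics pipeline; as far as I can tell, the stock ppyoloe_plus_reader.yml does not enable them.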

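Which Model Should You Choose?

Situation	Recommended model
Fast prototyping, want a simple API	YOLOv11 (Ultralytics)
Already on PaddlePaddle, need fine-grained YAML control	PP-YOLOE+
Accuracy-first industrial use	PP-YOLOE+_l/x
Edge-device deployment	YOLOv11n / PP-YOLOE+_s
Community and documentation matter	YOLOv11 (more active community)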
