Manga109 YOLO Segmentation Dataset

The Manga109 dataset, converted to YOLO instance segmentation format and ready for training with Ultralytics YOLO.

📊 Dataset Information

| Metric | Value |
|---|---|
| Total Images | 10,147 |
| Train Images | 8,204 |
| Val Images | 1,943 |
| Classes | 6 |
| Image Size | 1654 × 1170 pixels |
| Manga Titles | 109 |

⚠️ Note: 472 images without annotations (cover pages, tables of contents) were removed to ensure training quality.

🏷️ Classes

| ID | Name | Description |
|---|---|---|
| 0 | frame | Manga panel frames |
| 1 | text | Text/dialogue |
| 2 | face | Character faces |
| 3 | body | Character bodies |
| 4 | balloon | Speech balloons |
| 5 | onomatopoeia | Sound effects |
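
In an Ultralytics dataset config, class IDs like these normally appear under a `names` key next to the `train` and `val` image paths. The sketch below generates such a file from the table above. The `root` path and the `images/train` / `images/val` values are assumptions based on the directory layout of this dataset; the `data.yaml` shipped with it may differ in detail.

```python
# Class IDs and names exactly as listed in the class table.
CLASS_NAMES = {
    0: "frame",
    1: "text",
    2: "face",
    3: "body",
    4: "balloon",
    5: "onomatopoeia",
}


def build_data_yaml(root: str = "manga109_yolo") -> str:
    """Return data.yaml text in the usual Ultralytics dataset-config layout.

    `root` is a placeholder path, not a value taken from the shipped file.
    """
    lines = [
        f"path: {root}",
        "train: images/train",
        "val: images/val",
        "names:",
    ]
    # One "  <id>: <name>" entry per class, in ID order.
    lines += [f"  {idx}: {name}" for idx, name in sorted(CLASS_NAMES.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_data_yaml())
```

Comparing this output with the real `data.yaml` is a quick way to confirm that the class order matches before training.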

📈 Class Distribution

| Class | Train Count | Percentage |
|---|---|---|
| body | ~128,000 | 21.9% |
| text | ~120,500 | 20.6% |
| balloon | ~105,600 | 18.1% |
| face | ~96,900 | 16.5% |
| frame | ~85,300 | 14.5% |
| onomatopoeia | ~47,400 | 8.4% |

⚠️ Class Imbalance: body is ~2.6 times as frequent as onomatopoeia. Consider using class weighting when training.
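
One common way to derive class weights is inverse frequency. The sketch below computes such weights from the approximate training counts in the table. It is only an illustration: the weighting formula is an assumption, not something this dataset prescribes, and wiring the weights into a loss depends on your training code.

```python
# Approximate training-set instance counts from the class distribution table.
TRAIN_COUNTS = {
    "body": 128_000,
    "text": 120_500,
    "balloon": 105_600,
    "face": 96_900,
    "frame": 85_300,
    "onomatopoeia": 47_400,
}


def inverse_frequency_weights(counts: dict[str, int]) -> dict[str, float]:
    """Weight each class by total / (num_classes * count).

    A class with exactly the average count gets weight 1.0;
    rarer classes get proportionally larger weights.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * c) for name, c in counts.items()}


if __name__ == "__main__":
    for name, weight in inverse_frequency_weights(TRAIN_COUNTS).items():
        print(f"{name:>13}: {weight:.2f}")
```

With these counts, onomatopoeia receives the largest weight and body the smallest, mirroring the imbalance noted above.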

📁 Dataset Structure

manga109_yolo/
├── images/
│   ├── train/          # 8,204 images
│   └── val/            # 1,943 images
├── labels/
│   ├── train/          # YOLO format labels
│   └── val/
├── data.yaml           # YOLO config
└── README.md           # This file
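
Each file under `labels/` holds one object per line in YOLO segmentation format: a class ID followed by polygon vertices as normalized `x y` pairs. The sketch below parses one such line into pixel coordinates, assuming the 1654 × 1170 image size stated above. The function name and the sample line in the usage note are illustrative and not taken from the dataset.

```python
IMG_W, IMG_H = 1654, 1170  # image size stated in Dataset Information


def parse_seg_label(line: str, width: int = IMG_W, height: int = IMG_H):
    """Parse one 'class x1 y1 x2 y2 ...' line into (class_id, pixel polygon)."""
    parts = line.split()
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # A polygon needs at least three vertices, and coordinates come in pairs.
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("expected at least 3 (x, y) pairs")
    # Scale normalized [0, 1] coordinates to pixels.
    polygon = [
        (x * width, y * height)
        for x, y in zip(coords[0::2], coords[1::2])
    ]
    return class_id, polygon
```

For example, `parse_seg_label("4 0.5 0.5 1.0 0.5 1.0 1.0")` yields class 4 (balloon) with a triangle whose first vertex sits at the image center.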

🚀 Usage with YOLO

Basic Training

from ultralytics import YOLO

# Load model
model = YOLO('yolo11n-seg.pt')

# Train
model.train(
    data='path/to/data.yaml',
    epochs=100,
    imgsz=640,
    batch=16,
)

Recommended Config for Manga

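from ultralytics import YOLO

# Same pretrained checkpoint as in Basic Training
model = YOLO('yolo11n-seg.pt')

model.train(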
    data='path/to/data.yaml',
    epochs=150,
    imgsz=640,
    batch=16,
    patience=50,
    
    # Dense detection (avg 68 objects/image)
    max_det=600,
    iou=0.5,
    
    # Manga-specific augmentation
    flipud=0.0,     # Don't flip vertically (text will be upside down)
    mosaic=0.8,
    mixup=0.0,      # Don't mix (will corrupt text/balloons)
)
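
The `max_det` value above follows from the annotation density. If you filter or re-split the data, the density can be recomputed directly from the label files, as in this sketch. It assumes one object per non-empty line; the function names are illustrative.

```python
from pathlib import Path


def objects_per_image(label_dir: str) -> list[int]:
    """Count non-empty lines (one object each) in every .txt label file."""
    counts = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        with path.open(encoding="utf-8") as f:
            counts.append(sum(1 for line in f if line.strip()))
    return counts


def density_summary(counts: list[int]) -> tuple[float, int]:
    """Return (mean, max) objects per image; (0.0, 0) for an empty list."""
    if not counts:
        return 0.0, 0
    return sum(counts) / len(counts), max(counts)
```

Calling `density_summary(objects_per_image("manga109_yolo/labels/train"))` gives the mean and maximum for the training split; `max_det` should stay above the maximum.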

⚠️ Important Notes

  1. Dense Annotations: Average 68 objects/image, max 545. Tune max_det accordingly.
  2. Class Imbalance: Consider class weighting for onomatopoeia.
  3. Train/Val Split: Split by manga title (not by image) to avoid data leakage.
  4. Academic Use Only: Manga109 license restricts commercial use.
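
A title-level split, as mentioned in note 3, can be sketched as follows. This is a minimal illustration and not the procedure used to build this dataset: the validation fraction, the seed, and the `(title, filename)` input shape are all assumptions.

```python
import random


def split_by_title(titles: list[str], val_fraction: float = 0.2, seed: int = 0):
    """Partition manga titles into disjoint (train_titles, val_titles) sets."""
    unique = sorted(set(titles))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(unique)
    n_val = max(1, round(len(unique) * val_fraction))
    return set(unique[n_val:]), set(unique[:n_val])


def assign_pages(pages: list[tuple[str, str]], val_titles: set[str]):
    """Route (title, filename) pairs so all pages of a title share one split."""
    train, val = [], []
    for title, filename in pages:
        (val if title in val_titles else train).append(filename)
    return train, val
```

Because whole titles are assigned to one side, no page of a validation title is ever seen during training, which is what prevents leakage of an artist's style or recurring characters.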

📜 License

This dataset is derived from the Manga109 dataset, which is provided for academic purposes only.

⚠️ Commercial use is NOT permitted.

Please refer to the original Manga109 license: http://www.manga109.org/

📚 Citation

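@inproceedings{manga109,
  title={Manga109 dataset and creation of metadata},
  author={Fujimoto, Azuma and Ogawa, Toru and Yamamoto, Kazuyoshi and Matsui, Yusuke and Yamasaki, Toshihiko and Aizawa, Kiyoharu},
  booktitle={Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding (MANPU)},
  year={2016}
}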

@article{manga109_2020,
  title={Building a Manga Dataset "Manga109" with Annotations for Multimedia Applications},
  author={Aizawa, Kiyoharu and Fujimoto, Azuma and Otsubo, Atsushi and Ogawa, Toru and Matsui, Yusuke and Tsubota, Koki and Ikuta, Hikaru},
  journal={IEEE MultiMedia},
  year={2020}
}

Dataset converted on January 4, 2026
