Manga109 YOLO Segmentation Dataset

The Manga109 dataset converted to the YOLO instance segmentation format, ready for training with Ultralytics YOLO.

📊 Dataset Information

Metric	Value
Total Images	10,147
Train Images	8,204
Val Images	1,943
Classes	6
Image Size	1654 × 1170 pixels
Manga Titles	109

⚠️ Note: 472 images without annotations (cover pages, tables of contents) were removed to keep the training data clean.
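A filter of this kind can be sketched as follows. This is not the script used to build the dataset, only a minimal illustration that assumes each image has a same-named `.txt` label file and that "no annotations" means a missing or empty label file. The function name `find_unannotated` and the list of image extensions are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def find_unannotated(images_dir, labels_dir):
    """Return image paths whose label file is missing or contains no objects."""
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    unannotated = []
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # YOLO convention: labels/<split>/<stem>.txt pairs with images/<split>/<stem>.<ext>
        label_path = labels_dir / f"{image_path.stem}.txt"
        if not label_path.exists() or not label_path.read_text().strip():
            unannotated.append(image_path)
    return unannotated
```

Running it on `images/train` and `labels/train` of the released dataset should return an empty list if the filtering described above was applied.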

🏷️ Classes

ID	Name	Description
0	frame	Manga panel frames
1	text	Text/dialogue
2	face	Character faces
3	body	Character bodies
4	balloon	Speech balloons
5	onomatopoeia	Sound effects
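To show how these IDs appear in the label files, here is a minimal sketch that parses one line of a YOLO segmentation label (`class x1 y1 x2 y2 ...`, with coordinates normalized to 0..1) into a class name and a pixel-space polygon. The helper name `parse_label_line` is hypothetical, and the default width and height assume the 1654 × 1170 page size listed above.

```python
CLASS_NAMES = ["frame", "text", "face", "body", "balloon", "onomatopoeia"]


def parse_label_line(line, width=1654, height=1170):
    """Parse one YOLO segmentation label line into (class_name, pixel polygon)."""
    fields = line.split()
    coords = [float(value) for value in fields[1:]]
    # A polygon needs at least 3 points, i.e. 6 numbers, in (x, y) pairs
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("expected at least 3 (x, y) pairs after the class ID")
    class_name = CLASS_NAMES[int(fields[0])]
    # Scale normalized coordinates back to pixels
    polygon = [(x * width, y * height) for x, y in zip(coords[0::2], coords[1::2])]
    return class_name, polygon
```

For example, `parse_label_line("4 0.5 0.5 1.0 0.5 1.0 1.0")` yields a `balloon` polygon with three vertices.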

📈 Class Distribution

Class	Train Count	Percentage
body	~128,000	21.9%
text	~120,500	20.6%
balloon	~105,600	18.1%
face	~96,900	16.5%
frame	~85,300	14.5%
onomatopoeia	~47,400	8.4%

⚠️ Class Imbalance: body has ~2.6 times as many instances as onomatopoeia. Consider class weighting when training.
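One common way to derive class weights is inverse frequency. The sketch below uses the approximate train counts from the table above, so the resulting weights are approximate too. The function name `inverse_frequency_weights` is hypothetical, and how the weights are applied to the loss depends on the training setup and is not shown here.

```python
# Approximate train counts copied from the Class Distribution table
TRAIN_COUNTS = {
    "body": 128_000,
    "text": 120_500,
    "balloon": 105_600,
    "face": 96_900,
    "frame": 85_300,
    "onomatopoeia": 47_400,
}


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count).

    A perfectly balanced dataset gives every class a weight of 1.0;
    rarer classes get weights above 1.0.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * count) for name, count in counts.items()}


if __name__ == "__main__":
    for name, weight in inverse_frequency_weights(TRAIN_COUNTS).items():
        print(f"{name:>12}: {weight:.2f}")
```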

📁 Dataset Structure

manga109_yolo/
├── images/
│   ├── train/          # 8,204 images
│   └── val/            # 1,943 images
├── labels/
│   ├── train/          # YOLO format labels
│   └── val/
├── data.yaml           # YOLO config
└── README.md           # This file
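For orientation, the sketch below builds the text of a `data.yaml` that matches this layout, using the standard Ultralytics dataset keys (`path`, `train`, `val`, `names`). The `data.yaml` shipped with the dataset may differ in detail. The helper name `build_data_yaml` is hypothetical, and the dataset root is assumed to be a plain path that needs no YAML quoting.

```python
CLASS_NAMES = ["frame", "text", "face", "body", "balloon", "onomatopoeia"]


def build_data_yaml(dataset_root):
    """Return data.yaml text for the directory layout shown above."""
    lines = [
        f"path: {dataset_root}",  # dataset root directory
        "train: images/train",    # relative to path
        "val: images/val",        # relative to path
        "names:",
    ]
    # Class IDs must match the IDs used in the label files
    lines += [f"  {index}: {name}" for index, name in enumerate(CLASS_NAMES)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_data_yaml("manga109_yolo"), end="")
```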

🚀 Usage with YOLO

Basic Training

from ultralytics import YOLO

# Load model
model = YOLO('yolo11n-seg.pt')

# Train
model.train(
    data='path/to/data.yaml',
    epochs=100,
    imgsz=640,
    batch=16,
)

Recommended Config for Manga

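from ultralytics import YOLO

# Load a pretrained segmentation checkpoint, as in Basic Training
model = YOLO('yolo11n-seg.pt')

model.train(
    data='path/to/data.yaml',
    epochs=150,
    imgsz=640,
    batch=16,
    patience=50,    # Stop early after 50 epochs without improvement

    # Dense pages (avg 68 objects/image, max 545)
    max_det=600,    # Maximum detections kept per image
    iou=0.5,        # IoU threshold for NMS

    # Manga-specific augmentation
    flipud=0.0,     # No vertical flip (text would end up upside down)
    mosaic=0.8,
    mixup=0.0,      # No mixup (blending pages would corrupt text/balloons)
)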
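To check whether a given `max_det` leaves enough headroom, the number of objects per image can be counted directly from the label files. The sketch below assumes one object per non-empty line in each `.txt` file. The function names `objects_per_image` and `summarize` are hypothetical.

```python
from pathlib import Path


def objects_per_image(labels_dir):
    """Return the object count (non-empty lines) for every .txt label file."""
    counts = []
    for label_path in sorted(Path(labels_dir).glob("*.txt")):
        lines = label_path.read_text().splitlines()
        counts.append(sum(1 for line in lines if line.strip()))
    return counts


def summarize(counts):
    """Return the mean and maximum number of objects per image."""
    if not counts:
        raise ValueError("no label files found")
    return {"mean": sum(counts) / len(counts), "max": max(counts)}
```

Calling `summarize(objects_per_image("manga109_yolo/labels/train"))` then gives the figures to compare against `max_det`.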

⚠️ Important Notes

Dense Annotations: Average 68 objects/image, max 545. Tune max_det accordingly.
Class Imbalance: Consider class weighting for onomatopoeia.
Train/Val Split: Split by manga title (not by image) to avoid data leakage.
Academic Use Only: Manga109 license restricts commercial use.
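A title-level split as described in the notes above can be sketched like this. It is not the procedure used to produce the released split. It assumes, hypothetically, that each file name consists of the manga title, an underscore, and a page index (for example `SomeTitle_003.jpg`). The function name `split_by_title`, the 20% validation fraction, and the seed are likewise illustrative.

```python
import random
from collections import defaultdict


def split_by_title(filenames, val_fraction=0.2, seed=0):
    """Split file names into train/val so that no title appears in both."""
    groups = defaultdict(list)
    for name in filenames:
        # Assumed naming scheme: "<title>_<page>.<ext>"
        title = name.rsplit("_", 1)[0]
        groups[title].append(name)

    titles = sorted(groups)
    random.Random(seed).shuffle(titles)  # seeded for a reproducible split
    num_val = max(1, round(len(titles) * val_fraction))
    val_titles = set(titles[:num_val])

    train = [n for t in titles if t not in val_titles for n in groups[t]]
    val = [n for t in titles if t in val_titles for n in groups[t]]
    return train, val
```

Because whole titles go to one side, pages that share characters and drawing style never straddle the train/val boundary, which is the leakage the note warns about.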

📜 License

This dataset is derived from the Manga109 dataset. Manga109 is provided for academic purposes only.

⚠️ Commercial use is NOT permitted.

Please refer to the original Manga109 license: http://www.manga109.org/

📚 Citation

@inproceedings{manga109,
  title={Manga109 dataset and creation of metadata},
  author={Fujimoto, Azuma and others},
  booktitle={International Workshop on coMics ANalysis, Processing and Understanding (MANPU)},
  year={2016}
}

@article{manga109_2020,
  title={Building a Manga Dataset "Manga109" with Annotations for Multimedia Applications},
  author={Aizawa, Kiyoharu and others},
  journal={IEEE MultiMedia},
  year={2020}
}

🔗 Links

Original Manga109: http://www.manga109.org/
COCO Format: https://cocodataset.org/#format-data
Ultralytics YOLO: https://docs.ultralytics.com/

Dataset converted on January 4, 2026
