CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation Paper • 2505.24456 • Published May 30, 2025
AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text Paper • 2503.18247 • Published Mar 24, 2025
Afri-MCQA: Multimodal Cultural Question Answering for African Languages Paper • 2601.05699 • Published Jan 9 • 3
Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization Approaches Paper • 2508.21512 • Published Aug 29, 2025
Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages Paper • 2603.23654 • Published 29 days ago
AfrIFact: Cultural Information Retrieval, Evidence Extraction and Fact Checking for African Languages Paper • 2604.00706 • Published 22 days ago
LLM2Vec-Gen: Generative Embeddings from Large Language Models Paper • 2603.10913 • Published Mar 11 • 44
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published Jan 29 • 11
SindBERT, the Sailor: Charting the Seas of Turkish NLP Paper • 2510.21364 • Published Oct 24, 2025 • 1
The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models Paper • 2510.13996 • Published Oct 15, 2025 • 9
Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian Paper • 2509.05668 • Published Sep 6, 2025 • 8
Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification Paper • 2505.24713 • Published May 30, 2025
MasakhaNER: Named Entity Recognition for African Languages Paper • 2103.11811 • Published Mar 22, 2021