A collection of resources for Galician Grammar Error Correction.
Collection of domain-specific corpora, mainly in Galician.
Collection of datasets in Galician for fine-tuning, instruction tuning or training purposes.
TTS models trained using the CoquiTTS Python library.
Experiments associated with the paper 'Continued Pretraining and Interpretability-Based Evaluation for Low-Resource Languages: A Galician Case Study'
Datasets for training and evaluation of ASR models.
Galician BERT based on the MrBERT model.
CorpusNÓS is the largest collection of Galician-language data for training LLMs.
Collection of datasets in Galician for LLM evaluation. It includes translations of existing datasets as well as datasets created by us.
Open Generative Large Language Models for Galician
Paper • 2406.13893
proxectonos/Carvalho-Salamandra-Instruct
Text Generation • 8B
Nos-PT/Llama-Carvalho-PT-GL
Text Generation • 8B
proxectonos/Llama-3.1-Carballo-Instr3
Text Generation • 8B
Automatic Speech Recognition models
Older MT models trained with older libraries and datasets.
Datasets for training and evaluation of TTS models.