NewsBERT post-1900 LoRA adapter (3 epochs)

A LoRA adapter for TextMachineProject/NewsBERT_1800-1920, fine-tuned for three epochs on post-1900 newspaper text from the Heritage Made Digital (HMD14) and Living with Machines (LwM) collections.

Training details

  • Period: post-1900
  • Base model: TextMachineProject/NewsBERT_1800-1920
  • Method: LoRA (PEFT), target modules: query, value, word_embeddings
  • LoRA rank: 16, alpha: 32, dropout: 0.05
  • Task: Masked Language Modelling (15% masking probability)
  • Sequence length: 128 tokens (sliding window, stride 96)
  • Epochs: 3
  • Batch size: 256
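
The settings above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's training script. The dictionary keys follow the argument names of peft.LoraConfig. The names LORA_KWARGS, LORA_SCALE and sliding_windows are hypothetical. The card does not say whether "stride 96" means the step between window starts or the number of overlapping tokens; the sketch assumes the former, which gives a 32-token overlap. Special tokens such as [CLS] and [SEP] are ignored here.

```python
# Hyperparameters copied from the list above; keys match peft.LoraConfig arguments.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["query", "value", "word_embeddings"],
}
# With peft installed, these could be passed on as: peft.LoraConfig(**LORA_KWARGS)

# Standard LoRA scales the low-rank update by alpha / r.
LORA_SCALE = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]

MAX_LEN = 128
STEP = 96  # ASSUMPTION: "stride" read as the step between window starts


def sliding_windows(token_ids, max_len=MAX_LEN, step=STEP):
    """Split a list of token ids into overlapping windows of at most max_len."""
    if not token_ids:
        return []
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(token_ids):
            break
        start += step
    return windows


if __name__ == "__main__":
    # A dummy 300-token document stands in for real tokenizer output.
    chunks = sliding_windows(list(range(300)))
    print(LORA_SCALE, [len(c) for c in chunks])
```

If "stride 96" instead means a 96-token overlap, as the stride argument does in Hugging Face tokenizers, the equivalent call would be sliding_windows(ids, step=32).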

Usage

from transformers import AutoTokenizer, AutoModelForMaskedLM
from peft import PeftModel

# Load the base masked-language model and its tokenizer.
base = AutoModelForMaskedLM.from_pretrained("TextMachineProject/NewsBERT_1800-1920")
tokenizer = AutoTokenizer.from_pretrained("TextMachineProject/NewsBERT_1800-1920")

# Attach the post-1900 LoRA adapter weights to the base model.
model = PeftModel.from_pretrained(base, "TextMachineProject/NewsBERT_post_1900_lora_3epochs")
model.eval()  # inference mode (disables dropout)