lucidia-main/lucidia_llm/data/schemas.md
2025-08-08 01:18:53 -07:00

Dataset Schemas

Pretraining Dataset

  • Input text: Raw text for language modeling.
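As a minimal sketch of what one pretraining record could look like on disk, the snippet below serializes a single text field as a JSON Lines row. The field name `text`, the `PretrainingRecord` class, and the JSON Lines storage format are assumptions for illustration; the schema itself only specifies "Input text".

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PretrainingRecord:
    # Hypothetical field name; the schema only says "Input text".
    text: str


def to_jsonl_line(record: PretrainingRecord) -> str:
    """Serialize one record as a single JSON Lines row."""
    return json.dumps(asdict(record), ensure_ascii=False)


line = to_jsonl_line(PretrainingRecord(text="The quick brown fox."))
```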

SFT Dataset

  • Instruction: User instruction text.
  • Response: Assistant response text.
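One possible way to represent and validate an SFT example is sketched below. The key names `instruction` and `response`, the `SFTExample` class, and the `parse_sft_example` helper are hypothetical, chosen to mirror the two fields above; the non-empty check is likewise an assumption, not a stated requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTExample:
    # Hypothetical field names mirroring the two schema entries.
    instruction: str
    response: str


def parse_sft_example(row: dict) -> SFTExample:
    """Build an SFTExample from a dict, rejecting missing or blank fields."""
    for key in ("instruction", "response"):
        value = row.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"SFT row needs a non-empty string for {key!r}")
    return SFTExample(instruction=row["instruction"], response=row["response"])


example = parse_sft_example(
    {"instruction": "Translate 'hello' to French.", "response": "Bonjour."}
)
```

Rows that lack either field, or carry only whitespace, raise `ValueError` instead of silently entering the training set.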

RLHF Pairs

  • Chosen: Preferred assistant response.
  • Rejected: Less preferred assistant response.
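A preference pair could be loaded roughly as follows. This is a sketch under stated assumptions: the keys `chosen` and `rejected`, the JSON Lines input, and the `PreferencePair` and `read_pairs` names are all hypothetical. The schema as written lists no prompt field, so none is included here. Skipping pairs whose two responses are identical is also an assumption, on the grounds that such a pair expresses no preference.

```python
import json
from typing import Iterable, Iterator, NamedTuple


class PreferencePair(NamedTuple):
    # Hypothetical field names mirroring the two schema entries.
    chosen: str
    rejected: str


def read_pairs(lines: Iterable[str]) -> Iterator[PreferencePair]:
    """Yield preference pairs from JSON Lines rows, skipping unusable ones."""
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        row = json.loads(line)
        pair = PreferencePair(chosen=row["chosen"], rejected=row["rejected"])
        if pair.chosen == pair.rejected:
            continue  # assumption: identical responses carry no preference
        yield pair


sample = [
    '{"chosen": "Paris is the capital of France.", "rejected": "It is Lyon."}',
    "",
    '{"chosen": "Same answer.", "rejected": "Same answer."}',
]
pairs = list(read_pairs(sample))
```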