lucidia-main/lucidia_llm/data/schemas.md
2025-08-08 01:18:53 -07:00

# Dataset Schemas
## Pretraining Dataset
- Input text: Raw text for language modeling.
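
A minimal sketch of how one pretraining record might be represented and checked, assuming records are stored as one JSON object per line. The key name `input_text` and the helper `parse_pretraining_record` are hypothetical; the schema above only names the field "Input text".

```python
import json
from typing import TypedDict


class PretrainingRecord(TypedDict):
    # Hypothetical key name for the "Input text" field.
    input_text: str


def parse_pretraining_record(line: str) -> PretrainingRecord:
    """Parse one JSON line and check that it carries non-empty raw text."""
    obj = json.loads(line)
    text = obj.get("input_text") if isinstance(obj, dict) else None
    if not isinstance(text, str) or not text:
        raise ValueError("record must contain a non-empty 'input_text' string")
    return {"input_text": text}


record = parse_pretraining_record('{"input_text": "The quick brown fox."}')
```

Rejecting empty strings is a design choice of this sketch, not something the schema states.
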
## SFT Dataset
- Instruction: User instruction text.
- Response: Assistant response text.
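
As an illustration, the two SFT fields could be modeled as a small immutable record. This is a sketch under assumptions: the class name `SFTExample`, the lowercase keys `instruction` and `response`, and the `from_dict` helper are hypothetical and not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTExample:
    instruction: str  # user instruction text
    response: str     # assistant response text

    @classmethod
    def from_dict(cls, obj: dict) -> "SFTExample":
        """Build an example from a dict, requiring both fields to be strings."""
        bad = [k for k in ("instruction", "response")
               if not isinstance(obj.get(k), str)]
        if bad:
            raise ValueError(f"missing or non-string fields: {bad}")
        return cls(instruction=obj["instruction"], response=obj["response"])


example = SFTExample.from_dict(
    {"instruction": "Translate 'bonjour' to English.", "response": "Hello."}
)
```
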
## RLHF Pairs
- Chosen: Preferred assistant response.
- Rejected: Less preferred assistant response.
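
A preference pair could be sketched along the same lines. The names `PreferencePair` and `validate_pair` are hypothetical, and the rule that the two responses must differ is an assumption of this example (identical responses would carry no preference signal), not a requirement stated in the schema.

```python
from typing import NamedTuple


class PreferencePair(NamedTuple):
    chosen: str    # preferred assistant response
    rejected: str  # less preferred assistant response


def validate_pair(pair: PreferencePair) -> PreferencePair:
    """Return the pair unchanged if it looks usable, else raise ValueError."""
    if not pair.chosen or not pair.rejected:
        raise ValueError("both 'chosen' and 'rejected' must be non-empty")
    # Assumption: a pair with identical responses is treated as invalid.
    if pair.chosen == pair.rejected:
        raise ValueError("'chosen' and 'rejected' must differ")
    return pair


pair = validate_pair(
    PreferencePair(chosen="Paris is the capital of France.",
                   rejected="I am not sure.")
)
```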