mirror of
https://github.com/blackboxprogramming/lucidia.git
synced 2026-03-17 09:37:56 -05:00
Dataset Schemas
Pretraining Dataset
- Input text: Raw text for language modeling.
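
As a minimal sketch of what one pretraining record might look like, the snippet below stores the input text in a JSON object, one per line (JSONL). The key name `input_text`, the JSONL layout, and the helper `is_valid_pretraining_record` are assumptions for illustration; the schema above names only the field, not its serialized form.

```python
import json

# Hypothetical key name: the schema only says "Input text".
record = {"input_text": "The quick brown fox jumps over the lazy dog."}


def is_valid_pretraining_record(rec: dict) -> bool:
    """Return True if rec holds a non-empty string under 'input_text'."""
    text = rec.get("input_text")
    return isinstance(text, str) and bool(text.strip())


# Serialize as one JSON object per line, then read it back and check it.
line = json.dumps(record)
assert is_valid_pretraining_record(json.loads(line))
```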
SFT Dataset
- Instruction: User instruction text.
- Response: Assistant response text.
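
The two SFT fields could be modeled as a small record type, sketched below. The class name `SFTExample`, the lowercase attribute names, and the non-empty check are illustrative assumptions, not something the schema specifies.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SFTExample:
    """One supervised fine-tuning pair (hypothetical container)."""

    instruction: str  # user instruction text
    response: str     # assistant response text

    def __post_init__(self):
        # Assumed rule: both fields must be non-empty strings.
        for name in ("instruction", "response"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")


example = SFTExample(
    instruction="Translate 'hello' to French.",
    response="Bonjour.",
)
row = asdict(example)  # plain dict, ready to serialize
```

Rejecting empty fields at construction time is one possible design choice; a pipeline that tolerates empty responses would drop that check.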
RLHF Pairs
- Chosen: Preferred assistant response.
- Rejected: Less preferred assistant response.
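
A preference pair with the two fields listed above might be represented as follows. The type name `PreferencePair`, the helper `make_pair`, and the rule that the two responses must differ are assumptions made for this sketch; the schema lists only the Chosen and Rejected fields.

```python
from typing import TypedDict


class PreferencePair(TypedDict):
    """One RLHF comparison (hypothetical type name)."""

    chosen: str    # preferred assistant response
    rejected: str  # less preferred assistant response


def make_pair(chosen: str, rejected: str) -> PreferencePair:
    """Build a pair, rejecting empty or identical responses (assumed rules)."""
    if not chosen.strip() or not rejected.strip():
        raise ValueError("both responses must be non-empty")
    if chosen == rejected:
        raise ValueError("chosen and rejected must differ")
    return {"chosen": chosen, "rejected": rejected}


pair = make_pair(
    chosen="Paris is the capital of France.",
    rejected="I am not sure.",
)
```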