mirror of
https://github.com/blackboxprogramming/lucidia.git
synced 2026-03-17 08:57:17 -05:00
13 lines
281 B
Markdown
# Dataset Schemas
## Pretraining Dataset
- Input text: Raw text for language modeling.
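
A minimal sketch of reading one pretraining record, assuming the data is stored as JSON Lines and the input text lives under a key named `text`. Both the storage format and the key name are assumptions for illustration; the schema above names only the field, not its encoding.

```python
import json


def parse_pretraining_record(line: str) -> str:
    """Parse one JSON Lines record and return its raw text."""
    record = json.loads(line)
    if not isinstance(record, dict):
        raise ValueError("pretraining record must be a JSON object")
    text = record.get("text")  # "text" is an assumed key name
    if not isinstance(text, str) or not text:
        raise ValueError("pretraining record needs a non-empty 'text' string")
    return text


# Hypothetical sample record, serialized the way a JSONL file would store it.
sample = json.dumps({"text": "The quick brown fox jumps over the lazy dog."})
print(parse_pretraining_record(sample))
```
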
## SFT Dataset
- Instruction: User instruction text.
- Response: Assistant response text.
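
One way to represent an SFT example in code is a small dataclass with a validating constructor. The names `SFTExample` and `sft_from_dict`, and the lowercase keys `instruction` and `response`, are hypothetical and chosen to mirror the two fields listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTExample:
    instruction: str  # user instruction text
    response: str     # assistant response text


def sft_from_dict(record: dict) -> SFTExample:
    """Build an SFTExample, rejecting records with missing or non-string fields."""
    bad = [k for k in ("instruction", "response")
           if not isinstance(record.get(k), str)]
    if bad:
        raise ValueError(f"SFT record has missing or non-string fields: {bad}")
    return SFTExample(record["instruction"], record["response"])


# Hypothetical record for illustration.
example = sft_from_dict({
    "instruction": "Summarize the paragraph in one sentence.",
    "response": "The paragraph describes three dataset schemas.",
})
print(example.instruction)
```
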
## RLHF Pairs
- Chosen: Preferred assistant response.
- Rejected: Less preferred assistant response.
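
A preference pair can be sketched the same way. The names `PreferencePair` and `pair_from_dict` and the keys `chosen` and `rejected` are hypothetical. The check that the two responses differ is an added assumption, not something the schema above requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    chosen: str    # preferred assistant response
    rejected: str  # less preferred assistant response


def pair_from_dict(record: dict) -> PreferencePair:
    """Build a PreferencePair, rejecting malformed or degenerate records."""
    bad = [k for k in ("chosen", "rejected")
           if not isinstance(record.get(k), str)]
    if bad:
        raise ValueError(f"pair has missing or non-string fields: {bad}")
    # Assumption: identical responses carry no preference signal, so reject them.
    if record["chosen"] == record["rejected"]:
        raise ValueError("chosen and rejected responses are identical")
    return PreferencePair(record["chosen"], record["rejected"])


# Hypothetical record for illustration.
pair = pair_from_dict({
    "chosen": "Paris is the capital of France.",
    "rejected": "France has many cities.",
})
print(pair.chosen)
```
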