mirror of
https://github.com/blackboxprogramming/lucidia.git
synced 2026-03-17 08:57:17 -05:00
Add data/schemas.md with dataset schema descriptions
committed by
GitHub
parent
69eb0ae00c
commit
fa4f69097f
12
lucidia_llm/data/schemas.md
Normal file
@@ -0,0 +1,12 @@
# Dataset Schemas
## Pretraining Dataset
- Input text: Raw text for language modeling.
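
As an illustration of the pretraining schema above, here is a minimal sketch of reading one record. It assumes records are stored as JSON Lines with a single `text` key; neither the storage format nor the field name is specified in the schema file, and `PretrainingRecord` and `parse_pretraining_line` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainingRecord:
    # Hypothetical field name standing in for "Input text" in the schema.
    text: str


def parse_pretraining_line(line: str) -> PretrainingRecord:
    """Parse one JSON Lines entry into a PretrainingRecord (assumed layout)."""
    obj = json.loads(line)
    if not isinstance(obj, dict):
        raise ValueError("pretraining record must be a JSON object")
    text = obj.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("pretraining record needs a non-empty 'text' string")
    return PretrainingRecord(text=text)


record = parse_pretraining_line('{"text": "Raw text for language modeling."}')
print(record.text)
```

Rejecting empty strings early is a design choice for the sketch, not something the schema requires.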
## SFT Dataset
- Instruction: User instruction text.
- Response: Assistant response text.
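
The two SFT fields above could be represented and flattened into a single training string roughly as follows. This is only a sketch: the `SFTExample` class, the `render_sft_example` helper, and the prompt template are assumptions made for illustration, since the schema file names the fields but not how they are stored or joined.

```python
from dataclasses import dataclass

# Assumed template; the schema does not define how the fields are combined.
DEFAULT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"


@dataclass(frozen=True)
class SFTExample:
    instruction: str  # "Instruction: User instruction text."
    response: str     # "Response: Assistant response text."


def render_sft_example(example: SFTExample, template: str = DEFAULT_TEMPLATE) -> str:
    """Combine instruction and response into one string for fine-tuning."""
    return template.format(
        instruction=example.instruction,
        response=example.response,
    )


example = SFTExample(instruction="Name a primary color.", response="Red.")
print(render_sft_example(example))
```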
## RLHF Pairs
- Chosen: Preferred assistant response.
- Rejected: Less preferred assistant response.
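
A preference pair with the two fields above might be modeled and sanity-checked as in the sketch below. `PreferencePair` is a hypothetical name, and the validation rules (both responses non-empty and different from each other) are assumptions rather than requirements stated in the schema file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    chosen: str    # "Chosen: Preferred assistant response."
    rejected: str  # "Rejected: Less preferred assistant response."

    def __post_init__(self) -> None:
        # Assumed checks: a pair carries no preference signal if either
        # side is empty or if both sides are identical.
        if not self.chosen.strip() or not self.rejected.strip():
            raise ValueError("chosen and rejected must both be non-empty")
        if self.chosen == self.rejected:
            raise ValueError("chosen and rejected must differ")


pair = PreferencePair(chosen="Paris is the capital of France.",
                      rejected="I don't know.")
print(pair.chosen)
```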