@@ -41,13 +41,13 @@
 " 2. We use the instruction-finetuned LLM to generate multiple responses and have LLMs rank them based on given preference criteria\n",
 " 3. We use an LLM to generate preferred and dispreferred responses given certain preference criteria\n",
 "- In this notebook, we consider approach 3\n",
-"- This notebook uses a 70 billion parameter Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
+"- This notebook uses a 70 billion parameters Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
 "- The expected format of the instruction dataset is as follows:\n",
 "\n",
 "\n",
 "### Input\n",
 "\n",
-"```python\n",
+"```json\n",
 "[\n",
 " {\n",
 " \"instruction\": \"What is the state capital of California?\",\n",
@@ -71,7 +71,7 @@
 "\n",
 "The output dataset will look as follows, where more polite responses are preferred (`'chosen'`), and more impolite responses are dispreferred (`'rejected'`):\n",
 "\n",
-"```python\n",
+"```json\n",
 "[\n",
 " {\n",
 " \"instruction\": \"What is the state capital of California?\",\n",
@@ -98,7 +98,7 @@
 "]\n",
 "```\n",
 "\n",
-"### Ouput\n",
+"### Output\n",
 "\n",
 "\n",
 "\n",
@@ -135,7 +135,7 @@
 "id": "8bcdcb34-ac75-4f4f-9505-3ce0666c42d5",
 "metadata": {},
 "source": [
-"## Installing Ollama and Downloading Llama 3"
+"## Installing Ollama and Downloading Llama 3.1"
 ]
 },
 {
@@ -353,7 +353,7 @@
 "source": [
 "from pathlib import Path\n",
 "\n",
-"json_file = Path(\"..\") / \"01_main-chapter-code\" / \"instruction-data.json\"\n",
+"json_file = Path(\"..\", \"01_main-chapter-code\", \"instruction-data.json\")\n",
 "\n",
 "with open(json_file, \"r\") as file:\n",
 " json_data = json.load(file)\n",
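The path change in this hunk is purely stylistic: chaining path segments with `pathlib`'s `/` operator and passing all segments to the `Path` constructor build identical `Path` objects. A minimal sketch, using only the filenames that appear in the diff:

```python
from pathlib import Path

# Old spelling in the diff: chain segments with the "/" operator
p_old = Path("..") / "01_main-chapter-code" / "instruction-data.json"

# New spelling: pass all segments to the Path constructor at once
p_new = Path("..", "01_main-chapter-code", "instruction-data.json")

# Both produce the same Path object, so the change does not alter behavior
assert p_old == p_new
```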
@@ -498,7 +498,7 @@
 "metadata": {},
 "source": [
 "- If we find that the generated responses above look reasonable, we can go to the next step and apply the prompt to the whole dataset\n",
-"- Here, we add a `'chosen`' key for the preferred response and a `'rejected'` response for the dispreferred response"
+"- Here, we add a `'chosen'` key for the preferred response and a `'rejected'` response for the dispreferred response"
 ]
 },
 {
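The `'chosen'`/`'rejected'` convention fixed in this last hunk can be sketched as follows. The two response strings below are hypothetical stand-ins for what the Llama 3.1 model would generate; only the key names (`instruction`, `output`, `chosen`, `rejected`) and the politeness criterion come from the notebook text shown in the diff:

```python
# Hypothetical illustration of the preference-dataset entry format:
# each instruction entry keeps its original fields and gains a
# 'chosen' (preferred, here: more polite) and a 'rejected'
# (dispreferred, here: more impolite) response.
entry = {
    "instruction": "What is the state capital of California?",
    "output": "The state capital of California is Sacramento.",
}

# Stand-ins for the two LLM-generated responses (not real model output)
polite_response = "The state capital of California is Sacramento, thank you for asking!"
impolite_response = "Figure it out yourself; it's Sacramento."

entry["chosen"] = polite_response      # more polite  -> preferred
entry["rejected"] = impolite_response  # more impolite -> dispreferred

# The resulting entry has exactly the four keys shown in the Output example
assert set(entry) == {"instruction", "output", "chosen", "rejected"}
```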