|
2301 | 2301 | },
|
2302 | 2302 | {
|
2303 | 2303 | "cell_type": "code",
|
2304 |      | - "execution_count": null,
     | 2304 | + "execution_count": 1,
2305 | 2305 | "id": "026e8570-071e-48a2-aa38-64d7be35f288",
|
2306 | 2306 | "metadata": {
|
2307 | 2307 | "colab": {
|
|
2340 | 2340 | },
|
2341 | 2341 | {
|
2342 | 2342 | "cell_type": "code",
|
2343 |      | - "execution_count": null,
     | 2343 | + "execution_count": 2,
2344 | 2344 | "id": "723c9b00-e3cd-4092-83c3-6e48b5cf65b0",
|
2345 | 2345 | "metadata": {
|
2346 | 2346 | "id": "723c9b00-e3cd-4092-83c3-6e48b5cf65b0"
|
|
2384 | 2384 | },
|
2385 | 2385 | {
|
2386 | 2386 | "cell_type": "code",
|
2387 |      | - "execution_count": null,
     | 2387 | + "execution_count": 3,
2388 | 2388 | "id": "e3ae0e10-2b28-42ce-8ea2-d9366a58088f",
|
2389 | 2389 | "metadata": {
|
2390 | 2390 | "id": "e3ae0e10-2b28-42ce-8ea2-d9366a58088f",
|
|
2395 | 2395 | "name": "stdout",
|
2396 | 2396 | "output_type": "stream",
|
2397 | 2397 | "text": [
|
2398 |      | - "Llamas are ruminant animals, which means they have a four-chambered stomach that allows them to digest plant-based foods. Their diet typically consists of:\n",
     | 2398 | + "Llamas are herbivores, which means they primarily feed on plant-based foods. Their diet typically consists of:\n",
2399 | 2399 | "\n",
|
2400 |      | - "1. Grasses: Llamas love to graze on grasses, including tall grasses, short grasses, and even weeds.\n",
2401 |      | - "2. Hay: Hay is a common staple in a llama's diet. They enjoy high-quality hay like timothy hay, alfalfa hay, or oat hay.\n",
2402 |      | - "3. Fruits and vegetables: Llamas will eat fruits and veggies as treats or as part of their regular diet. Favorites include apples, carrots, sweet potatoes, and leafy greens like kale or spinach.\n",
2403 |      | - "4. Grains: Whole grains like oats, barley, and corn can be fed to llamas as a supplement.\n",
2404 |      | - "5. Minerals: Llamas need access to minerals like calcium, phosphorus, and salt to stay healthy.\n",
     | 2400 | + "1. Grasses: Llamas love to graze on various types of grasses, including tall grasses, short grasses, and even weeds.\n",
     | 2401 | + "2. Hay: High-quality hay, such as alfalfa or timothy hay, is a staple in a llama's diet. They enjoy the sweet taste and texture of fresh hay.\n",
     | 2402 | + "3. Grains: Llamas may receive grains like oats, barley, or corn as part of their daily ration. However, it's essential to provide these grains in moderation, as they can be high in calories.\n",
     | 2403 | + "4. Fruits and vegetables: Llamas enjoy a variety of fruits and veggies, such as apples, carrots, sweet potatoes, and leafy greens like kale or spinach.\n",
     | 2404 | + "5. Minerals: Llamas require access to mineral supplements, which help maintain their overall health and well-being.\n",
2405 | 2405 | "\n",
|
2406 |      | - "In the wild, llamas might eat:\n",
     | 2406 | + "In the wild, llamas might also eat:\n",
2407 | 2407 | "\n",
|
2408 |      | - "* Leaves from shrubs and trees\n",
2409 |      | - "* Bark\n",
2410 |      | - "* Twigs\n",
2411 |      | - "* Fruits\n",
2412 |      | - "* Roots\n",
     | 2408 | + "1. Leaves: They'll munch on leaves from trees and shrubs, including plants like willow, alder, and birch.\n",
     | 2409 | + "2. Bark: In some cases, llamas may eat the bark of certain trees, like aspen or cottonwood.\n",
     | 2410 | + "3. Mosses and lichens: These non-vascular plants can be a tasty snack for llamas.\n",
2413 | 2411 | "\n",
|
2414 |      | - "Domesticated llamas, on the other hand, are usually fed a diet of hay, grains, and fruits/veggies. Their nutritional needs can be met with a balanced feed that includes essential vitamins and minerals.\n",
2415 |      | - "\n",
2416 |      | - "Keep in mind that llamas have specific dietary requirements, and their food should be tailored to their individual needs. It's always best to consult with a veterinarian or experienced llama breeder to determine the best diet for your llama.\n"
     | 2412 | + "In captivity, llama owners typically provide a balanced diet that includes a mix of hay, grains, and fruits/vegetables. It's essential to consult with a veterinarian or experienced llama breeder to determine the best feeding plan for your llama.\n"
2417 | 2413 | ]
|
2418 | 2414 | }
|
2419 | 2415 | ],
|
|
2424 | 2420 | " # Create the data payload as a dictionary\n",
|
2425 | 2421 | " data = {\n",
|
2426 | 2422 | " \"model\": model,\n",
|
2427 |      | - " \"seed\": 123, # for deterministic responses\n",
2428 |      | - " \"temperature\": 0, # for deterministic responses\n",
2429 | 2423 | " \"messages\": [\n",
|
2430 | 2424 | " {\"role\": \"user\", \"content\": prompt}\n",
|
2431 |      | - " ]\n",
     | 2425 | + " ],\n",
     | 2426 | + " \"options\": { # Settings below are required for deterministic responses\n",
     | 2427 | + " \"seed\": 123,\n",
     | 2428 | + " \"temperature\": 0,\n",
     | 2429 | + " \"num_ctx\": 2048\n",
     | 2430 | + " }\n",
2432 | 2431 | " }\n",
|
2433 | 2432 | "\n",
|
     | 2433 | + "\n",
2434 | 2434 | " # Convert the dictionary to a JSON formatted string and encode it to bytes\n",
|
2435 | 2435 | " payload = json.dumps(data).encode(\"utf-8\")\n",
|
2436 | 2436 | "\n",
|
|
2469 | 2469 | },
|
2470 | 2470 | {
|
2471 | 2471 | "cell_type": "code",
|
2472 |      | - "execution_count": null,
     | 2472 | + "execution_count": 4,
2473 | 2473 | "id": "86b839d4-064d-4178-b2d7-01691b452e5e",
|
2474 | 2474 | "metadata": {
|
2475 | 2475 | "id": "86b839d4-064d-4178-b2d7-01691b452e5e",
|
|
2488 | 2488 | ">> The car is as fast as a bullet.\n",
|
2489 | 2489 | "\n",
|
2490 | 2490 | "Score:\n",
|
2491 |      | - ">> A scoring task!\n",
     | 2491 | + ">> I'd rate the model response \"The car is as fast as a bullet.\" an 85 out of 100.\n",
2492 | 2492 | "\n",
|
2493 |      | - "To evaluate the model response \"The car is as fast as a bullet.\", I'll consider how well it follows the instruction and uses a simile that's coherent, natural-sounding, and effective in conveying the idea of speed.\n",
     | 2493 | + "Here's why:\n",
2494 | 2494 | "\n",
|
2495 |      | - "Here are some factors to consider:\n",
     | 2495 | + "* The response uses a simile correctly, comparing the speed of the car to something else (in this case, a bullet).\n",
     | 2496 | + "* The comparison is relevant and makes sense, as bullets are known for their high velocity.\n",
     | 2497 | + "* The phrase \"as fast as\" is used correctly to introduce the simile.\n",
2496 | 2498 | "\n",
|
2497 |      | - "1. **Follows instruction**: Yes, the model uses a simile to rewrite the sentence.\n",
2498 |      | - "2. **Coherence and naturalness**: The comparison between the car's speed and a bullet is common and easy to understand. It's a good choice for a simile that conveys the idea of rapid movement.\n",
2499 |      | - "3. **Effectiveness in conveying idea of speed**: A bullet is known for its high velocity, which makes it an excellent choice to describe a fast-moving car.\n",
     | 2499 | + "The only reason I wouldn't give it a perfect score is that some people might find the comparison slightly less vivid or evocative than others. For example, comparing something to lightning (as in the original response) can be more dramatic and attention-grabbing. However, \"as fast as a bullet\" is still a strong and effective simile that effectively conveys the idea of the car's speed.\n",
2500 | 2500 | "\n",
|
2501 |      | - "Considering these factors, I'd score the model response \"The car is as fast as a bullet.\" around 85 out of 100. The simile is well-chosen, coherent, and effectively conveys the idea of speed. Well done, model!\n",
     | 2501 | + "Overall, I think the model did a great job!\n",
2502 | 2502 | "\n",
|
2503 | 2503 | "-------------------------\n",
|
2504 | 2504 | "\n",
|
|
2509 | 2509 | ">> The type of cloud associated with thunderstorms is a cumulus cloud.\n",
|
2510 | 2510 | "\n",
|
2511 | 2511 | "Score:\n",
|
2512 |      | - ">> A scoring task!\n",
     | 2512 | + ">> I'd score this model response as 40 out of 100.\n",
2513 | 2513 | "\n",
|
2514 |      | - "I'll evaluate the model's response based on its accuracy and relevance to the original instruction.\n",
     | 2514 | + "Here's why:\n",
2515 | 2515 | "\n",
|
2516 |      | - "**Accuracy:** The model's response is partially correct. Cumulus clouds are indeed associated with fair weather and not typically linked to thunderstorms. The correct answer, cumulonimbus, is a type of cloud that is closely tied to thunderstorm formation.\n",
     | 2516 | + "* The model correctly identifies that thunderstorms are related to clouds (correctly identifying the type of phenomenon).\n",
     | 2517 | + "* However, it incorrectly specifies the type of cloud associated with thunderstorms. Cumulus clouds are not typically associated with thunderstorms; cumulonimbus clouds are.\n",
     | 2518 | + "* The response lacks precision and accuracy in its description.\n",
2517 | 2519 | "\n",
|
2518 |      | - "**Relevance:** The model's response is somewhat relevant, as it mentions clouds in the context of thunderstorms. However, the specific type of cloud mentioned (cumulus) is not directly related to thunderstorms.\n",
2519 |      | - "\n",
2520 |      | - "Considering these factors, I would score the model response a **40 out of 100**. While the response attempts to address the instruction, it provides an incorrect answer and lacks relevance to the original question.\n",
     | 2520 | + "Overall, while the model attempts to address the instruction, it provides an incorrect answer, which is a significant error.\n",
2521 | 2521 | "\n",
|
2522 | 2522 | "-------------------------\n",
|
2523 | 2523 | "\n",
|
|
2528 | 2528 | ">> The author of 'Pride and Prejudice' is Jane Austen.\n",
|
2529 | 2529 | "\n",
|
2530 | 2530 | "Score:\n",
|
2531 |      | - ">> A simple one!\n",
2532 |      | - "\n",
2533 |      | - "My model response: \"The author of 'Pride and Prejudice' is Jane Austen.\"\n",
     | 2531 | + ">> I'd rate my own response as 95 out of 100. Here's why:\n",
2534 | 2532 | "\n",
|
2535 |      | - "Score: **99**\n",
     | 2533 | + "* The response accurately answers the question by naming the author of 'Pride and Prejudice' as Jane Austen.\n",
     | 2534 | + "* The response is concise and clear, making it easy to understand.\n",
     | 2535 | + "* There are no grammatical errors or ambiguities that could lead to confusion.\n",
2536 | 2536 | "\n",
|
2537 |      | - "Reasoning:\n",
2538 |      | - "\n",
2539 |      | - "* The response directly answers the question, providing the correct name of the author.\n",
2540 |      | - "* The sentence structure is clear and easy to understand.\n",
2541 |      | - "* There's no room for misinterpretation or ambiguity.\n",
2542 |      | - "\n",
2543 |      | - "Overall, a perfect score!\n",
     | 2537 | + "The only reason I wouldn't give myself a perfect score is that the response is slightly redundant - it's not necessary to rephrase the question in the answer. A more concise response would be simply \"Jane Austen.\"\n",
2544 | 2538 | "\n",
|
2545 | 2539 | "-------------------------\n"
|
2546 | 2540 | ]
|
|
2577 | 2571 | },
|
2578 | 2572 | {
|
2579 | 2573 | "cell_type": "code",
|
2580 |      | - "execution_count": null,
     | 2574 | + "execution_count": 5,
2581 | 2575 | "id": "9d7bca69-97c4-47a5-9aa0-32f116fa37eb",
|
2582 | 2576 | "metadata": {
|
2583 | 2577 | "id": "9d7bca69-97c4-47a5-9aa0-32f116fa37eb",
|
|
2588 | 2582 | "name": "stderr",
|
2589 | 2583 | "output_type": "stream",
|
2590 | 2584 | "text": [
|
2591 |      | - "Scoring entries: 100%|████████████████████████| 110/110 [01:10<00:00, 1.56it/s]"
     | 2585 | + "Scoring entries: 100%|████████████████████████| 110/110 [01:08<00:00, 1.60it/s]"
2592 | 2586 | ]
|
2593 | 2587 | },
|
2594 | 2588 | {
|
2595 | 2589 | "name": "stdout",
|
2596 | 2590 | "output_type": "stream",
|
2597 | 2591 | "text": [
|
2598 | 2592 | "Number of scores: 110 of 110\n",
|
2599 |      | - "Average score: 54.16\n",
     | 2593 | + "Average score: 50.32\n",
2600 | 2594 | "\n"
|
2601 | 2595 | ]
|
2602 | 2596 | },
|
|
2642 | 2636 | },
|
2643 | 2637 | "source": [
|
2644 | 2638 | "- Our model achieves an average score of above 50, which we can use as a reference point to compare the model to other models or to try out other training settings that may improve the model\n",
|
2645 |      | - "- Note that ollama is not fully deterministic (as of this writing), so the numbers you are getting might slightly differ from the ones shown above"
     | 2639 | + "- Note that ollama is not fully deterministic across operating systems (as of this writing), so the numbers you are getting might slightly differ from the ones shown above"
2646 | 2640 | ]
|
2647 | 2641 | },
|
2648 | 2642 | {
|
|
2733 | 2727 | "name": "python",
|
2734 | 2728 | "nbconvert_exporter": "python",
|
2735 | 2729 | "pygments_lexer": "ipython3",
|
2736 |      | - "version": "3.10.11"
     | 2730 | + "version": "3.11.4"
2737 | 2731 | }
|
2738 | 2732 | },
|
2739 | 2733 | "nbformat": 4,
|
|