Commit 81eed9a

d-kleiner and rasbt authored

updated RoPE statement (#423)

* updated RoPE statement
* updated .gitignore
* Update ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb

Co-authored-by: Sebastian Raschka <[email protected]>
1 parent: b5f2aa3 · commit: 81eed9a

File tree

2 files changed: +3 -1 lines


.gitignore (+2)

@@ -44,6 +44,8 @@ ch05/07_gpt_to_llama/Llama-3.1-8B
 ch05/07_gpt_to_llama/Llama-3.1-8B-Instruct
 ch05/07_gpt_to_llama/Llama-3.2-1B
 ch05/07_gpt_to_llama/Llama-3.2-1B-Instruct
+ch05/07_gpt_to_llama/Llama-3.2-3B
+ch05/07_gpt_to_llama/Llama-3.2-3B-Instruct
 
 ch06/01_main-chapter-code/gpt2
 ch06/02_bonus_additional-experiments/gpt2

ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb (+1 -1)

@@ -409,7 +409,7 @@
     "self.pos_emb = nn.Embedding(cfg[\"context_length\"], cfg[\"emb_dim\"])\n",
     "```\n",
     "\n",
-    "- Instead of these absolute positional embeddings, Llama uses relative positional embeddings, called rotary position embeddings (RoPE for short)\n",
+    "- Unlike traditional absolute positional embeddings, Llama uses rotary position embeddings (RoPE), which enable it to capture both absolute and relative positional information simultaneously\n",
     "- The reference paper for RoPE is [RoFormer: Enhanced Transformer with Rotary Position Embedding (2021)](https://arxiv.org/abs/2104.09864)"
    ]
   },
