Update mid-way-recap.mdx

Removed "define the policy by hand", as it can be misleading, and restructured the sentence to establish the focus on the value function and its inherent tendency to lead us to an optimal policy.
huggingface · Mar 9, 2025 · bdb2c6e
1 parent be21bbf
commit bdb2c6e
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/units/en/unit2/mid-way-recap.mdx b/units/en/unit2/mid-way-recap.mdx
@@ -6,7 +6,7 @@ We have two types of value-based functions:
 
 - State-value function: outputs the expected return if **the agent starts at a given state and acts according to the policy forever after.**
 - Action-value function: outputs the expected return if **the agent starts in a given state, takes a given action at that state** and then acts accordingly to the policy forever after.
-- In value-based methods, rather than learning the policy, **we define the policy by hand** and we learn a value function. If we have an optimal value function, we **will have an optimal policy.**
+- In value-based methods, rather than learning the policy, **we focus on learning a value function**. An optimal value function, **will lead us to an optimal policy.**
 
 There are two types of methods to update the value function:
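
Note on the revised bullet: the claim that an optimal value function leads to an optimal policy can be sketched in a few lines of Python. This is a minimal illustration, assuming a small tabular action-value function. The table `Q`, its numbers, and the helper names `greedy_policy` and `state_value` are hypothetical and are not part of the course material.

```python
from typing import Dict, List

# Hypothetical toy Q-table: Q[state][action] = estimated expected return.
# The states, actions, and values are made up for illustration only.
Q: Dict[str, List[float]] = {
    "s0": [0.1, 0.7, 0.2],
    "s1": [0.5, 0.4, 0.9],
}


def greedy_policy(q_table: Dict[str, List[float]], state: str) -> int:
    """Return the index of the highest-valued action in `state`."""
    values = q_table[state]
    return max(range(len(values)), key=values.__getitem__)


def state_value(q_table: Dict[str, List[float]], state: str) -> float:
    """State value under the greedy policy: the best action value in `state`."""
    return max(q_table[state])


if __name__ == "__main__":
    for s in Q:
        print(s, "-> action", greedy_policy(Q, s), "value", state_value(Q, s))
```

The policy is never stored or learned separately here: it is read off the value table by picking the best-valued action in each state. If the table held the optimal action values, this greedy read-out would be an optimal policy, which is the point the revised sentence makes.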