[Feat] Heterogeneous Code Part 1: Add Model and Module Code for Chameleon Lumina #377
base: develop
Conversation
hidden_states = self.norm(hidden_states)

if hasattr(self, "output"):
    hidden_states = self.output(hidden_states).float()
The output should not need an extra cast to fp32 here; our NaiveAMPModel has an output_to_fp32 parameter that is responsible for doing this.
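A hedged sketch of the idea: the cast to fp32 belongs in the AMP wrapper, gated by its output_to_fp32 flag, rather than inside the module's forward. The names amp_wrapper_forward and to_fp32 below are illustrative, not InternEvo's actual API, and tensors are modeled as (value, dtype) tuples to keep the example self-contained.

```python
def to_fp32(t):
    # Illustrative cast: tensors are modeled as (value, dtype) tuples.
    value, _dtype = t
    return (value, "fp32")

def amp_wrapper_forward(module_forward, x, output_to_fp32=True):
    # NaiveAMPModel-style behavior (hypothetical sketch): the module runs
    # in reduced precision and does NOT call .float() on its own output;
    # the wrapper performs the cast when output_to_fp32 is set.
    out = module_forward(x)
    if output_to_fp32:
        out = to_fp32(out)
    return out
```

With this split of responsibilities, the module's forward stays precision-agnostic and the wrapper decides the output dtype once, in one place.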
k_norm_out = self.k_norm(k_all)
k = split_forward_gather_backward(k_norm_out, ParallelMode.TENSOR, dim=-2)

v = rearrange(v, "b s (h d) -> b s h d", d=self.head_dim)
The qk_norm part may need an extra distinction: if is_using_isp() (the wp weight-parallel algorithm) is in use, the gather logic is not needed, because under the ISP algorithm the weights are complete during forward, so the computed qkv heads are already complete and do not need to be gathered.
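The suggested branch can be sketched as follows. This is a hypothetical illustration, not InternEvo's real code: qk_norm_branch and split_heads stand in for the module's norm path and split_forward_gather_backward, heads are modeled as a plain list, and the norm is an arbitrary per-head function.

```python
def split_heads(x, world_size, rank):
    # Stand-in for split_forward_gather_backward along the head dim:
    # each tensor-parallel rank keeps its contiguous slice of heads.
    n = len(x) // world_size
    return x[rank * n:(rank + 1) * n]

def qk_norm_branch(k_all_heads, norm_fn, using_isp, world_size=2, rank=0):
    # Apply the norm over the full set of heads.
    k_normed = [norm_fn(h) for h in k_all_heads]
    if using_isp:
        # ISP/weight-parallel: weights are complete in forward, so the
        # computed heads are already full -- no split/gather needed.
        return k_normed
    # Tensor-parallel path: split the normed heads back across ranks.
    return split_heads(k_normed, world_size, rank)
```

The point is that the gather/split pair is only meaningful when the heads are sharded across ranks; under ISP the post-norm tensor is already the full set of heads.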
k_norm_out = self.k_norm(k_all)
k = split_forward_gather_backward(k_norm_out, ParallelMode.TENSOR, dim=-2)

v = rearrange(v, "b s (h d) -> b s h d", d=self.head_dim)
Same as above.
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.
Motivation
This PR and the following PRs in the series add heterogeneous-training support and the Chameleon model to InternEvo.
We plan to merge these PRs in sequence; this PR is the first one, adding the Chameleon model code.
Modification
As described above, this PR is the first of a series of PRs: it adds the Chameleon model code.
BC-breaking (Optional)
None
Use cases (Optional)
We will add use cases after the configuration PR, so that we can show the training use case.
Checklist
Before PR:
After PR: