Commit 2723158
Enable Gradient Accumulation fix across all models + trainer fully in forward() (huggingface#34283)
* Enable grad accum fix across all models + trainer fully in forward()
* handle peft case
* Account for DDP: need to run scale tests
* Use accelerator state
* Quality
* Guard
* Experiment w/ only fairseq fix
* Fairseq only
* Revert multiply_grads fix
* Mult by grad accum to fully bring back solution
* Style
* Good to go now
* Skip fx tests for now
* Bookmark
* Working now
1 parent b8450dd commit 2723158
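The issue named in the commit title can be illustrated with a small, framework-free sketch. It assumes the problem is the usual one with gradient accumulation: averaging the loss inside each micro-batch and then averaging those averages does not equal the mean over all tokens when micro-batches hold different token counts. The function names and the `num_items_in_batch` variable are illustrative assumptions, since the diff bodies are not reproduced on this page; this is not the code from the commit.

```python
# Illustrative sketch only: plain Python lists stand in for per-token losses.
# All names here are hypothetical and are not taken from the commit's diff.

def naive_accumulated_loss(micro_batches):
    """Average each micro-batch on its own, then average those averages."""
    steps = len(micro_batches)
    total = 0.0
    for token_losses in micro_batches:
        micro_mean = sum(token_losses) / len(token_losses)
        total += micro_mean / steps  # the usual "divide by accumulation steps"
    return total


def corrected_accumulated_loss(micro_batches):
    """Sum per-token losses and normalize once by the total token count."""
    num_items_in_batch = sum(len(token_losses) for token_losses in micro_batches)
    total = 0.0
    for token_losses in micro_batches:
        total += sum(token_losses) / num_items_in_batch
    return total


def full_batch_loss(micro_batches):
    """Reference value: mean over every token, as if no accumulation was used."""
    flat = [loss for token_losses in micro_batches for loss in token_losses]
    return sum(flat) / len(flat)


# Two micro-batches with unequal token counts (4 tokens vs 1 token).
micro_batches = [[1.0, 1.0, 1.0, 1.0], [4.0]]

naive = naive_accumulated_loss(micro_batches)
corrected = corrected_accumulated_loss(micro_batches)
reference = full_batch_loss(micro_batches)
```

With these inputs the naive value over-weights the single-token micro-batch, while the corrected value matches the full-batch reference. When every micro-batch has the same token count the two approaches coincide, which is why the discrepancy is easy to miss.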
File tree
25 files changed: +81 -31 lines changed
- src/transformers
  - models
    - cohere
    - gemma2
    - gemma
    - glm
    - jamba
    - mixtral
    - mllama
    - nemotron
    - olmoe
    - olmo
    - phi3
    - phimoe
    - phi
    - qwen2_moe
    - qwen2
    - rt_detr
    - zamba
- tests/models
  - cohere
  - mistral
  - mixtral
  - qwen2_moe
  - qwen2