@Mirza-Samad-Ahmed-Baig commented Jun 28, 2025

This commit introduces several refactorings across run.c, test.c, and
train.py to enhance code modularity, readability, and maintainability.
The primary goal was to reduce function complexity and nesting depth by
extracting distinct logical blocks into dedicated helper functions.

Key changes include:

run.c:

  • Extracted multihead_attention function: The complex multihead
    attention logic within the forward function has been moved into a
    new, self-contained multihead_attention function. This significantly
    reduces the nesting level and improves the clarity of the main
    forward loop (see the first sketch after this list).
  • Extracted process_utf8_bytes function: The intricate UTF-8 byte
    processing within the encode function was isolated into
    process_utf8_bytes. This simplifies the encode function and makes
    the byte-level operations easier to follow (see the second sketch
    after this list).
  • Extracted render_chat_prompt function: The logic responsible for
    formatting chat prompts in the chat function has been moved to a new
    render_chat_prompt helper, making the chat function's flow clearer.
  • Extracted get_chat_prompts function: The logic for acquiring
    system and user prompts within the chat function has been
    encapsulated in get_chat_prompts, further streamlining the chat
    function.
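For context, here is a minimal sketch of the extracted shape; the signature, buffer layout, and parameter names are illustrative assumptions, not the exact code in this PR. It computes one query position of multihead attention against a cached key/value history:

```c
#include <math.h>

/* Hypothetical signature; the real helper takes whatever state the
 * forward pass threads through. Assumes row t of each cache holds
 * n_heads * head_size floats, and att has room for n_heads * (pos + 1). */
void multihead_attention(float *out, const float *q,
                         const float *key_cache, const float *value_cache,
                         float *att, int n_heads, int head_size, int pos) {
    int dim = n_heads * head_size;
    for (int h = 0; h < n_heads; h++) {
        const float *qh = q + h * head_size;
        float *ah = att + h * (pos + 1);
        /* scaled dot-product scores against every cached key */
        for (int t = 0; t <= pos; t++) {
            const float *k = key_cache + t * dim + h * head_size;
            float score = 0.0f;
            for (int i = 0; i < head_size; i++) score += qh[i] * k[i];
            ah[t] = score / sqrtf((float)head_size);
        }
        /* softmax over timesteps 0..pos */
        float max = ah[0];
        for (int t = 1; t <= pos; t++) if (ah[t] > max) max = ah[t];
        float sum = 0.0f;
        for (int t = 0; t <= pos; t++) { ah[t] = expf(ah[t] - max); sum += ah[t]; }
        for (int t = 0; t <= pos; t++) ah[t] /= sum;
        /* attention-weighted sum of cached values */
        float *oh = out + h * head_size;
        for (int i = 0; i < head_size; i++) oh[i] = 0.0f;
        for (int t = 0; t <= pos; t++) {
            const float *v = value_cache + t * dim + h * head_size;
            for (int i = 0; i < head_size; i++) oh[i] += ah[t] * v[i];
        }
    }
}
```

Pulling this out means the per-head loops nest inside a small helper rather than deep inside the main forward pass.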
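Likewise, a hedged sketch of the UTF-8 handling the second bullet refers to; the function name and contract here are assumptions. The core trick encode needs is grouping continuation bytes (10xxxxxx) with their lead byte so a whole code point can be looked up in the vocabulary at once:

```c
#include <stddef.h>

/* Illustrative helper: length in bytes of the UTF-8 code point starting
 * at s, found by counting continuation bytes (top two bits 10). Capped
 * at 4, the longest legal UTF-8 sequence. */
static size_t utf8_codepoint_len(const char *s) {
    size_t len = 1;
    while (len < 4 && s[len] != '\0' && (s[len] & 0xC0) == 0x80) len++;
    return len;
}
```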

test.c:

  • Extracted individual test_prompt_encoding functions: The
    monolithic test_prompt_encodings function was broken down into
    smaller, more focused functions (e.g., test_prompt_encoding_0,
    test_prompt_encoding_1, etc.). This improves test readability and
    makes it easier to identify and debug specific test failures (the
    pattern is sketched below).
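A sketch of the resulting pattern; the prompt, the expected ids, and the exact encode call shape are placeholders, and it assumes test.c builds against run.c for the Tokenizer type and encode function:

```c
#include <assert.h>
#include <string.h>

/* Shared comparison so each focused test stays a few lines long. */
static void assert_tokens_eq(const int *actual, int n_actual,
                             const int *expected, int n_expected) {
    assert(n_actual == n_expected);
    assert(memcmp(actual, expected, (size_t)n_actual * sizeof(int)) == 0);
}

/* One prompt per function: a failure now names the exact case. */
static void test_prompt_encoding_0(Tokenizer *t) {
    int tokens[512], n_tokens;
    encode(t, "example prompt", 1, 0, tokens, &n_tokens); /* hypothetical call */
    int expected[] = { 1, 2, 3 }; /* placeholder ids, not a real vector */
    assert_tokens_eq(tokens, n_tokens, expected,
                     (int)(sizeof(expected) / sizeof(expected[0])));
}
```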

train.py:

  • Extracted setup_ddp function: The distributed data parallel (DDP)
    setup logic has been moved into a dedicated setup_ddp function,
    centralizing configuration and improving the main script's clarity
    (see the first sketch after this list).
  • Extracted initialize_model function: The model initialization
    logic, including loading from scratch or resuming from a checkpoint,
    is now handled by initialize_model.
  • Extracted setup_optimizer function: The optimizer setup, including
    GradScaler initialization and loading optimizer state from
    checkpoints, has been moved to setup_optimizer.
  • Extracted save_checkpoint function: The logic for saving training
    checkpoints has been encapsulated in a save_checkpoint function,
    promoting reusability and cleaner code within the training loop
    (see the second sketch after this list).
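A minimal sketch of what setup_ddp might look like, assuming a torchrun-style launch (RANK, LOCAL_RANK, and WORLD_SIZE in the environment); the return values and argument names are illustrative, not the PR's exact interface:

```python
import os
import torch
import torch.distributed as dist

def setup_ddp(backend="nccl"):
    """Detect a torchrun launch and initialize the process group."""
    is_ddp = int(os.environ.get("RANK", -1)) != -1  # torchrun exports RANK
    if is_ddp:
        dist.init_process_group(backend=backend)
        rank = int(os.environ["RANK"])
        local_rank = int(os.environ["LOCAL_RANK"])
        world_size = int(os.environ["WORLD_SIZE"])
        device = f"cuda:{local_rank}"
        torch.cuda.set_device(local_rank)  # pin this process to its GPU
    else:
        rank, local_rank, world_size, device = 0, 0, 1, "cuda"
    return is_ddp, rank, local_rank, world_size, device
```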
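And a hedged sketch of save_checkpoint; the checkpoint keys follow the common nanoGPT-style layout, which is an assumption about this PR rather than a reading of its diff:

```python
import os
import torch

def save_checkpoint(out_dir, raw_model, optimizer, model_args,
                    iter_num, best_val_loss, config):
    """Bundle everything needed to resume training into one file."""
    checkpoint = {
        "model": raw_model.state_dict(),   # the unwrapped (non-DDP) module
        "optimizer": optimizer.state_dict(),
        "model_args": model_args,
        "iter_num": iter_num,
        "best_val_loss": best_val_loss,
        "config": config,
    }
    os.makedirs(out_dir, exist_ok=True)
    torch.save(checkpoint, os.path.join(out_dir, "ckpt.pt"))
```

Keeping the whole dict in one place is what lets initialize_model and setup_optimizer resume from the same file.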

These changes collectively make the codebase more organized,
readable, and maintainable, and easier to understand, debug,
and extend.
